Xuxiang Zhang

School of Finance and Economics, Zhengzhou University of Science and Technology, Zhengzhou, China


 

Received: June 3, 2025
Accepted: December 1, 2025
Publication Date: December 27, 2025

 Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.


DOI: https://doi.org/10.6180/jase.202607_30.021


This paper presents a multimodal large-model pipeline for constructing a cultural-tourism integration knowledge graph (CTI-KG). By harmonizing text, imagery, audio and geospatial data, the pipeline automatically extracts and links cultural entities, tourism services and intangible heritage, fusing them into a unified semantic layer. A cross-modal alignment module transfers shared representations among vision, language and geographic signals, enabling coherent knowledge fusion without manual pairing. The resulting graph supports multiple smart tourism applications, including personalized route planning, immersive storytelling and dynamic resource recommendation. Extensive evaluations confirm that the new approach improves semantic richness and user engagement while remaining generalizable to other heritage domains. The framework offers an open, reusable methodology for constructing multimodal knowledge graphs at scale.
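To make the cross-modal alignment and fusion step concrete, the sketch below shows one way such a module could work: modality-specific features are projected into a shared embedding space, entities from different modalities are linked by cosine similarity, and matched pairs are written into the graph as triples. This is a minimal illustration only; the encoders, dimensions, relation name and threshold are assumptions, not the pipeline described in the paper.

```python
# Illustrative sketch of cross-modal alignment into a shared embedding space.
# All names, dimensions and thresholds are assumptions for demonstration.
import numpy as np

RNG = np.random.default_rng(0)
DIM = 64  # assumed shared embedding dimension

# Stand-ins for modality-specific encoders (e.g., a language model and a
# vision backbone); here they are fixed random projections.
W_TEXT = RNG.normal(size=(128, DIM))
W_IMAGE = RNG.normal(size=(256, DIM))


def embed_text(feat: np.ndarray) -> np.ndarray:
    """Project a raw text feature vector into the shared space and L2-normalize."""
    z = feat @ W_TEXT
    return z / np.linalg.norm(z)


def embed_image(feat: np.ndarray) -> np.ndarray:
    """Project a raw image feature vector into the shared space and L2-normalize."""
    z = feat @ W_IMAGE
    return z / np.linalg.norm(z)


def align(text_entities, image_entities, threshold=0.3):
    """Link each text entity to its best-matching image entity by cosine similarity,
    and emit the matched pairs as (head, relation, tail) triples for the graph."""
    triples = []
    for t_name, t_feat in text_entities:
        t_vec = embed_text(t_feat)
        scores = [(i_name, float(t_vec @ embed_image(i_feat)))
                  for i_name, i_feat in image_entities]
        best_name, best_score = max(scores, key=lambda s: s[1])
        if best_score >= threshold:
            triples.append((t_name, "depictedBy", best_name))  # hypothetical relation
    return triples


if __name__ == "__main__":
    texts = [("Shaolin_Temple", RNG.normal(size=128))]
    images = [("img_001", RNG.normal(size=256)), ("img_002", RNG.normal(size=256))]
    # With random projections the similarities are near zero, so the demo uses a
    # permissive threshold purely to show the triple format.
    print(align(texts, images, threshold=-1.0))
```

In practice the projections would be learned (for example with a contrastive objective over paired data), so that semantically matching text, image and geospatial entities land close together in the shared space before fusion into the graph.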


Keywords: Multimodal large-model, cultural-tourism integration knowledge graph, cross-modal alignment module, knowledge fusion

