Xianghua Wu

School of Foreign Languages, Zhengzhou University of Science and Technology, Zhengzhou 450064, China



Received: August 4, 2025
Accepted: September 18, 2025
Publication Date: October 18, 2025

 Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.


Download Citation: https://doi.org/10.6180/jase.202605_29(5).0024


This paper addresses the degradation of consistency and accuracy in neural machine translation caused by English case sensitivity, proposing a joint word-case prediction model based on transfer learning and a concept network. First, a bidirectional Transformer encoder is pre-trained on large-scale monolingual corpora, capturing the coupled distribution of word form and case through masked language modeling. Next, a cross-lingual concept network is constructed that aligns abstract concept nodes of the source language with English word-form and case patterns, achieving knowledge transfer. In the translation stage, a joint decoder is introduced to predict tokens and their case labels simultaneously, while the constraints of the concept network maintain cross-sentence consistency. Experimental results on the WMT14 English-German and English-French datasets show that the model improves BLEU by 1.8 points over the baseline, increases case accuracy by 6.3%, and markedly reduces errors on proper nouns and at sentence-initial positions. Ablation analysis confirms that transfer learning and the concept constraints are complementary, offering a new approach to precise case control in low-resource scenarios.
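The joint prediction scheme described above rests on factoring each surface token into a lowercased lexical form plus a discrete case label, so the decoder can predict the two jointly and recasing becomes deterministic. The following is a minimal sketch of that factorization, not the authors' implementation; the label set and function names are illustrative assumptions.

```python
# Sketch of word/case factorization: each surface token becomes a
# (lowercase form, case label) pair, and the inverse mapping restores
# the original surface string. Labels here are assumed, not the paper's.

CASE_LABELS = ("lower", "title", "upper", "mixed")

def split_case(token: str):
    """Factor a surface token into (lowercase form, case label)."""
    if token == token.lower():
        return token, "lower"
    if token == token.upper() and len(token) > 1:
        return token.lower(), "upper"          # e.g. acronyms: NATO
    if token == token.capitalize():
        return token.lower(), "title"          # sentence-initial, names
    return token, "mixed"                      # irregular casing kept verbatim

def merge_case(form: str, label: str) -> str:
    """Invert split_case: re-apply the predicted case label."""
    if label == "upper":
        return form.upper()
    if label == "title":
        return form.capitalize()
    return form                                # "lower"/"mixed" pass through

tokens = "The NATO summit met John".split()
factored = [split_case(t) for t in tokens]
restored = [merge_case(f, l) for f, l in factored]
assert restored == tokens
```

Because the case decision is isolated in a small label vocabulary, the decoder's case head can be supervised and constrained (e.g. by concept-network nodes for proper nouns) independently of the lexical prediction.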


Keywords: Transfer learning, Concept network, Joint prediction, Neural machine translation


  1. [1] A. Yagahara, M. Uesugi, and H. Yokoi, (2025) “Exploration of the optimal deep learning model for English-Japanese machine translation of medical device adverse event terminology” BMC Medical Informatics and Decision Making 25(1): 66. DOI: 10.1186/s12911-025-02912-0.
  2. [2] T. J. Singh, S. R. Singh, and P. Sarmah, (2025) “Distilling Knowledge in Machine Translation of Agglutinative Languages with Backward and Morphological Decoders” ACM Transactions on Asian and Low-Resource Language Information Processing 24(1): 1–19. DOI: 10.1145/3703455.
  3. [3] R. Appicharla, K. K. Gupta, A. Ekbal, and P. Bhattacharyya, (2025) “Improving Neural Machine Translation Through Code-Mixed Data Augmentation” Computational Intelligence 41(2): e70033. DOI: 10.1111/coin.70033.
  4. [4] J. Yu, L. Zhao, S. Yin, and M. Ivanović, (2024) “News recommendation model based on encoder graph neural network and bat optimization in online social multimedia art education” Computer Science and Information Systems 21(3): 989–1012. DOI: 10.2298/CSIS231225025Y.
  5. [5] M. Kamezaki, Y. Kokudo, Y. Uehara, S. Itano, T. Iida, and S. Sugano, (2025) “Predictive Energy Stability Margin: Prediction of Heavy Machine Overturning Considering Rotation and Translation” IEEE Robotics and Automation Letters: DOI: 10.1109/LRA.2025.3540382.
  6. [6] T. Moape, T. S. Mohale, and C. Bester, (2025) “A Culture-Aware Bidirectional IsiXhosa-English Neural Machine Translation Model Using Marian MT” The Indonesian Journal of Computer Science 14(3): DOI: 10.33022/ijcs.v14i3.4714.
  7. [7] K. Li and D. Sun, (2025) “A Global-Features and Local-Features-Jointly Fused Deep Semantic Learning Framework for Error Detection of Machine Translation” Journal of Circuits, Systems and Computers 34(01): 2550025. DOI: 10.1142/S0218126625500252.
  8. [8] Y. Zhao, H. Li, and S. Yin, (2022) “A Multi-channel Character Relationship Classification Model Based on Attention Mechanism” Int. J. Math. Sci. Comput. (IJMSC) 8(1): 28–36. DOI: 10.5815/ijmsc.2022.01.03.
  9. [9] M. Yang and F. Li, (2025) “Improving Machine Translation Formality with Large Language Models." Computers, Materials & Continua 82(2): DOI: 10.32604/cmc.2024.058248.
  10. [10] Q. Fang, “Terminology Alignment Based On Multi-level Feature Fusion For Japanese Scientific And Technological Literature Terminology Translation” Journal of Applied Science and Engineering 29(2): 465–473. DOI: 10.6180/jase.202602_29(2).0021.
  11. [11] M. Ugas, M. A. Calamia, J. Tan, B. Umakanthan, C. Hill, K. Tse, A. Cashell, Z. Muraj, M. Giuliani, and J. Papadakos, (2025) “Evaluating the feasibility and utility of machine translation for patient education materials written in plain language to increase accessibility for populations with limited english proficiency" Patient Education and Counseling 131: 108560. DOI: 10.1016/j.pec.2024.108560.
  12. [12] X. Chen, S. Duan, and G. Liu, (2025) “Improving semi-autoregressive machine translation with the guidance of syntactic dependency parsing structure” Neurocomputing 614: 128828. DOI: 10.1016/j.neucom.2024.128828.
  13. [13] M. V. Abuín and M. Garcia. “WordNet Expansion with Bilingual Word Embeddings and Neural Machine Translation”. In: EPIA Conference on Artificial Intelligence. Springer. 2024, 280–291. DOI: 10.1007/978-3-031-73503-5_23.
  14. [14] D. Rathod, A. K. Yadav, M. Kumar, and D. Yadav, (2025) “Character-Level Encoding Based Neural Machine Translation for Hindi Language" Neural Processing Letters 57(2): 23. DOI: 10.1007/s11063-025-11718-0.
  15. [15] W. Wang, W. Jiao, J.-t. Huang, Z. Tu, and M. Lyu, (2025) “On the shortcut learning in multilingual neural machine translation" Neurocomputing 615: 128833. DOI: 10.1016/j.neucom.2024.128833.
  16. [16] Z. Lan, J. Yu, S. Liu, J. Yao, D. Huang, and J. Su, (2025) “Towards better text image machine translation with multimodal codebook and multi-stage training” Neural Networks: 107599. DOI: 10.1016/j.neunet.2025.107599.
  17. [17] P. Lu and F. Xu, (2025) “The quality optimization of English–Chinese machine translation based on deep neural networks" Discover Artificial Intelligence 5(1): 88. DOI: 10.1007/s44163-025-00319-4.
  18. [18] Z. Yun et al., (2012) “A Chinese-English patent machine translation system based on the theory of hierarchical network of concepts” The Journal of China Universities of Posts and Telecommunications 19: 140–146. DOI: 10.1016/S1005-8885(11)60430-5.
  19. [19] W. Xiong and Y. Jin, (2013) “Semantic MMT model based on hierarchical network of concepts in Chinese English MT" Journal of Networks 8(1): 237. DOI: 10.4304/jnw.8.1.237-244.
  20. [20] R. Zhou, Q. Shen, and H. Kong, (2025) “A study of text classification algorithms for live-streaming e-commerce comments based on improved BERT model” PloS One 20(4): e0316550. DOI: 10.1371/journal.pone.0316550.
  21. [21] D. Du, Y. Li, Y. Cao, Y. Liu, G. Meng, N. Li, D. Han, and H. Feng. “FAF-BM: An Approach for False Alerts Filtering Using BERT Model with Semi-supervised Active Learning”. In: International Conference on Science of Cyber Security. Springer. 2024, 295–312. DOI: 10.1007/978-981-96-2417-1_16.
  22. [22] Z. Ouyang, Q. Xu, T. Zhang, H. Yi, N. Zhang, M. Xiao, and C. Ju, (2025) “Coupling analysis of disaster-causing factors in coal mines and dual prevention mechanism based on the KeyBERT model and accident-causation theory model” Frontiers in Earth Science 13: 1586785. DOI: 10.3389/feart.2025.1586785.
  23. [23] M. S. Ghafourian, S. Tarkiani, M. Ghajar, M. Chavooshi, H. Khormaei, and A. Ramezani, (2025) “Optimizing warfarin dosing in diabetic patients through BERT model and machine learning techniques" Computers in Biology and Medicine 186: 109755. DOI: 10.1016/j.compbiomed.2025.109755.
  24. [24] H. Duan, X. Gao, and Y. Zhang, (2025) “The Application of AI Translation Tools in Improving Students’ Translation Fidelity and Accuracy" Arab World English Journal: DOI: 10.24093/awej/AI.16.
  25. [25] F. Z. El Idrysy, S. Hourri, I. El Miqdadi, A. Hayati, Y. Namir, B. Ncir, and J. Kharroubi, (2025) “Unlocking the language barrier: A Journey through Arabic machine translation” Multimedia Tools and Applications 84(14): 14071–14104. DOI: 10.1007/s11042-024-19551-8.