Yunfei Jin
School of Foreign Languages, Zhengzhou University of Science and Technology, Zhengzhou 450064, China
Received: October 21, 2024; Accepted: November 18, 2024; Publication Date: December 28, 2024
Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.
Classification methods based on pre-training and fine-tuning usually require a large amount of labeled data, which makes them unsuitable for few-shot classification tasks. Therefore, this paper proposes a novel few-shot English text classification method based on graph neural networks and prompt learning. A text-level graph convolutional network builds a graph for each input text while sharing global parameters, and its output is fed into a prototype network, which generates a class representation vector rich in class-level semantic information for each class. In parallel, a manual prompt template is used to obtain a class prediction semantic vector at the [MASK] position. During classification, the similarity between the class prediction semantic vector and the class representation vectors serves as the classification criterion. Compared with the traditional approach of mapping the final answer through a linear layer, and with methods that rely on a hand-crafted set of class representation words, this approach alleviates the semantic loss incurred during answer mapping. Few-shot training and validation sets are constructed by random sampling from three datasets: THUCNews, SHNews and Toutiao. The experimental results show that the proposed method improves overall performance on the 1-shot, 5-shot, 10-shot and 20-shot tasks on these datasets, most notably on the 1-shot task, where accuracy improves over the baseline few-shot text classification method by 7.59%, 2.11% and 3.10% on the three datasets respectively, which verifies the effectiveness of the proposed method for few-shot English text classification.
Keywords: Few-shot English text classification, pre-trained model, graph convolutional network, prompt learning, semantic information