Journal of Applied Science and Engineering

Published by Tamkang University Press

G. Deena1,2 and K. Raja2

1Sathyabama Institute of Science and Technology, Rajiv Gandhi Salai, Chennai, 600119, India
2Department of Computer Science and Engineering, SRM Institute of Science and Technology, Bharathi Salai, Chennai, 600089, India


 

Received: December 10, 2021
Accepted: May 28, 2022
Publication Date: July 20, 2022

 Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.


DOI: https://doi.org/10.6180/jase.202304_26(4).0006


ABSTRACT


In Text Mining, Information Retrieval (IR), and Natural Language Processing (NLP), keyword extraction is the technique of identifying the important words in an unstructured document. It captures the core information of a specific document, so that sufficient information can be retrieved quickly without reading the entire document. Mining meaningful words from a document is essential in text analytics. The proposed system uses semantic relations to extract keywords from unstructured text documents by means of Latent Semantic Analysis (LSA). This method assumes that a semantic relation exists between the sentences in a document and the words they contain. The extracted keywords represent the text strongly and are likely to carry the most important information in the sentences. In this regard, LSA produced better outcomes than the TF-IDF, RAKE, YAKE, and TextRank algorithms. The extracted keywords are then used in Automatic Question Generation (AQG) to generate Fill-in-the-Blank (FB) and Multiple Choice Questions (MCQ) with a distractor set. The top five and top ten keywords are used in question generation. The proposed system could be implemented in a question generation system to assess the skill level of the learner.


Keywords: Natural Language Processing, Latent Semantic Analysis, Multiple Choice Questions, Keyword Extraction, Semantic relation
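
The abstract describes the pipeline only at a high level, so the following is a minimal sketch of LSA-style keyword extraction followed by fill-in-the-blank generation, not the authors' implementation. It assumes a term-by-sentence count matrix, a rank-k latent space obtained from the Gram matrix by power iteration, and a score equal to each term's length in that latent space. The function names, the stopword list, and the scoring rule are all illustrative assumptions.

```python
import math
import random
import re

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "for", "from", "in",
             "is", "it", "of", "on", "that", "the", "this", "to", "with"}

def split_sentences(text):
    """Naive sentence splitter on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def top_eigvecs(G, k, iters=200, seed=0):
    """Leading eigenvectors of a symmetric PSD matrix (power iteration with deflation)."""
    n = len(G)
    G = [row[:] for row in G]            # work on a copy
    rng = random.Random(seed)            # fixed seed keeps the result repeatable
    vecs = []
    for _ in range(min(k, n)):
        v = [rng.random() for _ in range(n)]
        for _ in range(iters):
            w = matvec(G, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(G, v)))
        if lam < 1e-9:
            break                        # no remaining latent dimension
        vecs.append(v)
        for i in range(n):               # deflate: remove the component just found
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return vecs

def lsa_keywords(text, top_n=5, k=2):
    """Rank terms by their length in a rank-k latent space (hypothetical scoring rule)."""
    sentences = [tokenize(s) for s in split_sentences(text)]
    vocab = sorted({w for s in sentences for w in s})
    # Term-by-sentence count matrix A (rows: terms, columns: sentences).
    A = [[float(s.count(w)) for s in sentences] for w in vocab]
    n = len(sentences)
    # Gram matrix G = A^T A; its eigenvectors are the right singular vectors of A.
    G = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    V = top_eigvecs(G, k)
    # A v = sigma * u, so each dot product is a term coordinate in the latent space.
    scores = [math.sqrt(sum(sum(a * x for a, x in zip(row, v)) ** 2 for v in V))
              for row in A]
    ranked = sorted(range(len(vocab)), key=lambda i: -scores[i])
    return [vocab[i] for i in ranked[:top_n]]

def fill_in_blank(sentence, keyword):
    """Replace the first occurrence of the keyword with a blank."""
    return re.sub(rf"\b{re.escape(keyword)}\b", "_____", sentence,
                  count=1, flags=re.IGNORECASE)
```

Calling `lsa_keywords(text, top_n=5)` or `lsa_keywords(text, top_n=10)` corresponds to the top-five and top-ten settings mentioned in the abstract. Distractor selection for MCQs is left out because the abstract does not say how distractors are chosen.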



