{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T21:22:02Z","timestamp":1777670522401,"version":"3.51.4"},"reference-count":40,"publisher":"PeerJ","license":[{"start":{"date-parts":[[2022,5,30]],"date-time":"2022-05-30T00:00:00Z","timestamp":1653868800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p>A document\u2019s keywords provide high-level descriptions of the content that summarize the document\u2019s central themes, concepts, ideas, or arguments. These descriptive phrases make it easier for algorithms to find relevant information quickly and efficiently. They play a vital role in document processing tasks such as indexing, classification, clustering, and summarization. Traditional keyword extraction approaches rely for the most part on statistical distributions of key terms in a document. Recent technological advances indicate that contextual information is critical in determining the semantics of the work at hand. Similarly, context-based features may be beneficial in the task of keyword extraction. For example, the context of a phrase of interest might be described simply by the word immediately before or after it. This research presents several experiments to validate that context-based keyword extraction is significant compared to traditional methods. Additionally, the proposed KeyBERT methodology also yields improved results. The proposed work relies on identifying a group of important words or phrases from the document\u2019s content that can reflect the authors\u2019 main ideas, concepts, or arguments. It also uses contextual word embedding to extract keywords. 
Finally, the findings are compared to those obtained using older approaches such as TextRank, RAKE, Gensim, YAKE, and TF-IDF. The Journal of Universal Computer Science (JUCS) dataset was employed in our research. Only abstracts were used to produce keywords for each research article, and the KeyBERT model outperformed traditional approaches in producing keywords similar to the author-provided keywords. The average similarity of our approach to author-assigned keywords is 51%.<\/jats:p>","DOI":"10.7717\/peerj-cs.967","type":"journal-article","created":{"date-parts":[[2022,5,30]],"date-time":"2022-05-30T07:23:12Z","timestamp":1653895392000},"page":"e967","source":"Crossref","is-referenced-by-count":53,"title":["Impact analysis of keyword extraction using contextual word embedding"],"prefix":"10.7717","volume":"8","author":[{"given":"Muhammad Qasim","family":"Khan","sequence":"first","affiliation":[{"name":"Institute of Computing, Kohat University of Science & Technology, Kohat, Kohat, Pakistan"}]},{"given":"Abdul","family":"Shahid","sequence":"additional","affiliation":[{"name":"Institute of Computing, Kohat University of Science & Technology, Kohat, Kohat, Pakistan"}]},{"given":"M. 
Irfan","family":"Uddin","sequence":"additional","affiliation":[{"name":"Institute of Computing, Kohat University of Science & Technology, Kohat, Kohat, Pakistan"}]},{"given":"Muhammad","family":"Roman","sequence":"additional","affiliation":[{"name":"Institute of Computing, Kohat University of Science & Technology, Kohat, Kohat, Pakistan"}]},{"given":"Abdullah","family":"Alharbi","sequence":"additional","affiliation":[{"name":"Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia"}]},{"given":"Wael","family":"Alosaimi","sequence":"additional","affiliation":[{"name":"Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia"}]},{"given":"Jameel","family":"Almalki","sequence":"additional","affiliation":[{"name":"Department of Computer Science, College of Computer in Al-Leith, Umm Al-Qura University, Makkah,  Saudi Arabia"}]},{"given":"Saeed M.","family":"Alshahrani","sequence":"additional","affiliation":[{"name":"College of Computing and Information Technology, Shaqra University, Shaqra, Saudi Arabia"}]}],"member":"4443","published-online":{"date-parts":[[2022,5,30]]},"reference":[{"key":"10.7717\/peerj-cs.967\/ref-1","doi-asserted-by":"publisher","first-page":"101492","DOI":"10.1016\/j.tele.2020.101492","article-title":"Important citation identification using sentiment analysis of in-text citations","volume":"56","author":"Aljuaid","year":"2021","journal-title":"Telematics and Informatics"},{"key":"10.7717\/peerj-cs.967\/ref-2","first-page":"2551","article-title":"Bi-LSTM-CRF sequence labeling for keyphrase extraction from scholarly documents","author":"Alzaidy","year":"2019"},{"key":"10.7717\/peerj-cs.967\/ref-3","first-page":"180","article-title":"Bidirectional lstm recurrent neural network for keyphrase 
extraction","author":"Basaldella","year":"2018"},{"key":"10.7717\/peerj-cs.967\/ref-4","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/K18-1022","article-title":"Simple unsupervised keyphrase extraction using sentence embeddings","author":"Bennani-Smires","year":"2018"},{"issue":"Jan","key":"10.7717\/peerj-cs.967\/ref-5","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"10.7717\/peerj-cs.967\/ref-6","first-page":"517","article-title":"Multilingual single document keyword extraction for information retrieval","author":"Bracewell","year":"2005"},{"issue":"1\u20137","key":"10.7717\/peerj-cs.967\/ref-7","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1016\/S0169-7552(98)00110-X","article-title":"The anatomy of a large-scale hypertextual web search engine","volume":"30","author":"Brin","year":"1998","journal-title":"Computer Networks and ISDN Systems"},{"issue":"6","key":"10.7717\/peerj-cs.967\/ref-8","doi-asserted-by":"publisher","first-page":"391","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9","article-title":"Indexing by latent semantic analysis","volume":"41","author":"Deerwester","year":"1990","journal-title":"Journal of the American Society for Information Science"},{"key":"10.7717\/peerj-cs.967\/ref-9","article-title":"Bert: pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":"2018"},{"issue":"1","key":"10.7717\/peerj-cs.967\/ref-10","doi-asserted-by":"publisher","first-page":"132","DOI":"10.1016\/j.is.2008.05.002","article-title":"KP-Miner: a keyphrase extraction system for English and Arabic documents","volume":"34","author":"El-Beltagy","year":"2009","journal-title":"Information Systems"},{"issue":"2","key":"10.7717\/peerj-cs.967\/ref-11","doi-asserted-by":"publisher","first-page":"30","DOI":"10.3390\/mti4020030","article-title":"Semantic unsupervised 
automatic keyphrases extraction by integrating word embedding with clustering methods","volume":"4","author":"Gagliardi","year":"2020","journal-title":"Multimodal Technologies and Interaction"},{"key":"10.7717\/peerj-cs.967\/ref-12","doi-asserted-by":"publisher","first-page":"154290","DOI":"10.1109\/ACCESS.2019.2946594","article-title":"Target-dependent sentiment classification with BERT","volume":"7","author":"Gao","year":"2019","journal-title":"IEEE Access"},{"issue":"4","key":"10.7717\/peerj-cs.967\/ref-13","doi-asserted-by":"crossref","first-page":"e4956","DOI":"10.1002\/cpe.4956","article-title":"Impact analysis of adverbs for sentiment classification on Twitter product reviews","volume":"33","author":"Haider","year":"2021","journal-title":"Concurrency and Computation: Practice and Experience"},{"issue":"7","key":"10.7717\/peerj-cs.967\/ref-14","doi-asserted-by":"publisher","first-page":"1527","DOI":"10.1162\/neco.2006.18.7.1527","article-title":"A fast learning algorithm for deep belief nets","volume":"18","author":"Hinton","year":"2006","journal-title":"Neural Computation"},{"key":"10.7717\/peerj-cs.967\/ref-15","first-page":"216","article-title":"Improved automatic keyword extraction given more linguistic knowledge","author":"Hulth","year":"2003"},{"issue":"3","key":"10.7717\/peerj-cs.967\/ref-16","doi-asserted-by":"publisher","first-page":"153","DOI":"10.5391\/IJFIS.2015.15.3.153","article-title":"Latent keyphrase extraction using deep belief networks","volume":"15","author":"Jo","year":"2015","journal-title":"International Journal of Fuzzy Logic and Intelligent Systems"},{"key":"10.7717\/peerj-cs.967\/ref-17","doi-asserted-by":"publisher","first-page":"137090","DOI":"10.1109\/ACCESS.2019.2942322","article-title":"SwICS: section-wise in-text citation score","volume":"7","author":"Khan","year":"2019","journal-title":"IEEE 
Access"},{"issue":"5","key":"10.7717\/peerj-cs.967\/ref-18","doi-asserted-by":"publisher","first-page":"604","DOI":"10.1145\/324133.324140","article-title":"Authoritative sources in a hyperlinked environment","volume":"46","author":"Kleinberg","year":"1999","journal-title":"Journal of the ACM (JACM)"},{"key":"10.7717\/peerj-cs.967\/ref-19","first-page":"9","article-title":"BERT for named entity recognition in contemporary and historical German","author":"Labusch","year":"2019"},{"key":"10.7717\/peerj-cs.967\/ref-20","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/W16-1609","article-title":"An empirical evaluation of doc2vec with practical insights into document embedding generation","author":"Lau","year":"2016"},{"key":"10.7717\/peerj-cs.967\/ref-21","first-page":"185","article-title":"Predicting abstract keywords by word vectors","author":"Li","year":"2015"},{"key":"10.7717\/peerj-cs.967\/ref-22","first-page":"257","article-title":"Clustering to find exemplar terms for keyphrase extraction","author":"Liu","year":"2009"},{"key":"10.7717\/peerj-cs.967\/ref-23","article-title":"Deep key phrase generation","author":"Meng","year":"2017"},{"key":"10.7717\/peerj-cs.967\/ref-24","first-page":"404","article-title":"Textrank: bringing order into text","author":"Mihalcea","year":"2004"},{"key":"10.7717\/peerj-cs.967\/ref-25","article-title":"Efficient estimation of word representations in vector space","author":"Mikolov","year":"2013"},{"key":"10.7717\/peerj-cs.967\/ref-26","article-title":"Unsupervised learning of sentence embeddings using compositional n-gram features","author":"Pagliardini","year":"2017"},{"issue":"6","key":"10.7717\/peerj-cs.967\/ref-27","doi-asserted-by":"publisher","first-page":"888","DOI":"10.1016\/j.ipm.2018.06.004","article-title":"Local word vectors guiding keyphrase extraction","volume":"54","author":"Papagiannopoulou","year":"2018","journal-title":"Information Processing & 
Management"},{"key":"10.7717\/peerj-cs.967\/ref-28","first-page":"154","article-title":"Single document keyphrase extraction using sentence clustering and latent dirichlet allocation","author":"Pasquier","year":"2010"},{"key":"10.7717\/peerj-cs.967\/ref-29","first-page":"83","article-title":"A language-independent approach to keyphrase extraction and evaluation","author":"Paukkeri","year":"2008"},{"key":"10.7717\/peerj-cs.967\/ref-30","first-page":"1532","article-title":"Glove: global vectors for word representation","author":"Pennington","year":"2014"},{"key":"10.7717\/peerj-cs.967\/ref-31","doi-asserted-by":"publisher","first-page":"5554874","DOI":"10.1155\/2021\/5554874","article-title":"Exploiting contextual word embedding of authorship and title of articles for discovering citation intent classification","volume":"2021","author":"Roman","year":"2021","journal-title":"Complexity"},{"key":"10.7717\/peerj-cs.967\/ref-32","first-page":"1","article-title":"Automatic keyword extraction from individual documents","volume":"1","author":"Rose","year":"2010","journal-title":"Text Mining: Applications and Theory"},{"issue":"1","key":"10.7717\/peerj-cs.967\/ref-33","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1108\/eb026526","article-title":"A statistical interpretation of term specificity and its application in retrieval","volume":"28","author":"Sparck Jones","year":"1972","journal-title":"Journal of Documentation"},{"key":"10.7717\/peerj-cs.967\/ref-34","doi-asserted-by":"crossref","first-page":"e389-e389","DOI":"10.7717\/peerj-cs.389","article-title":"FNG-IE: an improved graph-based method for keyword extraction from scholarly big-data","volume":"7","author":"Tahir","year":"2021","journal-title":"PeerJ Computer Science"},{"key":"10.7717\/peerj-cs.967\/ref-35","first-page":"855","article-title":"Single document keyphrase extraction using neighborhood 
knowledge","author":"Wan","year":"2008"},{"key":"10.7717\/peerj-cs.967\/ref-36","first-page":"857","article-title":"Keyword extraction based on pagerank","author":"Wang","year":"2007"},{"key":"10.7717\/peerj-cs.967\/ref-37","first-page":"934","article-title":"PKU_ICL at SemEval-2017 task 10: Keyphrase extraction with model ensemble and external knowledge","author":"Wang","year":"2017"},{"key":"10.7717\/peerj-cs.967\/ref-38","first-page":"54","article-title":"Keyword extraction using word co-occurrence","author":"Wartena","year":"2010"},{"key":"10.7717\/peerj-cs.967\/ref-39","first-page":"254","article-title":"KEA: practical automatic keyphrase extraction","author":"Witten","year":"1999"},{"key":"10.7717\/peerj-cs.967\/ref-40","first-page":"836","article-title":"Keyphrase extraction using deep recurrent neural networks on twitter","author":"Zhang","year":"2016"}],"container-title":["PeerJ Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/peerj.com\/articles\/cs-967.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-967.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-967.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-967.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,30]],"date-time":"2022-05-30T07:23:24Z","timestamp":1653895404000},"score":1,"resource":{"primary":{"URL":"https:\/\/peerj.com\/articles\/cs-967"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,5,30]]},"references-count":40,"alternative-id":["10.7717\/peerj-cs.967"],"URL":"https:\/\/doi.org\/10.7717\/peerj-cs.967","archive":["CLOCKSS","LOCKSS","Portico"],"relation":{},"ISSN":["2
376-5992"],"issn-type":[{"value":"2376-5992","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,5,30]]},"article-number":"e967"}}