Please use this identifier to cite or link to this item: http://10.1.7.192:80/jspui/handle/123456789/11368
Title: Keyword Extraction for Hindi language
Authors: Saxena, Namit
Keywords: Computer 2020
Project Report 2020
Computer Project Report
Project Report
20MCEI
20MCEI10
INS
INS 2020
CE (INS)
Issue Date: 1-Jun-2022
Publisher: Institute of Technology
Series/Report no.: 20MCEI10
Abstract: In this internet era, text documents in the Hindi language are created daily by many sources, such as government sites, public and private sector organizations, and news portals. The volume is so large that these documents must be classified correctly into labeled categories. Various applications therefore exist for Hindi text processing, and there is considerable scope for extracting text from Hindi-language documents into predefined categories. This study proposes a method for extracting keywords from Hindi documents using unsupervised learning. The approach first creates n-grams for a given document. The text is then passed to a language model (LaBSE) to better capture the contextual information in its sentences. Finally, the resulting vector space is compared with the n-gram vector space to find the relevant keywords in the document. The experiment was performed on four categories: sport, business, entertainment, and science.
URI: http://10.1.7.192:80/jspui/handle/123456789/11368
Appears in Collections: Dissertation, CE (INS)
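
The pipeline outlined in the abstract (generate n-grams, embed the document and the n-grams, then compare the two in vector space) can be illustrated with a minimal sketch. It is not the report's implementation. The function names, the use of cosine similarity as the comparison, the 1-to-2 word n-gram range, and the sample sentence are all assumptions made for illustration. The `toy_embed` function is a hypothetical stand-in that hashes character trigrams so the example runs without downloads; it is not LaBSE, and a real LaBSE encoder would be passed in its place.

```python
import math
import zlib
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Embedder = Callable[[List[str]], List[Vector]]


def candidate_ngrams(text: str, n_min: int = 1, n_max: int = 2) -> List[str]:
    """Return unique word n-grams, in order of first appearance."""
    # Treat the Hindi danda as a separator, then split on whitespace.
    tokens = text.replace("।", " ").split()
    seen = set()
    grams: List[str] = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            if gram not in seen:
                seen.add(gram)
                grams.append(gram)
    return grams


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_keywords(text: str, embed: Embedder, top_k: int = 5,
                     n_min: int = 1, n_max: int = 2) -> List[Tuple[str, float]]:
    """Rank candidate n-grams by similarity to the whole-document vector."""
    candidates = candidate_ngrams(text, n_min, n_max)
    if not candidates:
        return []
    vectors = embed([text] + candidates)
    doc_vec, cand_vecs = vectors[0], vectors[1:]
    scored = [(c, cosine(doc_vec, v)) for c, v in zip(candidates, cand_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_embed(texts: List[str], dim: int = 64) -> List[Vector]:
    """Stand-in embedder (NOT LaBSE): hashed character-trigram counts."""
    vectors: List[Vector] = []
    for t in texts:
        vec = [0.0] * dim
        padded = f" {t} "
        for i in range(len(padded) - 2):
            bucket = zlib.crc32(padded[i:i + 3].encode("utf-8")) % dim
            vec[bucket] += 1.0
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    # Illustrative sentence only; not taken from the report's dataset.
    document = "भारत ने क्रिकेट मैच जीता । क्रिकेट मैच रोमांचक था ।"
    for phrase, score in extract_keywords(document, toy_embed, top_k=3):
        print(f"{score:.3f}  {phrase}")
```

Because `extract_keywords` accepts any callable that maps a list of strings to a list of vectors, a sentence encoder such as LaBSE could be supplied in place of `toy_embed` without changing the ranking logic. The trigram hashing captures only surface overlap, so its rankings should not be read as indicative of what a contextual model would return.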

Files in This Item:
File: 20MCEI10.pdf
Description: 20MCEI10
Size: 1.86 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.