Keyword Extraction for Hindi language

Please use this identifier to cite or link to this item: http://10.1.7.192:80/jspui/handle/123456789/11368

Title:	Keyword Extraction for Hindi language
Authors:	Saxena, Namit
Keywords:	Computer 2020; Project Report 2020; Computer Project Report; Project Report; 20MCEI; 20MCEI10; INS; INS 2020; CE (INS)
Issue Date:	1-Jun-2022
Publisher:	Institute of Technology
Series/Report no.:	20MCEI10;
Abstract:	In the internet era, Hindi text documents are created daily by many sources, such as government sites, public and private sector organizations, and news portals. The volume is large enough that these documents need to be classified accurately into labeled categories. Hindi text processing therefore has many applications, and there is considerable scope for extracting information from Hindi documents and assigning them to predefined categories. This study proposes a method for extracting keywords from Hindi documents using unsupervised learning. In this approach, n-grams are first generated for a given document. The text is then passed to a language model (the LaBSE model) to better capture the contextual information in its sentences. Finally, the resulting document vector is compared with the n-gram vectors to find the keywords most relevant to the document. The experiment was performed on four categories: sports, business, entertainment, and science.
URI:	http://10.1.7.192:80/jspui/handle/123456789/11368
Appears in Collections:	Dissertation, CE (INS)
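
The pipeline outlined in the abstract (candidate n-grams, embeddings, similarity ranking) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names, the whitespace tokenizer, the sample sentence, and the use of cosine similarity are all assumptions, and `toy_embed` is a hashed character-trigram stand-in that only keeps the sketch runnable without downloading LaBSE.

```python
import math
import zlib
from typing import Callable, List, Tuple

Vector = List[float]


def candidate_ngrams(text: str, n_min: int = 1, n_max: int = 2) -> List[str]:
    """Return unique word n-grams in order of first appearance.

    Assumption: plain whitespace tokenization, with the danda treated as a
    separator. A real system would likely also filter Hindi stop words.
    """
    tokens = text.replace("।", " ").split()
    seen = {}
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            seen.setdefault(" ".join(tokens[i:i + n]), None)
    return list(seen)


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Hypothetical stand-in encoder: hashed character-trigram counts.

    This is NOT LaBSE; it only mimics the "text in, vector out" interface.
    """
    vec = [0.0] * dim
    padded = f" {text} "
    for i in range(len(padded) - 2):
        bucket = zlib.crc32(padded[i:i + 3].encode("utf-8")) % dim
        vec[bucket] += 1.0
    return vec


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_keywords(
    text: str,
    embed: Callable[[str], Vector] = toy_embed,
    top_k: int = 5,
    n_min: int = 1,
    n_max: int = 2,
) -> List[Tuple[str, float]]:
    """Rank candidate n-grams by similarity to the whole-document vector."""
    doc_vec = embed(text)
    scored = [
        (cand, cosine(embed(cand), doc_vec))
        for cand in candidate_ngrams(text, n_min, n_max)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Made-up sample sentence, not taken from the dissertation's data.
    sample = "भारत ने क्रिकेट मैच जीता। क्रिकेट टीम ने शानदार प्रदर्शन किया।"
    for phrase, score in extract_keywords(sample, top_k=3):
        print(f"{score:.3f}  {phrase}")
```

To approximate the setup the abstract describes, the `embed` argument could be replaced with a LaBSE encoder, for example one loaded through the sentence-transformers package as `SentenceTransformer("sentence-transformers/LaBSE")` with its `encode` output converted to a list. Whether the dissertation used that package, or this similarity measure, is not stated in the record.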

Files in This Item:

File	Description	Size	Format
20MCEI10.pdf	20MCEI10	1.86 MB	Adobe PDF

IR @ Nirma University