Please use this identifier to cite or link to this item:
http://10.1.7.192:80/jspui/handle/123456789/11368
Title: | Keyword Extraction for Hindi language |
Authors: | Saxena, Namit |
Keywords: | Computer 2020; Project Report 2020; Computer Project Report; Project Report; 20MCEI; 20MCEI10; INS; INS 2020; CE (INS) |
Issue Date: | 1-Jun-2022 |
Publisher: | Institute of Technology |
Series/Report no.: | 20MCEI10 |
Abstract: | In the internet era, Hindi-language text documents are created daily by many sources, such as government sites, public- and private-sector organizations, and news portals. The volume is so large that these documents must be classified correctly into labeled categories. Hindi text processing therefore has many applications, and there is considerable scope for extracting information from Hindi documents and assigning them to predefined categories. This study proposes a method for extracting keywords from Hindi documents using unsupervised learning. The approach first generates n-grams for a given document. The text is then passed to a language model (LaBSE) to better capture contextual information in sentences. Finally, the resulting vector representation is compared with the n-gram vector space to find the relevant keywords in the document. The experiment was performed on four categories: sport, business, entertainment, and science. |
URI: | http://10.1.7.192:80/jspui/handle/123456789/11368 |
Appears in Collections: | Dissertation, CE (INS) |
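
The pipeline outlined in the abstract (candidate n-grams, embeddings, similarity between the document and each n-gram) can be illustrated with a minimal sketch. This is not the author's implementation. The function names (`candidate_ngrams`, `toy_embed`, `extract_keywords`), the whitespace tokenization, the cosine-similarity ranking, and the `top_k` parameter are all assumptions made for illustration. To stay self-contained, the sketch uses a hashed character-trigram embedder as a stand-in for LaBSE.

```python
import math
import zlib
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def candidate_ngrams(text: str, n_min: int = 1, n_max: int = 2) -> List[str]:
    """Return unique word n-grams in order of first appearance (assumed tokenization)."""
    tokens = text.replace("।", " ").split()  # treat the Hindi danda as a separator
    seen, grams = set(), []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            if gram not in seen:
                seen.add(gram)
                grams.append(gram)
    return grams


def toy_embed(texts: Sequence[str], dim: int = 64) -> List[Vector]:
    """Stand-in embedder (NOT LaBSE): hashed counts of character trigrams."""
    vectors = []
    for text in texts:
        vec = [0.0] * dim
        padded = f" {text} "
        for i in range(len(padded) - 2):
            bucket = zlib.crc32(padded[i:i + 3].encode("utf-8")) % dim
            vec[bucket] += 1.0
        vectors.append(vec)
    return vectors


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_keywords(
    text: str,
    embed: Callable[[Sequence[str]], List[Vector]] = toy_embed,
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Rank candidate n-grams by similarity to the whole-document embedding."""
    candidates = candidate_ngrams(text)
    doc_vec = embed([text])[0]
    cand_vecs = embed(candidates)
    scored = [(c, cosine(doc_vec, v)) for c, v in zip(candidates, cand_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    sample = "भारत ने क्रिकेट मैच जीता । भारत की टीम ने शानदार खेल दिखाया"
    for phrase, score in extract_keywords(sample, top_k=3):
        print(f"{score:.3f}  {phrase}")
```

To approximate the setup the abstract describes, the `embed` argument could be replaced with a LaBSE encoder, for example one loaded through the `sentence-transformers` package as `SentenceTransformer("sentence-transformers/LaBSE")` and wrapped so that its `encode` output is converted to lists. Whether the report uses that package, cosine similarity, or these n-gram lengths is not stated in this record.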
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
20MCEI10.pdf | 20MCEI10 | 1.86 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.