Please use this identifier to cite or link to this item:
http://10.1.7.192:80/jspui/handle/123456789/4848
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Joshi, Rutu | - |
dc.date.accessioned | 2014-08-19T07:53:38Z | - |
dc.date.available | 2014-08-19T07:53:38Z | - |
dc.date.issued | 2014-06-01 | - |
dc.identifier.uri | http://hdl.handle.net/123456789/4848 | - |
dc.description.abstract | Classification of web pages is essential for improving the quality of web search, focused crawling, and the development of web directories such as Yahoo and ODP. This paper compares various classification techniques for the task of web page classification. The classification techniques compared include k-nearest neighbours (KNN), Naive Bayes (NB), support vector machine (SVM), classification and regression trees (CART), random forest (RF), and particle swarm optimization (PSO). The impact of using different representations of web pages is also studied. The representations used comprise Boolean, bag-of-words, and term frequency-inverse document frequency (TFIDF). Experiments are performed using the WebKB and R8 datasets. Accuracy and F-measure are used as the evaluation measures. The impact of feature selection on the accuracy of the classifier is also demonstrated. | en_US |
dc.publisher | Institute of Technology | en_US |
dc.relation.ispartofseries | 12MCEC11; | - |
dc.subject | Computer 2012 | en_US |
dc.subject | Project Report 2012 | en_US |
dc.subject | Computer Project Report | en_US |
dc.subject | Project Report | en_US |
dc.subject | 12MCE | en_US |
dc.subject | 12MCEC | en_US |
dc.subject | 12MCEC11 | en_US |
dc.title | Web Page Classification | en_US |
dc.type | Dissertation | en_US |
Appears in Collections: | Dissertation, CE |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
12MCEC11.pdf | 12MCEC11 | 640.75 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.