Please use this identifier to cite or link to this item: http://10.1.7.192:80/jspui/handle/123456789/4860
Title: Spam Detection in Social Bookmarking Systems
Authors: Sejpal, Mittal
Keywords: Computer 2012
Project Report 2012
Computer Project Report
Project Report
12MICT
12MICT52
ICT
ICT 2012
CE (ICT)
Issue Date: 1-Jun-2014
Publisher: Institute of Technology
Series/Report no.: 12MICT52
Abstract: Social bookmarking websites have recently become popular for collecting and sharing interesting websites among users. Users add Web pages to such sites as bookmarks, which they and others can then work with. A key feature of social bookmarking sites is the ability to annotate a Web page when it is bookmarked. The annotation usually consists of a set of words or phrases, collectively known as tags, that can reveal the semantics of the annotated Web page. Such tags enable efficient and effective search of Web pages. However, spam tags that are irrelevant to the content of the Web pages are often added to deceive other users for malicious or commercial purposes. Detecting spam manually is very difficult, so the main purpose of this work is to automate it, with a focus on detecting spam users in social bookmarking systems. Experimental evaluation uses the ECML PKDD Discovery Challenge 2008 dataset. Naive Bayes and K-Nearest Neighbor classifiers are applied to three information retrieval models (Boolean, word count, and TF-IDF). Information Gain is used as the feature selection measure, and the classifiers are then trained on all three models using the selected features. The Naive Bayes classifier gives promising results with only a few attributes after feature selection.
URI: http://hdl.handle.net/123456789/4860
Appears in Collections: Dissertation, CE (ICT)
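
The pipeline outlined in the abstract (three term-weighting representations, Information Gain feature selection, then a Naive Bayes classifier) can be illustrated with a minimal sketch. This is not the report's implementation. The toy `users` data, all function names, and the choice of a multinomial Naive Bayes with Laplace smoothing are assumptions made for illustration, and the K-Nearest Neighbor classifier is omitted for brevity.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the ECML PKDD 2008 dataset):
# each user is the bag of tags they posted, label 1 = spammer, 0 = legitimate.
users = [
    (["python", "tutorial", "code"], 0),
    (["news", "python", "science"], 0),
    (["cheap", "pills", "buy", "buy"], 1),
    (["buy", "cheap", "loans"], 1),
]
vocab = sorted({t for tags, _ in users for t in tags})

def boolean_vec(tags):
    """Boolean model: 1 if the tag occurs at all, else 0."""
    present = set(tags)
    return [1 if t in present else 0 for t in vocab]

def count_vec(tags):
    """Word count model: raw tag frequency."""
    counts = Counter(tags)
    return [counts[t] for t in vocab]

def idf(term):
    """Inverse document frequency, treating each user as one document."""
    df = sum(1 for tags, _ in users if term in tags)
    return math.log(len(users) / df)

def tfidf_vec(tags):
    """TF-IDF model: tag frequency scaled by inverse document frequency."""
    counts = Counter(tags)
    return [counts[t] * idf(t) for t in vocab]

def entropy(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def information_gain(term):
    """Reduction in label entropy from knowing whether a user used `term`."""
    labels = [y for _, y in users]
    with_t = [y for tags, y in users if term in tags]
    without_t = [y for tags, y in users if term not in tags]
    n = len(users)
    cond = (len(with_t) / n) * entropy(with_t) + (len(without_t) / n) * entropy(without_t)
    return entropy(labels) - cond

def top_k_terms(k):
    """Keep the k tags with the highest Information Gain."""
    return sorted(vocab, key=information_gain, reverse=True)[:k]

def train_nb(features, vectorize):
    """Multinomial Naive Bayes restricted to the selected features."""
    idx = [vocab.index(t) for t in features]
    totals = {0: [0.0] * len(idx), 1: [0.0] * len(idx)}
    class_counts = Counter()
    for tags, label in users:
        vec = vectorize(tags)
        class_counts[label] += 1
        for j, i in enumerate(idx):
            totals[label][j] += vec[i]
    return idx, totals, class_counts

def predict_nb(model, vectorize, tags):
    idx, totals, class_counts = model
    vec = vectorize(tags)
    n = sum(class_counts.values())
    scores = {}
    for label, weights in totals.items():
        denom = sum(weights) + len(idx)  # Laplace (add-one) smoothing
        score = math.log(class_counts[label] / n)
        for j, i in enumerate(idx):
            score += vec[i] * math.log((weights[j] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

selected = top_k_terms(3)
model = train_nb(selected, count_vec)
```

Passing `boolean_vec` or `tfidf_vec` instead of `count_vec` to `train_nb` and `predict_nb` switches the representation, which is roughly how the three models could be compared under the same classifier.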

Files in This Item:
File: 12MICT52.pdf
Description: 12MICT52
Size: 813.18 kB
Format: Adobe PDF

