Comment Mining, Popularity Prediction, and Social Network Analysis

dc.contributor.advisor: Rangwala, Huzefa
dc.contributor.author: Jamali, Salman
dc.creator: Jamali, Salman
dc.date: 2009-12-17
dc.date.accessioned: 2010-05-14T19:11:36Z
dc.date.available: NO_RESTRICTION
dc.date.available: 2010-05-14T19:11:36Z
dc.date.issued: 2010-05-14T19:11:36Z
dc.description.abstract: With the growing number of online collaborative news aggregator websites, thousands of comments are posted by the internet community on individual news items shared on such networks. We set out to analyze these comments exhaustively and extract insightful information about their collective aspects. For our study, we worked with data from one of the most popular news aggregator websites, Digg. Using egonet analysis to project local neighborhoods, we identified the characteristics of highly active individual users with and without time constraints. The time-based egonets improved our ability to visualize variations in user activity patterns. We proposed a framework that applies data mining techniques to these comments (and comment threads), which helped us predict the popularity of news stories. When predicting the popularity score from only the first few hours of comment data, rather than from all the available comment data, we observed a very small loss of 1.0-4.0% in multiclass classification accuracy. We found that the Digg community was highly active in posting comments and that its focus was spread across a wide range of topics. We also performed a comparative analysis of two network formations: co-participation and reply-answer. This allowed us to compare the implicit networks we derived against the characteristic attributes of social networks. Further, we conducted preliminary experiments to improve the strength of a link in our co-participation network by analyzing the positive, negative, or neutral sentiments expressed by users in their comments. One important application of our work lies in providing unique and rich information to advertisers, enabling them to target certain commenters as potential customers. Our framework can also be adapted to forewarn web administrators of a potential Digg Effect (Section 8.1).
dc.identifier.uri: https://hdl.handle.net/1920/5810
dc.language.iso: en_US
dc.subject: Social Networking Analysis
dc.subject: Digg
dc.subject: Comment mining
dc.subject: Social bookmarking
dc.title: Comment Mining, Popularity Prediction, and Social Network Analysis
dc.type: Thesis
thesis.degree.discipline: Computer Science
thesis.degree.grantor: George Mason University
thesis.degree.level: Master's
thesis.degree.name: Master of Science in Computer Science

Files

Original bundle
Name: Salman Jamali - MS Thesis.pdf
Size: 1.56 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.72 KB
Format: Item-specific license agreed upon to submission