Comment Mining, Popularity Prediction, and Social Network Analysis
dc.contributor.advisor | Rangwala, Huzefa | |
dc.contributor.author | Jamali, Salman | |
dc.creator | Jamali, Salman | |
dc.date | 2009-12-17 | |
dc.date.accessioned | 2010-05-14T19:11:36Z | |
dc.date.available | NO_RESTRICTION | |
dc.date.available | 2010-05-14T19:11:36Z | |
dc.date.issued | 2010-05-14T19:11:36Z | |
dc.description.abstract | With the growing number of online collaborative news aggregator websites, we witness thousands of comments posted by the internet community on individual news items shared on such networks. We started out with the objective of exhaustively analyzing these comments to extract insightful information about their various collective aspects. For our study, we worked with data from one of the most popular news aggregator websites, Digg. Using egonet analysis to project local neighborhoods, we identified the characteristics of highly active individual users with and without time constraints. The time-based egonets effectively improved our ability to visualize variations in user activity patterns. We proposed a framework for applying data mining techniques to these comments (and comment threads), which helped us predict the popularity of news stories. We observed a very small loss of 1.0-4.0% in multiclass classification accuracy when predicting the popularity score using only the first few hours of comment data, compared with using all the available comment data. We found that the Digg community was highly active in posting comments and that its focus was spread across a wide range of topics. We also performed a comparative analysis of two network formations: co-participation and reply-answer. This helped us compare the implicit networks we derived against the characteristic attributes of social networks. Further, we conducted preliminary experiments to improve the strength of a link in our co-participation network by analyzing the positive, negative, or neutral sentiments expressed by users in their comments. One important application of our work lies in providing unique and rich information to advertisers, enabling them to target certain commenters as potential customers. Our framework can also be adapted to forewarn web administrators of a potential Digg Effect (Section 8.1). | |
dc.identifier.uri | https://hdl.handle.net/1920/5810 | |
dc.language.iso | en_US | |
dc.subject | Social Networking Analysis | |
dc.subject | Digg | |
dc.subject | Comment mining | |
dc.subject | Social bookmarking | |
dc.title | Comment Mining, Popularity Prediction, and Social Network Analysis | |
dc.type | Thesis | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | George Mason University | |
thesis.degree.level | Master's | |
thesis.degree.name | Master of Science in Computer Science |