Geo-Fingerprinting of Social Media Content



Gazaz, Hatim




With Twitter users approaching 20% of the US population by 2019, tweets provide a good sample of public sentiment and opinion. Consequently, such data has been used extensively in commercial and research efforts. While prior work has analyzed the content of tweets in relation to the underlying social network of a discussion, somewhat less attention has been paid to the spatial distribution of messages and topics. This thesis assesses the locality of discussions using the concepts mentioned in tweets. Starting from a global distribution of topics across the 48 contiguous states, spatial topic dissimilarity is discovered by recursively subdividing the space into ever smaller partitions and using statistical testing to compare each partition's topic distribution against the global one. In experiments with a large Twitter dataset for the US, locality of a discussion was observed to occur at specific thresholds, and only 14 of the 49 most populous urban areas featured a unique discussion. Overall, this work establishes trends as to when locality in a social media discussion occurs.
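The recursive subdivision described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not name the partitioning scheme or the statistical test, so the quadtree split, the Pearson chi-square statistic, and all names (`chi_square_stat`, `subdivide`, `critical`, `min_points`, `max_depth`) are assumptions chosen for illustration.

```python
from collections import Counter


def chi_square_stat(local_counts, global_counts):
    """Pearson chi-square statistic of a cell's topic counts against the
    counts expected from the global topic proportions (assumed test)."""
    n_local = sum(local_counts.values())
    n_global = sum(global_counts.values())
    stat = 0.0
    for topic, g in global_counts.items():
        expected = n_local * g / n_global
        if expected > 0:
            stat += (local_counts.get(topic, 0) - expected) ** 2 / expected
    return stat


def subdivide(points, bounds, global_counts, critical,
              min_points=50, max_depth=6, depth=0):
    """Recursively split `bounds` into four quadrants (assumed scheme).

    points: list of (x, y, topic) tuples.
    bounds: (xmin, ymin, xmax, ymax); cells are half-open, so points lying
            exactly on the upper edges are not assigned to any child.
    Returns a list of (bounds, depth, stat) for every cell whose topic
    distribution differs from the global one (stat > critical).
    """
    if len(points) < min_points:
        return []  # too few messages for a meaningful comparison
    local_counts = Counter(topic for _, _, topic in points)
    stat = chi_square_stat(local_counts, global_counts)
    found = [(bounds, depth, stat)] if stat > critical else []
    if depth >= max_depth:
        return found
    xmin, ymin, xmax, ymax = bounds
    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    quadrants = [
        (xmin, ymin, xmid, ymid), (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax), (xmid, ymid, xmax, ymax),
    ]
    for q in quadrants:
        inside = [p for p in points
                  if q[0] <= p[0] < q[2] and q[1] <= p[1] < q[3]]
        found.extend(subdivide(inside, q, global_counts, critical,
                               min_points, max_depth, depth + 1))
    return found
```

In this sketch `critical` is supplied by the caller; for two topics (one degree of freedom) at the 5% level the chi-square critical value is 3.841. The depth at which a cell is first flagged plays the role of the locality threshold mentioned in the abstract, although how the thesis defines its thresholds is not stated here.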



Social media, Twitter, Geospatial, Entity extraction