On Privacy in Spatio-Temporal Data: User Identification Using Geotagged Social Media

Authors

Seglem, Erik

Abstract

Location data is among the most privacy-sensitive data about the users it describes. To collect location data, mobile phones and other mobile devices constantly track their positions. This work examines whether publicly available spatio-temporal user data can be used to link newly observed location data to known user profiles. For this study, publicly available location information about Twitter users is used to construct spatio-temporal user profiles that describe a user's movement in space and time. The work shows how to use these profiles to match a new location trace to its user with high accuracy, and how to link the users of two different trace data sets. The case study considers 15,989 of the most prolific Twitter users in London in 2014. The experimental results show that the classification approach correctly identifies 98% of the 500 most prolific of these users. Furthermore, it correctly identifies more than 50% of all users from only three observations per user, rather than their whole location trace. This alarming result shows that spatio-temporal data is highly discriminative, putting the privacy of hundreds of millions of geo-social network users at risk. The work further shows that most Instagram users can be correctly matched to Twitter users.
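To make the idea of matching a trace to a profile concrete, the following is a minimal sketch and not the method used in this work. It assumes a profile is simply a count of visits per grid cell, ignores the time dimension entirely, and compares a new trace to each profile by cosine similarity. The cell size, function names, and sample coordinates are all hypothetical.

```python
import math
from collections import Counter

CELL_SIZE = 0.01  # grid resolution in degrees (an assumed value)


def to_cell(lat, lon, cell_size=CELL_SIZE):
    """Map a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


def build_profile(points):
    """Count how often a user was observed in each grid cell."""
    return Counter(to_cell(lat, lon) for lat, lon in points)


def cosine(a, b):
    """Cosine similarity between two sparse cell-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(trace, profiles):
    """Return the user whose profile is most similar to the new trace."""
    query = build_profile(trace)
    return max(profiles, key=lambda user: cosine(query, profiles[user]))


# Hypothetical profiles built from made-up coordinates in London.
profiles = {
    "alice": build_profile([(51.505, -0.125)] * 3),
    "bob": build_profile([(51.555, -0.055)] * 3),
}

# A new trace of three observations, two of them in alice's usual cell.
trace = [(51.506, -0.124), (51.504, -0.126), (51.555, -0.055)]
best_match = identify(trace, profiles)
```

Even in this toy setting, three observations are enough to separate the two users, because their visited cells barely overlap. A realistic profile would also need to model when each location is visited.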

Keywords

Social media, User identification, Twitter, Privacy, Point patterns