From Language to Location using Multiple Instance Neural Networks



Nagpaul, Sneha



Given the deluge of crowd-generated content from social media websites, the complexity of extracting information from it has increased manifold. An important characteristic of such text is its location of origin, which can in turn be used to respond to emergencies such as floods and crimes. The patterns discovered by geolocating unstructured social media text can also be used commercially for targeted advertising and recommender systems. This work deals with geolocating short social media texts that are labeled with a user's information. However, instead of locating the user, who can be viewed as a collection of these texts, it focuses on locating each individual text, here a tweet. For this task, the problem is described within the multiple instance learning (MIL) framework, and a novel neural network approach is designed that trains a tweet-level classifier using only user location labels. The model outperforms the state of the art in multiple instance learning and provides significant scalability and speedup compared to existing methods. Going beyond the bag-of-words models prevalent in prior geolocation research, the intuitive tweet-level neural network classifier discovers complex features such as grammar and identifies place names without feature engineering.
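The multiple instance setup described above can be illustrated with a minimal sketch. It is not the model from this work: the abstract does not state how tweet-level predictions are combined, so mean pooling is an assumption here, and the linear softmax classifier stands in for the neural network. All names (`tweet_probs`, `user_probs`, `bag_loss`) and the toy feature vectors are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of class scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def tweet_probs(features, weights):
    # Instance-level classifier: one weight row per location class.
    # (A stand-in for the tweet-level neural network.)
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)

def user_probs(bag, weights):
    # Bag-level prediction: average the tweet-level distributions.
    # Mean pooling is an assumption, not taken from the source.
    per_tweet = [tweet_probs(f, weights) for f in bag]
    n = len(per_tweet)
    return [sum(p[c] for p in per_tweet) / n for c in range(len(weights))]

def bag_loss(bag, user_label, weights):
    # Cross-entropy against the user's location label only;
    # no tweet-level labels are needed.
    return -math.log(user_probs(bag, weights)[user_label])

# Toy example: one user (bag) with two tweets, two location classes.
weights = [[1.0, 0.0], [0.0, 1.0]]
bag = [[1.0, 0.0], [0.0, 1.0]]
loss = bag_loss(bag, user_label=0, weights=weights)
```

Because the loss is computed on the pooled prediction, minimizing it updates the tweet-level classifier, which can then be applied to single tweets at inference time through `tweet_probs`.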



NLP, Text geolocation, Neural networks, MIL