From Language to Location using Multiple Instance Neural Networks

dc.contributor.advisor: Rangwala, Huzefa
dc.contributor.author: Nagpaul, Sneha
dc.creator: Nagpaul, Sneha
dc.date: 2018-01-17
dc.date.accessioned: 2018-06-11T20:23:10Z
dc.date.available: 2018-06-11T20:23:10Z
dc.description.abstract: Given the deluge of crowd-generated content from social media websites, the complexity of extracting information from it has increased manifold. An important characteristic of such text is its location of origin, which can in turn be used to respond to emergencies such as floods and crimes. The patterns discovered by geolocating unstructured social media text can also be used commercially for targeted advertising and recommender systems. This work deals with geolocating short texts from social media that are labeled with a user's information. However, instead of locating the user, who can be viewed as a collection of these texts, it focuses on locating each individual text, here a tweet. For this task, the problem is described within the multiple instance learning framework, and a novel neural network approach is designed that trains a tweet-level classifier using only user location labels. The model outperforms the state of the art in multiple instance learning and provides significant scalability and speedup compared to existing methods. Going beyond the Bag of Words models prevalent in prior geolocation research, the intuitive tweet-level neural network classifier discovers complex features such as grammar and identifies place names without feature engineering.
dc.identifier: doi:10.13021/G8PQ5Z
dc.identifier.uri: https://hdl.handle.net/1920/10993
dc.language.iso: en
dc.subject: NLP
dc.subject: Text geolocation
dc.subject: Neural networks
dc.subject: MIL
dc.title: From Language to Location using Multiple Instance Neural Networks
dc.type: Thesis
thesis.degree.discipline: Computer Science
thesis.degree.grantor: George Mason University
thesis.degree.level: Master's
thesis.degree.name: Master of Science in Computer Science

Files

Original bundle
Name: Nagpaul_thesis_2018.pdf
Size: 3.47 MB
Format: Adobe Portable Document Format