Bert Model for Social Media Bot Detection




Heidari, Maryam
Jones, James H Jr.

Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event, or product. However, this use raises an important question: what percentage of the information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a "bot", instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. This paper introduces a new model that uses Bidirectional Encoder Representations from Transformers (Google BERT) for sentiment classification of tweets to identify topic-independent features for social media bot detection. Deriving topic-independent features through a Natural Language Processing approach distinguishes this work from previous bot detection models. We achieve 94% accuracy in classifying the contents of the Cresci data set (Cresci et al., 2017) as generated by a bot or a human, where the most accurate prior work achieved an accuracy of 92%.



Bot detection, Natural Language Processing, Neural Network, Social Media