Bert Model for Social Media Bot Detection
Date
2022-03
Authors
Heidari, Maryam
Jones, James H Jr.
Abstract
Millions of online posts about different topics and products are shared on popular social media platforms. One use of this content is to provide crowd-sourced information about a specific topic, event, or product. However, this use raises an important question: what percentage of the information available through these services is trustworthy? In particular, might some of this information be generated by a machine, i.e., a "bot", instead of a human? Bots can be, and often are, purposely designed to generate enough volume to skew an apparent trend or position on a topic, yet the consumer of such content cannot easily distinguish a bot post from a human post. This paper introduces a new model that uses Bidirectional Encoder Representations from Transformers (Google BERT) for sentiment classification of tweets to identify topic-independent features for a social media bot detection model. Deriving topic-independent features with a Natural Language Processing approach distinguishes this work from previous bot detection models. We achieve 94% accuracy in classifying the contents of the Cresci data set (Cresci et al., 2017) as generated by a bot or a human, where the most accurate prior work achieved an accuracy of 92%.
Keywords
Bot detection, Natural Language Processing, Neural Network, Social Media