#COVID-19 Searching for a Relationship between Twitter Sentiment and Infectious Disease




Digital health data, such as social media data, have shown potential for identifying outbreaks faster than official records of disease incidence. The objective of this thesis is to examine the relationship between COVID-19-related Tweet sentiment and COVID-19 cases over space and time, and to assess the extent to which Twitter-derived sentiment can be used for local COVID-19 surveillance in the United States. To our knowledge, no existing study examines the relationship between Tweet sentiment and infectious disease cases at a spatially local level. Sentiment is computed for US counties over time using 56,755,894 Tweets from the TBCOV dataset. The relationship between Tweet sentiment and COVID-19 cases in each county is examined globally, over time, and over space using Pearson's correlation coefficient (r). A negative association was observed between COVID-19 cases and the sentiment polarity of COVID-19 Tweets, but only in some regions of the US and only for part of the study period. Further research is needed to understand the cause of the spatially and temporally non-stationary correlations between Twitter sentiment and COVID-19 cases. Such an understanding would allow for the identification of when and where Twitter sentiment could be used as an early warning signal for disease outbreaks.
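
As a rough illustration of the kind of per-county correlation described above, the following minimal sketch computes Pearson's r between a sentiment series and a case-count series for each county. It is not the thesis's actual pipeline: the record layout, the function names `pearson_r` and `correlation_by_county`, and the toy values are all assumptions made for this example.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def correlation_by_county(records):
    """Group (county, sentiment, cases) records and correlate within each county.

    The record layout is hypothetical; a real dataset would need its own
    aggregation of Tweets into per-period sentiment values first.
    """
    series = defaultdict(lambda: ([], []))
    for county, sentiment, cases in records:
        series[county][0].append(sentiment)
        series[county][1].append(cases)
    return {county: pearson_r(s, c) for county, (s, c) in series.items()}


# Toy data (invented for illustration): mean sentiment polarity and case
# counts per time period for two hypothetical counties.
records = [
    ("A", 0.4, 10), ("A", 0.2, 20), ("A", 0.0, 30),  # sentiment falls as cases rise
    ("B", 0.1, 5), ("B", 0.2, 10), ("B", 0.3, 15),   # sentiment rises with cases
]
result = correlation_by_county(records)
```

In this toy data, county A shows a negative association and county B a positive one, which mirrors the idea that the sign and strength of the correlation can differ from place to place.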