Event Detection from Social Media for Epidemic Prediction

2404.01679

Published 5/27/2024 by Tanmay Parekh, Anh Mac, Jiarui Yu, Yuxuan Dong, Syed Shahriar, Bonnie Liu, Eric Yang, Kuan-Hao Huang, Wei Wang, Nanyun Peng and 1 other

Abstract

Social media is an easy-to-access platform providing timely updates about societal trends and events. Discussions regarding epidemic-related events such as infections, symptoms, and social interactions can be crucial for informing policymaking during epidemic outbreaks. In our work, we pioneer exploiting Event Detection (ED) for better preparedness and early warnings of any upcoming epidemic by developing a framework to extract and analyze epidemic-related events from social media posts. To this end, we curate an epidemic event ontology comprising seven disease-agnostic event types and construct a Twitter dataset SPEED with human-annotated events focused on the COVID-19 pandemic. Experimentation reveals how ED models trained on COVID-based SPEED can effectively detect epidemic events for three unseen epidemics of Monkeypox, Zika, and Dengue; while models trained on existing ED datasets fail miserably. Furthermore, we show that reporting sharp increases in the extracted events by our framework can provide warnings 4-9 weeks earlier than the WHO epidemic declaration for Monkeypox. This utility of our framework lays the foundations for better preparedness against emerging epidemics.

Overview

  • This paper presents a framework that uses Event Detection (ED) to extract and analyze epidemic-related events from social media posts, with the aim of giving early warnings of upcoming epidemics.
  • The researchers curated an ontology of seven disease-agnostic event types and built SPEED, a Twitter dataset of human-annotated events focused on COVID-19. ED models trained on SPEED detected epidemic events for three unseen epidemics: Monkeypox, Zika, and Dengue.
  • The goal is to provide an early warning system that can help public health officials prepare for and respond to emerging epidemics more effectively.

Plain English Explanation

The paper tackles the challenge of using social media data to predict the spread of infectious diseases. The researchers recognized that online conversations can contain valuable clues about emerging health issues, but extracting that information in a reliable way is difficult.

To address this, the team trained Event Detection models that read social media posts and detect mentions of epidemic-related events, such as people reporting symptoms, discussing new infections, or describing social interactions. Once these events are extracted, sharp increases in how often they appear can be reported as a warning that an outbreak may be emerging.

The goal is to provide public health agencies with an early warning system that can help them get ahead of outbreaks before they spiral out of control. If the model can reliably spot the initial signs of an epidemic emerging based on online chatter, it could give officials precious time to mobilize resources, implement containment measures, and educate the public - potentially saving many lives.

Technical Explanation

The paper proposes using Event Detection (ED) on social media posts to support epidemic preparedness. The framework has two parts: it first extracts epidemic-related events from posts and then analyzes how the volume of those events changes over time.

For the extraction part, the researchers curate an epidemic event ontology comprising seven disease-agnostic event types and construct SPEED, a Twitter dataset with human-annotated events focused on the COVID-19 pandemic. ED models are trained on SPEED to identify mentions of events such as infections, symptoms, and social interactions.

For the analysis part, the framework reports sharp increases in the number of extracted events as early warnings. For Monkeypox, these warnings came 4-9 weeks earlier than the WHO epidemic declaration.

The models were evaluated on three epidemics not seen during training: Monkeypox, Zika, and Dengue. ED models trained on the COVID-based SPEED dataset detected epidemic events effectively for all three, while models trained on existing ED datasets performed poorly.
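The abstract describes reporting sharp increases in the extracted events as the warning signal. A minimal sketch of that idea is shown here, assuming weekly counts of detected events are already available. The function name `flag_sharp_increases`, the trailing-window z-score rule, the `window` and `z_threshold` values, and the sample counts are all illustrative assumptions, not the authors' method or data.

```python
from statistics import mean, stdev


def flag_sharp_increases(weekly_counts, window=4, z_threshold=3.0):
    """Return indices of weeks whose event count is far above the trailing baseline.

    Hypothetical rule (an assumption, not taken from the paper): a week is
    flagged when its count exceeds the mean of the previous `window` weeks
    by at least `z_threshold` standard deviations.
    """
    flagged = []
    for i in range(window, len(weekly_counts)):
        baseline = weekly_counts[i - window:i]
        mu = mean(baseline)
        # Floor the spread at 1.0 so a flat baseline does not divide by zero
        # or make tiny fluctuations look like spikes.
        sigma = max(stdev(baseline), 1.0)
        z_score = (weekly_counts[i] - mu) / sigma
        if z_score >= z_threshold:
            flagged.append(i)
    return flagged


# Made-up weekly counts of detected events, for illustration only.
counts = [10, 12, 11, 9, 10, 11, 40, 85, 90]
print(flag_sharp_increases(counts))
```

A trailing baseline keeps the rule simple, but once a spike enters the window it inflates the baseline, so later weeks of a sustained rise may not be flagged again. In a real deployment the window and threshold would need tuning against historical data.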

Critical Analysis

The paper presents a promising new approach for leveraging social media data to improve epidemic preparedness. The event detection framework reports strong empirical results, including effective detection on three unseen epidemics and warnings 4-9 weeks ahead of the WHO declaration for Monkeypox, suggesting that this technique could be a valuable tool for public health monitoring.

However, several limitations and avenues for future work remain. First, the performance of the models depends heavily on the quality and coverage of the social media data used for training. If the online discussions do not sufficiently represent the actual progression of the disease, the resulting warnings may be biased or inaccurate.

Additionally, the paper does not address potential privacy and ethical concerns around the large-scale collection and analysis of social media data for disease surveillance purposes. As this technology advances, it will be crucial to ensure appropriate safeguards are in place to protect individual privacy and prevent misuse of the data.

Although the framework transferred from COVID-19 to Monkeypox, Zika, and Dengue, further research is needed to understand how it would perform across a wider range of disease outbreaks, as the patterns and signals in social media may vary considerably. Validating the method on more epidemic scenarios would be an important next step.

Conclusion

Overall, this paper presents a compelling new framework for harnessing social media data to improve epidemic preparedness. By developing models that detect epidemic-related events in online conversations and by reporting sharp increases in those events, the researchers have demonstrated a promising early warning system that could improve public health preparedness and response.

While further research is needed to address the limitations and expand the applicability of the approach, this work represents an important step forward in leveraging the wealth of data available on social media platforms to tackle critical global health challenges.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Master of Disaster: A Disaster-Related Event Monitoring System From News Streams

Junbo Huang, Ricardo Usbeck

The need for a disaster-related event monitoring system has arisen due to the societal and economic impact caused by the increasing number of severe disaster events. An event monitoring system should be able to extract event-related information from texts and discriminate event instances. We demonstrate our open-source event monitoring system, namely, Master of Disaster (MoD), which receives news streams, extracts event information, links extracted information to a knowledge graph (KG), in this case Wikidata, and discriminates event instances visually. The goal of event visualization is to group event mentions referring to the same real-world event instance so that event instance discrimination can be achieved by visual screening.

6/14/2024

Word frequency and sentiment analysis of twitter messages during Coronavirus pandemic

Nikhil Kumar Rajput, Bhavya Ahuja Grover, Vipin Kumar Rathi, Riya Bansal

The COVID-19 epidemic has had a great impact on social media conversation, especially on sites like Twitter, which has emerged as a hub for public reaction and information sharing. This paper analyzes a vast dataset of Twitter messages related to this disease, starting from January 2020. Two approaches were used: a statistical analysis of word frequencies and a sentiment analysis to gauge user attitudes. Word frequencies are modeled using unigrams, bigrams, and trigrams, with power law distribution as the fitting model. The validity of the model is confirmed through metrics like Sum of Squared Errors (SSE), R-squared ($R^2$), and Root Mean Squared Error (RMSE). High $R^2$ and low SSE/RMSE values indicate a good fit for the model. Sentiment analysis is conducted to understand the general emotional tone of Twitter users' messages. The results reveal that a majority of tweets exhibit neutral sentiment polarity, with only 2.57% expressing negative polarity.

6/4/2024

Measuring Online Emotional Reactions to Events

Siyi Guo, Zihao He, Ashwin Rao, Eugene Jang, Yuanfeixue Nan, Fred Morstatter, Jeffrey Brantingham, Kristina Lerman

The rich and dynamic information environment of social media provides researchers, policy makers, and entrepreneurs with opportunities to learn about social phenomena in a timely manner. However, using this data to understand social behavior is difficult due to the heterogeneity of topics and events discussed in the highly dynamic online information environment. To address these challenges, we present a method for systematically detecting and measuring emotional reactions to offline events using change point detection on the time series of collective affect, and further explaining these reactions using a transformer-based topic model. We demonstrate the utility of the method on a corpus of tweets from a large US metropolitan area between January and August, 2020, covering a period of great social change. We demonstrate that our method is able to disaggregate topics to measure the population's emotional and moral reactions. This capability allows for better monitoring of the population's reactions during crises using online data.

4/1/2024

Leveraging World Events to Predict E-Commerce Consumer Demand under Anomaly

Dan Kalifa, Uriel Singer, Ido Guy, Guy D. Rosin, Kira Radinsky

Consumer demand forecasting is of high importance for many e-commerce applications, including supply chain optimization, advertisement placement, and delivery speed optimization. However, reliable time series sales forecasting for e-commerce is difficult, especially during periods with many anomalies, as can often happen during pandemics, abnormal weather, or sports events. Although many time series algorithms have been applied to the task, prediction during anomalies still remains a challenge. In this work, we hypothesize that leveraging external knowledge found in world events can help overcome the challenge of prediction under anomalies. We mine a large repository of 40 years of world events and their textual representations. Further, we present a novel methodology based on transformers to construct an embedding of a day based on the relations of the day's events. Those embeddings are then used to forecast future consumer behavior. We empirically evaluate the methods over a large e-commerce products sales dataset, extracted from eBay, one of the world's largest online marketplaces. We show over numerous categories that our method outperforms state-of-the-art baselines during anomalies.

5/24/2024