Seasonality Patterns in 311-Reported Foodborne Illness Cases and Machine Learning-Identified Indications of Foodborne Illnesses from Yelp Reviews, New York City, 2022-2023

Read original: arXiv:2405.06138 - Published 5/13/2024 by Eden Shaveet, Crystal Su, Daniel Hsu, Luis Gravano

Overview

  • This paper examines the seasonality patterns of foodborne illness cases reported through New York City's 311 system and identifies potential indications of foodborne illnesses from Yelp reviews using machine learning techniques.
  • The researchers analyzed data from 2022-2023 to uncover insights that could aid in the early detection and prevention of foodborne disease outbreaks.

Plain English Explanation

The paper looks at how the number of foodborne illness cases reported to New York City's 311 system changes throughout the year. It also uses machine learning to try to identify signs of foodborne illness in online reviews of restaurants and food establishments.

The goal is to find patterns in when foodborne illnesses are most common and see if online reviews can provide early warning signs of potential outbreaks. This information could help public health officials be more proactive in monitoring and responding to foodborne disease risks.

For example, related research on early detection of disease outbreaks and non-outbreaks from incidence data has shown that statistical features of a time series can distinguish the two well before an outbreak occurs. This paper asks a related question of a non-traditional source: whether Yelp reviews could be a useful additional signal for identifying foodborne illness trends.

Technical Explanation

The researchers analyzed 311 service request data from New York City to identify cases related to foodborne illness from 2022-2023. They looked for seasonal patterns and trends in the volume of these reports over the course of the year.
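As a rough illustration of what looking for seasonal patterns in report volume can mean, here is a minimal Python sketch that groups report dates into seasons and counts them. It is not the authors' code: the season boundaries, the function name `count_by_season`, and the sample dates are all assumptions made up for this example.

```python
from collections import Counter
from datetime import date

# Assumed mapping from calendar month to meteorological season.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}
SEASONS = ("winter", "spring", "summer", "fall")


def count_by_season(report_dates):
    """Count reports per season from an iterable of datetime.date objects."""
    counts = Counter(SEASON_BY_MONTH[d.month] for d in report_dates)
    # Include every season so that empty seasons show up as zero.
    return {season: counts.get(season, 0) for season in SEASONS}


# Made-up report dates for illustration only (not real 311 data).
sample_dates = [
    date(2022, 1, 15), date(2022, 6, 3), date(2022, 7, 21),
    date(2022, 8, 9), date(2022, 10, 30), date(2023, 7, 4),
]
seasonal_counts = count_by_season(sample_dates)
print(seasonal_counts)
```

With real data, the same grouping could be done per month or per week instead, which is closer to the finer temporal scales the authors recommend for future work.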

In parallel, the team applied natural language processing and machine learning to Yelp restaurant reviews from the same period. According to the paper's abstract, the model was a Hierarchical Sigmoid Attention Network (HSAN). The goal was to detect language indicative of foodborne illness, such as mentions of food poisoning, vomiting, or diarrhea. The signal extracted from the reviews was then compared with the 311 data to see whether the two sources showed matching patterns.
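The two steps in this paragraph, flagging reviews and then comparing the result with 311 counts, can be sketched in a few lines of Python. This is a deliberately naive stand-in and not the paper's method: a keyword pattern replaces the trained classifier, Pearson correlation is an assumed choice of association measure, and the reviews and monthly counts are invented for illustration.

```python
import re
from math import sqrt

# Assumed symptom vocabulary; a trained classifier would replace this.
SYMPTOM_PATTERN = re.compile(
    r"\b(?:food poisoning|vomit\w*|diarrhea|nause\w*)\b",
    re.IGNORECASE,
)


def mentions_illness(review_text):
    """Return True if the review contains any of the assumed symptom terms."""
    return SYMPTOM_PATTERN.search(review_text) is not None


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented reviews for illustration only.
reviews = [
    "Great pizza, friendly staff.",
    "I got food poisoning after the oysters.",
    "Was vomiting all night after eating here.",
    "The sick beats made dinner fun.",
]
flags = [mentions_illness(text) for text in reviews]

# Invented monthly totals: flagged reviews vs. 311 reports.
flagged_reviews_per_month = [2, 3, 5, 8, 6, 4]
reports_311_per_month = [10, 12, 15, 20, 18, 11]
r = pearson_r(flagged_reviews_per_month, reports_311_per_month)
print(flags, round(r, 3))
```

A keyword match misses reviews that describe illness indirectly and cannot tell a complaint from a joke, which is one reason a learned classifier is used in practice. A correlation on a handful of monthly points also says little on its own; a significance test would be needed before drawing any conclusion.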

The Bayesian regression approach to estimating the impact of COVID-19 served as a methodological inspiration for parts of this analysis. The team also drew on prior work, such as the explainable machine learning model for predicting shellfish toxicity in the Adriatic Sea, to inform their machine learning approach.

Critical Analysis

The paper provides a thorough analysis of the seasonality trends in foodborne illness cases and explores the potential of using online review data as a complementary source of information. However, the researchers acknowledge several limitations:

  • The 311 data only captures a subset of actual foodborne illness cases, as many people may not report through that system.
  • Relying on language in Yelp reviews to detect foodborne illness has inherent challenges, as reviewers may not always explicitly mention relevant symptoms.

Additionally, the study is limited to a single city (New York) and a relatively short time frame (2022-2023). Expanding the analysis to other locations and longer time periods could yield additional insights and help validate the findings.

The research on understanding social perceptions of safety aspects on sidewalks highlights the importance of considering contextual factors and human experiences when analyzing data from non-traditional sources like online reviews.

Conclusion

This paper combines traditional 311 reporting data with machine learning analysis of online reviews to monitor foodborne illness trends. According to the paper's abstract, the authors found no evidence of significant bivariate associations between any variables of interest, and they caution that user-generated data should be complemented with other data streams and input from subject matter experts.

The authors suggest that future analyses at more granular spatial and temporal scales could reveal differences or associations that this study did not. If such signals are found, they could inform more proactive public health interventions and earlier detection of potential outbreaks. As the meal recommendation dataset MealRec+ demonstrates, leveraging diverse data streams can lead to valuable discoveries in the food and health domains.

Overall, this study highlights the potential of integrating multiple data sources and advanced analytics to enhance our understanding of complex public health challenges.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Seasonality Patterns in 311-Reported Foodborne Illness Cases and Machine Learning-Identified Indications of Foodborne Illnesses from Yelp Reviews, New York City, 2022-2023

Eden Shaveet, Crystal Su, Daniel Hsu, Luis Gravano

Restaurants are critical venues at which to investigate foodborne illness outbreaks due to shared sourcing, preparation, and distribution of foods. Formal channels to report illness after food consumption, such as 311, New York City's non-emergency municipal service platform, are underutilized. Given this, online social media platforms serve as abundant sources of user-generated content that provide critical insights into the needs of individuals and populations. We extracted restaurant reviews and metadata from Yelp to identify potential outbreaks of foodborne illness in connection with consuming food from restaurants. Because the prevalence of foodborne illnesses may increase in warmer months as higher temperatures breed more favorable conditions for bacterial growth, we aimed to identify seasonal patterns in foodborne illness reports from 311 and identify seasonal patterns of foodborne illness from Yelp reviews for New York City restaurants using a Hierarchical Sigmoid Attention Network (HSAN). We found no evidence of significant bivariate associations between any variables of interest. Given the inherent limitations of relying solely on user-generated data for public health insights, it is imperative to complement these sources with other data streams and insights from subject matter experts. Future investigations should involve conducting these analyses at more granular spatial and temporal scales to explore the presence of such differences or associations.

5/13/2024

Sentiment Polarity Analysis of Bangla Food Reviews Using Machine and Deep Learning Algorithms

Al Amin, Anik Sarkar, Md Mahamodul Islam, Asif Ahammad Miazee, Md Robiul Islam, Md Mahmudul Hoque

The Internet has become an essential tool for people in the modern world. Humans, like all living organisms, have essential requirements for survival. These include access to atmospheric oxygen, potable water, protective shelter, and sustenance. The constant flux of the world is making our existence less complicated. A significant portion of the population utilizes online food ordering services to have meals delivered to their residences. Although there are numerous methods for ordering food, customers sometimes experience disappointment with the food they receive. Our endeavor was to establish a model that could determine if food is of good or poor quality. We compiled an extensive dataset of over 1484 online reviews from prominent food ordering platforms, including Food Panda and HungryNaki. Leveraging the collected data, a rigorous assessment of various deep learning and machine learning techniques was performed to determine the most accurate approach for predicting food quality. Out of all the algorithms evaluated, logistic regression emerged as the most accurate, achieving an impressive 90.91% accuracy. The review offers valuable insights that will guide the user in deciding whether or not to order the food.

5/14/2024

COVID-19's Unequal Toll: An assessment of small business impact disparities with respect to ethnorace in metropolitan areas in the US using mobility data

Saad Mohammad Abrar, Kazi Tasnim Zinat, Naman Awasthi, Vanessa Frias-Martinez

Early in the pandemic, counties and states implemented a variety of non-pharmacological interventions (NPIs) focused on mobility, such as national lockdowns or work-from-home strategies, as it became clear that restricting movement was essential to containing the epidemic. Due to these restrictions, businesses were severely affected and in particular, small, urban restaurant businesses. In addition to that, COVID-19 has also amplified many of the socioeconomic disparities and systemic racial inequities that exist in our society. The overarching objective of this study was to examine the changes in small urban restaurant visitation patterns following the COVID-19 pandemic and associated mobility restrictions, as well as to uncover potential disparities across different racial/ethnic groups in order to understand inequities in the impact and recovery. Specifically, the two key objectives were: 1) to analyze the overall changes in restaurant visitation patterns in US metropolitan areas during the pandemic compared to a pre-pandemic baseline, and 2) to investigate differences in visitation pattern changes across Census Block Groups with majority Asian, Black, Hispanic, White, and American Indian populations, identifying any disproportionate effects. Using aggregated geolocated cell phone data from SafeGraph, we document the overall changes in small urban restaurant businesses' visitation patterns with respect to racial composition at a granularity of Census Block Groups. Our results show clear indications of reduced visitation patterns after the pandemic, with slow recoveries. Via visualizations and statistical analyses, we show that reductions in visitation patterns were the highest for small urban restaurant businesses in majority Asian neighborhoods.

5/21/2024

Early detection of disease outbreaks and non-outbreaks using incidence data

Shan Gao, Amit K. Chakraborty, Russell Greiner, Mark A. Lewis, Hao Wang

Forecasting the occurrence and absence of novel disease outbreaks is essential for disease management. Here, we develop a general model, with no real-world training data, that accurately forecasts outbreaks and non-outbreaks. We propose a novel framework, using a feature-based time series classification method to forecast outbreaks and non-outbreaks. We tested our methods on synthetic data from a Susceptible-Infected-Recovered model for slowly changing, noisy disease dynamics. Outbreak sequences give a transcritical bifurcation within a specified future time window, whereas non-outbreak (null bifurcation) sequences do not. We identified incipient differences in time series of infectives leading to future outbreaks and non-outbreaks. These differences are reflected in 22 statistical features and 5 early warning signal indicators. Classifier performance, given by the area under the receiver-operating curve, ranged from 0.99 for large expanding windows of training data to 0.7 for small rolling windows. Real-world performances of classifiers were tested on two empirical datasets, COVID-19 data from Singapore and SARS data from Hong Kong, with two classifiers exhibiting high accuracy. In summary, we showed that there are statistical features that distinguish outbreak and non-outbreak sequences long before outbreaks occur. We could detect these differences in synthetic and real-world data sets, well before potential outbreaks occur.

4/16/2024