Extracting chemical food safety hazards from the scientific literature automatically using large language models

Read original: arXiv:2405.15787 - Published 5/28/2024 by Neris Ozen, Wenjuan Mu, Esther D. van Asselt, Leonieke M. van den Bulk

Overview

The number of scientific articles on food safety has been increasing over the past few decades, making it challenging for experts to keep up with the latest findings.
This study explores using large language models to automatically extract information about chemical hazards from scientific literature, without requiring specialized training or extensive computing resources.
The researchers tested three prompting strategies to find the one that worked best for this task, using abstracts on five foods: leafy greens and shellfish for validation, and dairy, maize, and salmon for testing.

Plain English Explanation

There has been a growing amount of scientific research published on food safety in recent years. This makes it difficult for experts who work in food safety to stay informed about the latest findings and potential hazards in the food supply.

To address this challenge, the researchers in this study used a large language model - a type of AI system that can understand and generate human-like text. They tested different ways of asking the language model questions to see which approach worked best for extracting information about chemical contaminants in various food types, such as leafy greens, shellfish, dairy, maize, and salmon.

The key insight was that the specific wording of the questions, or "prompts," used to query the language model had a significant impact on how well it could identify relevant chemical hazards. A prompt that broke the task down into smaller steps performed the best overall, reaching an average accuracy of 93%. This suggests that large language models can be very useful tools for automatically gathering important food safety information from the scientific literature.

Technical Explanation

The researchers used a pre-trained large language model out of the box, with no additional training and no large computing cluster. They experimented with three different prompting strategies to see which one worked best for extracting information about chemical hazards from scientific abstracts related to food safety.

The prompts were optimized and validated using abstracts related to two food categories - leafy greens and shellfish. The performance of the best prompt was then evaluated on three additional test foods: dairy, maize, and salmon.
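The split between validation foods and test foods can be sketched as a small selection routine. This is a minimal illustration, not the authors' code: the function names and the `score` callable (which stands in for whatever per-food accuracy measure is used) are assumptions introduced here.

```python
from statistics import mean
from typing import Callable, Dict

# Food groups as described in the study.
VALIDATION_FOODS = ["leafy greens", "shellfish"]
TEST_FOODS = ["dairy", "maize", "salmon"]

# Hypothetical signature: score(prompt_text, food) -> accuracy in [0, 1].
Scorer = Callable[[str, str], float]


def select_best_prompt(prompts: Dict[str, str], score: Scorer) -> str:
    """Return the name of the prompt with the highest mean score
    on the validation foods only."""
    return max(
        prompts,
        key=lambda name: mean(score(prompts[name], food) for food in VALIDATION_FOODS),
    )


def evaluate_on_test(prompt: str, score: Scorer) -> float:
    """Average the score of one already-chosen prompt over the test foods."""
    return mean(score(prompt, food) for food in TEST_FOODS)
```

Keeping the test foods out of `select_best_prompt` is the point of the design: the final figure is then measured on foods that played no part in choosing the prompt.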

The results showed that the specific wording of the prompt had a significant impact on the language model's ability to accurately identify relevant chemical contaminants. A prompt that broke the task down into smaller steps, providing more guidance to the model, performed the best overall with an average accuracy of 93%.
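To make the idea of a step-by-step prompt concrete, here is a minimal sketch of how such a prompt might be assembled and how a model's reply might be parsed. The step wording, the semicolon-separated output format, and both function names are assumptions made for illustration; the summary does not give the exact prompt text used in the study.

```python
from typing import List


def build_stepwise_prompt(abstract: str, food: str) -> str:
    """Assemble a prompt that splits hazard extraction into small steps.
    The wording of the steps is hypothetical."""
    steps = [
        f"1. Decide whether the abstract concerns {food}.",
        "2. List every chemical substance mentioned in the abstract.",
        "3. Keep only the substances described as contaminants or hazards.",
        "4. Return the remaining substances as a semicolon-separated list, or NONE.",
    ]
    return (
        "Follow these steps in order.\n"
        + "\n".join(steps)
        + "\n\nAbstract:\n"
        + abstract.strip()
    )


def parse_hazards(response: str) -> List[str]:
    """Turn a semicolon-separated reply into a de-duplicated, lowercased list."""
    text = response.strip()
    if not text or text.upper() == "NONE":
        return []
    hazards: List[str] = []
    for item in text.split(";"):
        name = item.strip().lower()
        if name and name not in hazards:
            hazards.append(name)
    return hazards
```

The prompt string would be sent to whichever language model is in use; that call is left out here because it depends on the provider.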

This level of performance supports the potential of large language models for automating the extraction of food safety information from the growing body of scientific literature. Many of the extracted chemical contaminants are already included in food monitoring programs, which further indicates that the model retrieved hazards relevant to the food safety domain.
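One simple way to check extracted hazards against a monitoring program is a set overlap. The sketch below is an assumption about how such a comparison could be done, not the procedure used in the study; `monitored_share` and `normalize` are hypothetical helper names, and matching is by exact lowercased name only.

```python
from typing import Iterable, Set


def normalize(names: Iterable[str]) -> Set[str]:
    """Lowercase and trim names, dropping empty entries."""
    return {n.strip().lower() for n in names if n.strip()}


def monitored_share(extracted: Iterable[str], monitored: Iterable[str]) -> float:
    """Fraction of distinct extracted hazards that also appear in the
    monitoring list. Returns 0.0 when nothing was extracted."""
    found = normalize(extracted)
    if not found:
        return 0.0
    return len(found & normalize(monitored)) / len(found)
```

Exact name matching will miss synonyms and abbreviations of the same substance, so a real comparison would likely need a synonym table or expert review.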

Critical Analysis

The researchers acknowledged that their approach has some limitations. For example, they only evaluated the model's performance on scientific abstracts, not full research papers, which may contain more detailed information. Additionally, the test food categories were relatively narrow, and the model's performance may vary for a broader range of food types.

Another potential concern is the reliability and potential biases of large language models, which can sometimes produce inaccurate or harmful outputs. The researchers did not address these issues in depth, and further research may be needed to ensure the safety and trustworthiness of using such models for critical food safety applications.

Furthermore, the study did not explore the integration of domain-specific chemical knowledge into the language model, which could potentially enhance its performance and reliability for this task. Evaluating the model's performance on a more diverse, multilingual dataset could also provide valuable insights.

Conclusion

This study demonstrates the potential of using large language models to automate the extraction of relevant food safety information from the scientific literature. By optimizing the prompting strategy, the researchers were able to achieve high accuracy in identifying chemical hazards across several food categories.

The findings suggest that this approach could be a valuable tool for food safety experts, enabling them to efficiently stay informed about the latest research and potential threats in the food supply. However, further research is needed to address the limitations and ensure the reliability and safety of such systems for real-world food safety applications.

Related Papers

Extracting chemical food safety hazards from the scientific literature automatically using large language models

Neris Ozen, Wenjuan Mu, Esther D. van Asselt, Leonieke M. van den Bulk

The number of scientific articles published in the domain of food safety has consistently been increasing over the last few decades. It has therefore become unfeasible for food safety experts to read all relevant literature related to food safety and the occurrence of hazards in the food chain. However, it is important that food safety experts are aware of the newest findings and can access this information in an easy and concise way. In this study, an approach is presented to automate the extraction of chemical hazards from the scientific literature through large language models. The large language model was used out-of-the-box and applied on scientific abstracts; no extra training of the models or a large computing cluster was required. Three different styles of prompting the model were tested to assess which was the most optimal for the task at hand. The prompts were optimized with two validation foods (leafy greens and shellfish) and the final performance of the best prompt was evaluated using three test foods (dairy, maize and salmon). The specific wording of the prompt was found to have a considerable effect on the results. A prompt breaking the task down into smaller steps performed best overall. This prompt reached an average accuracy of 93% and contained many chemical contaminants already included in food monitoring programs, validating the successful retrieval of relevant hazards for the food safety domain. The results showcase how valuable large language models can be for the task of automatic information extraction from the scientific literature.

5/28/2024

Enhancing Food Safety in Supply Chains: The Potential Role of Large Language Models in Preventing Campylobacter Contamination

Asaf Tzachor

Foodborne diseases pose a significant global public health challenge, primarily driven by bacterial infections. Among these, Campylobacter spp. is notable, causing over 95 million cases annually. In response, the Hazard Analysis and Critical Control Points (HACCP) system, a food safety management framework, has been developed and is considered the most effective approach for systematically managing foodborne safety risks, including the prevention of bacterial contaminations, throughout the supply chain. Despite its efficacy, the adoption of HACCP is often incomplete across different sectors of the food industry. This limited implementation can be attributed to factors such as a lack of awareness, complex guidelines, confusing terminology, and insufficient training on the HACCP system's implementation. This study explores the potential of large language models (LLMs), specifically generative pre-trained transformers (GPTs), to mitigate Campylobacter contamination across four typical stages of the supply chain: primary production, food processing, distribution and retail, and preparation and consumption. While the interaction between LLMs and food safety presents a promising potential, it remains largely underexplored. To demonstrate the possible applications of LLMs in this domain, we further configure an open-access customized GPT trained on the FAO's HACCP toolbox and the 12 steps of HACCP implementation, and test it in the context of commercial food preparation. The study also considers critical barriers to implementing GPTs at each step of the supply chain and proposes initial measures to overcome these obstacles.

6/11/2024

Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes

Luis Rita, Josh Southern, Ivan Laponogov, Kyle Higgins, Kirill Veselkov

In the emerging field of computational gastronomy, aligning culinary practices with scientifically supported nutritional goals is increasingly important. This study explores how large language models (LLMs) can be applied to optimize ingredient substitutions in recipes, specifically to enhance the phytochemical content of meals. Phytochemicals are bioactive compounds found in plants, which, based on preclinical studies, may offer potential health benefits. We fine-tuned models, including OpenAI's GPT-3.5, DaVinci, and Meta's TinyLlama, using an ingredient substitution dataset. These models were used to predict substitutions that enhance phytochemical content and create a corresponding enriched recipe dataset. Our approach improved Hit@1 accuracy on ingredient substitution tasks, from the baseline 34.53 ± 0.10% to 38.03 ± 0.28% on the original GISMo dataset, and from 40.24 ± 0.36% to 54.46 ± 0.29% on a refined version of the same dataset. These substitutions led to the creation of 1,951 phytochemically enriched ingredient pairings and 1,639 unique recipes. While this approach demonstrates potential in optimizing ingredient substitutions, caution must be taken when drawing conclusions about health benefits, as the claims are based on preclinical evidence. Future work should include clinical validation and broader datasets to further evaluate the nutritional impact of these substitutions. This research represents a step forward in using AI to promote healthier eating practices, providing potential pathways for integrating computational methods with nutritional science.

9/16/2024

Are large language models superhuman chemists?

Adrian Mirza, Nawaf Alampara, Sreekanth Kunchapu, Benedict Emoekabu, Aswanth Krishnan, Mara Wilhelmi, Macjonathan Okereke, Juliane Eberhardt, Amir Mohammad Elahi, Maximilian Greiner, Caroline T. Holick, Tanya Gupta, Mehrdad Asgari, Christina Glaubitz, Lea C. Klepsch, Yannik Koster, Jakob Meyer, Santiago Miret, Tim Hoffmann, Fabian Alexander Kreth, Michael Ringleb, Nicole Roesner, Ulrich S. Schubert, Leanne M. Stafast, Dinga Wonanke, Michael Pieler, Philippe Schwaller, Kevin Maik Jablonka

Large language models (LLMs) have gained widespread interest due to their ability to process human language and perform tasks on which they have not been explicitly trained. This is relevant for the chemical sciences, which face the problem of small and diverse datasets that are frequently in the form of text. LLMs have shown promise in addressing these issues and are increasingly being harnessed to predict chemical properties, optimize reactions, and even design and conduct experiments autonomously. However, we still have only a very limited systematic understanding of the chemical reasoning capabilities of LLMs, which would be required to improve models and mitigate potential harms. Here, we introduce ChemBench, an automated framework designed to rigorously evaluate the chemical knowledge and reasoning abilities of state-of-the-art LLMs against the expertise of human chemists. We curated more than 7,000 question-answer pairs for a wide array of subfields of the chemical sciences, evaluated leading open and closed-source LLMs, and found that the best models outperformed the best human chemists in our study on average. The models, however, struggle with some chemical reasoning tasks that are easy for human experts and provide overconfident, misleading predictions, such as about chemicals' safety profiles. These findings underscore the dual reality that, although LLMs demonstrate remarkable proficiency in chemical tasks, further research is critical to enhancing their safety and utility in chemical sciences. Our findings also indicate a need for adaptations to chemistry curricula and highlight the importance of continuing to develop evaluation frameworks to improve safe and useful LLMs.

4/3/2024