Automating the Analysis of Public Saliency and Attitudes towards Biodiversity from Digital Media

Read original: arXiv:2405.01610 - Published 5/6/2024 by Noah Giebink, Amrita Gupta, Diogo Veríssimo, Charlotte H. Chang, Tony Chang, Angela Brennan, Brett Dickson, Alex Bowmer, Jonathan Baillie

Overview

This paper presents a method for automating the analysis of public saliency and attitudes towards biodiversity using digital media data.
The researchers developed a framework to collect, filter, and analyze news articles and posts from X (formerly Twitter) related to biodiversity.
They used natural language processing techniques to measure the volume, topics, and sentiment of public discussion about wildlife.

Plain English Explanation

The researchers in this study wanted to find a way to automatically understand how the general public feels and thinks about biodiversity. They recognized that people are increasingly sharing their thoughts and opinions about environmental issues online, through social media, news articles, and other digital content. By analyzing this wealth of digital data, the researchers believed they could gain valuable insights into the public's level of awareness, sentiment, and priorities when it comes to biodiversity.

To do this, the researchers created a framework that could collect and analyze relevant content from various online sources. [This connects to the work discussed in https://aimodels.fyi/papers/arxiv/social-media-artificial-intelligence-sustainable-cities-societies and https://aimodels.fyi/papers/arxiv/ecoverse-annotated-twitter-dataset-eco-relevance-classification.] They used natural language processing techniques to extract information about the public's attitudes, emotions, and concerns related to biodiversity. This allowed them to get a sense of which biodiversity-related topics were most salient and how people tended to feel about them.

The goal of this research was to develop a systematic way to monitor and understand public discourse around biodiversity, which could then inform conservation efforts and policy decisions. By tapping into the massive amounts of online data, the researchers aimed to gain a more comprehensive and up-to-date picture of the public's perspectives on this important environmental issue.

Technical Explanation

The researchers developed a framework for collecting and analyzing digital media content related to biodiversity. Their case study draws on two data sources: news articles and posts from X (formerly Twitter).

To collect the data, they generated search terms with a folk taxonomy approach rather than curating keywords by hand, then filtered out syndicated articles using cosine similarity on Term Frequency-Inverse Document Frequency (TF-IDF) vectors. [This ties into the work described in https://aimodels.fyi/papers/arxiv/named-entity-recognition-topic-modeling-based-solution.] A relevance filtering pipeline then used unsupervised learning to reveal common topics and an open-source zero-shot Large Language Model (LLM) to assign those topics to news article titles, which determined whether each article was relevant to biodiversity. Finally, the researchers ran sentiment, topic, and volume analyses on the filtered data.
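
The paper's abstract mentions filtering syndicated articles with cosine similarity on TF-IDF vectors. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the tokenizer, the smoothed IDF formula, the 0.8 threshold, and the sample headlines are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) so ubiquitous terms keep a nonzero weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def drop_syndicated(docs, threshold=0.8):
    """Keep the first copy of each article; drop later near-duplicates.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    vectors = tfidf_vectors(docs)
    kept = []  # indices of retained documents
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return [docs[i] for i in kept]


# Invented headlines, for illustration only.
articles = [
    "Horseshoe bats roost in caves across Southeast Asia, researchers say.",
    "Horseshoe bats roost in caves across Southeast Asia, researchers say. (Reuters)",
    "Elephant populations recover in a Kenyan reserve after poaching declines.",
]
unique_articles = drop_syndicated(articles)
```

In this sketch the second headline is treated as a syndicated copy of the first because the two differ only by a wire-service tag. A real corpus would likely need a tuned threshold and a faster pairwise comparison.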

By analyzing patterns in the filtered data, the researchers identified the most salient biodiversity-related topics and how the public felt about them. In their case study, they compared volume and sentiment for several mammal taxa, including bats, pangolins, elephants, and gorillas, before and during the COVID-19 pandemic.
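
One way to picture a comparison across time periods is to group scored documents by taxon and period, then compare volume and mean sentiment. The sketch below assumes each document already carries a sentiment score in [-1, 1]; the cutoff date, the function names, and the scores are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical cutoff separating "before" from "during" the pandemic.
PANDEMIC_ONSET = date(2020, 1, 1)


def summarize(records, onset=PANDEMIC_ONSET):
    """Group (taxon, date, sentiment) records by taxon and period.

    Returns {taxon: {period: {"volume": int, "mean_sentiment": float}}}.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for taxon, published, sentiment in records:
        period = "before" if published < onset else "during"
        buckets[taxon][period].append(sentiment)
    return {
        taxon: {
            period: {"volume": len(scores), "mean_sentiment": mean(scores)}
            for period, scores in periods.items()
        }
        for taxon, periods in buckets.items()
    }


def sentiment_shift(summary, taxon):
    """Mean sentiment during minus before; None if either period is missing."""
    periods = summary.get(taxon, {})
    if "before" not in periods or "during" not in periods:
        return None
    return periods["during"]["mean_sentiment"] - periods["before"]["mean_sentiment"]


# Made-up scores, for illustration only.
records = [
    ("horseshoe bat", date(2019, 6, 1), 0.5),
    ("horseshoe bat", date(2020, 2, 1), -0.5),
    ("horseshoe bat", date(2020, 3, 1), -0.25),
    ("elephant", date(2019, 7, 1), 0.25),
    ("elephant", date(2020, 2, 15), 0.25),
]
summary = summarize(records)
bat_shift = sentiment_shift(summary, "horseshoe bat")
elephant_shift = sentiment_shift(summary, "elephant")
```

A difference in means is only a descriptive summary. Judging whether a shift is statistically significant would require a proper test on a real sample, which this sketch leaves out.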

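The researchers illustrated their framework with this case study. During the data collection period, up to 62% of articles containing bat-related keywords were deemed irrelevant to biodiversity, which underscores the importance of relevance filtering. At the pandemic's onset, volume increased and sentiment shifted significantly for horseshoe bats, which were implicated in the pandemic, but not for the other focal taxa. This approach could be valuable for conservation organizations, policymakers, and researchers who need to stay informed about evolving public perceptions and priorities related to biodiversity.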
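
Relevance filtering can be pictured as two steps: assign each title a topic, then keep only the titles whose topic counts as relevant. The abstract describes an open-source zero-shot LLM doing the topic assignment. The sketch below swaps in a simple keyword stub so that it runs without any model, and every topic label, keyword, and title in it is a hypothetical stand-in rather than something taken from the paper.

```python
from typing import Callable, List, Tuple

# Hypothetical labels; the paper derives its topics with unsupervised learning.
RELEVANT_TOPICS = {"wildlife", "conservation"}


def keyword_topic_stub(title: str) -> str:
    """Stand-in for a zero-shot LLM call: assign one topic label to a title."""
    lowered = title.lower()
    if any(word in lowered for word in ("baseball", "cricket", "league")):
        return "sports"
    if any(word in lowered for word in ("batman", "movie", "film")):
        return "entertainment"
    if any(word in lowered for word in ("habitat", "protect", "endangered")):
        return "conservation"
    return "wildlife"  # arbitrary default for this stub


def filter_relevant(
    titles: List[str], classify: Callable[[str], str]
) -> Tuple[List[str], float]:
    """Return the relevant titles and the share of titles that were dropped."""
    relevant = []
    dropped = 0
    for title in titles:
        if classify(title) in RELEVANT_TOPICS:
            relevant.append(title)
        else:
            dropped += 1
    total = len(relevant) + dropped
    share_dropped = dropped / total if total else 0.0
    return relevant, share_dropped


# Invented titles that all match the keyword "bat", for illustration only.
titles = [
    "Bat colony loses habitat to cave tourism",
    "New Batman film breaks box office records",
    "Slugger breaks his bat in league final",
    "Fruit bats pollinate durian trees",
]
relevant_titles, share_dropped = filter_relevant(titles, keyword_topic_stub)
```

Because `classify` is passed in as a function, the stub could be replaced by a call to an actual zero-shot model without changing `filter_relevant`.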

Critical Analysis

The researchers acknowledge several limitations and areas for further research. For example, they note that their analysis was limited to text-based content and did not include multimedia formats, such as images and videos, which could also contain valuable insights about public attitudes towards biodiversity.

Additionally, the researchers highlight the challenge of determining the reliability and representativeness of the online data they collected. [This connects to the discussions in https://aimodels.fyi/papers/arxiv/z-agi-labs-at-climateactivism-2024-stance and https://aimodels.fyi/papers/arxiv/surveying-attitudinal-alignment-between-large-language-models.] There may be biases or limitations in who is actively participating in online discussions about biodiversity, which could skew the findings.

The researchers also suggest that future studies could explore the use of multimodal analysis techniques, combining text, images, and other data sources, to gain a more comprehensive understanding of public saliency and attitudes towards biodiversity. Incorporating additional contextual information, such as demographic data and behavioral insights, could also help refine the analysis and provide more nuanced interpretations.

Conclusion

This study presents a promising approach for automating the analysis of public saliency and attitudes towards biodiversity using digital media data. By tapping into the wealth of online content, the researchers were able to develop a framework for systematically monitoring and understanding the evolving public discourse around this important environmental issue.

The insights gained from this type of analysis could be invaluable for conservation organizations, policymakers, and researchers who need to stay informed about the public's priorities and concerns related to biodiversity. As digital media continues to play an increasingly prominent role in shaping public opinion, this automated approach could become an essential tool for informing and guiding biodiversity conservation efforts.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Automating the Analysis of Public Saliency and Attitudes towards Biodiversity from Digital Media

Noah Giebink, Amrita Gupta, Diogo Veríssimo, Charlotte H. Chang, Tony Chang, Angela Brennan, Brett Dickson, Alex Bowmer, Jonathan Baillie

Measuring public attitudes toward wildlife provides crucial insights into our relationship with nature and helps monitor progress toward Global Biodiversity Framework targets. Yet, conducting such assessments at a global scale is challenging. Manually curating search terms for querying news and social media is tedious, costly, and can lead to biased results. Raw news and social media data returned from queries are often cluttered with irrelevant content and syndicated articles. We aim to overcome these challenges by leveraging modern Natural Language Processing (NLP) tools. We introduce a folk taxonomy approach for improved search term generation and employ cosine similarity on Term Frequency-Inverse Document Frequency vectors to filter syndicated articles. We also introduce an extensible relevance filtering pipeline which uses unsupervised learning to reveal common topics, followed by an open-source zero-shot Large Language Model (LLM) to assign topics to news article titles, which are then used to assign relevance. Finally, we conduct sentiment, topic, and volume analyses on resulting data. We illustrate our methodology with a case study of news and X (formerly Twitter) data before and during the COVID-19 pandemic for various mammal taxa, including bats, pangolins, elephants, and gorillas. During the data collection period, up to 62% of articles including keywords pertaining to bats were deemed irrelevant to biodiversity, underscoring the importance of relevance filtering. At the pandemic's onset, we observed increased volume and a significant sentiment shift toward horseshoe bats, which were implicated in the pandemic, but not for other focal taxa. The proposed methods open the door to conservation practitioners applying modern and emerging NLP tools, including LLMs out of the box, to analyze public perceptions of biodiversity during current events or campaigns.

5/6/2024

Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case

Muhammad Asif Auyb, Muhammad Tayyab Zamir, Imran Khan, Hannia Naseem, Nasir Ahmad, Kashif Ahmad

This paper focuses on a very important societal challenge of water quality analysis. Being one of the key factors in the economic and social development of society, the provision of water and ensuring its quality has always remained one of the top priorities of public authorities. To ensure the quality of water, different methods for monitoring and assessing the water networks, such as offline and online surveys, are used. However, these surveys have several limitations, such as the limited number of participants and low frequency due to the labor involved in conducting such surveys. In this paper, we propose a Natural Language Processing (NLP) framework to automatically collect and analyze water-related posts from social media for data-driven decisions. The proposed framework is composed of two components, namely (i) text classification, and (ii) topic modeling. For text classification, we propose a merit-fusion-based framework incorporating several Large Language Models (LLMs) where different weight selection and optimization methods are employed to assign weights to the LLMs. In topic modeling, we employed the BERTopic library to discover the hidden topic patterns in the water-related tweets. We also analyzed relevant tweets originating from different regions and countries to explore global, regional, and country-specific issues and water-related concerns. We also collected and manually annotated a large-scale dataset, which is expected to facilitate future research on the topic.

4/24/2024

Connecting the Dots in News Analysis: Bridging the Cross-Disciplinary Disparities in Media Bias and Framing

Gisela Vallejo, Timothy Baldwin, Lea Frermann

The manifestation and effect of bias in news reporting have been central topics in the social sciences for decades, and have received increasing attention in the NLP community recently. While NLP can help to scale up analyses or contribute automatic procedures to investigate the impact of biased news in society, we argue that methodologies that are currently dominant fall short of addressing the complex questions and effects addressed in theoretical media studies. In this survey paper, we review social science approaches and draw a comparison with typical task formulations, methods, and evaluation metrics used in the analysis of media bias in NLP. We discuss open questions and suggest possible directions to close identified gaps between theory and predictive models, and their evaluation. These include model transparency, considering document-external information, and cross-document reasoning rather than single-label assignment.

6/21/2024

A Flexible and Scalable Approach for Collecting Wildlife Advertisements on the Web

Juliana Barbosa, Sunandan Chakraborty, Juliana Freire

Wildlife traffickers are increasingly carrying out their activities in cyberspace. As they advertise and sell wildlife products in online marketplaces, they leave digital traces of their activity. This creates a new opportunity: by analyzing these traces, we can obtain insights into how trafficking networks work as well as how they can be disrupted. However, collecting such information is difficult. Online marketplaces sell a very large number of products and identifying ads that actually involve wildlife is a complex task that is hard to automate. Furthermore, given that the volume of data is staggering, we need scalable mechanisms to acquire, filter, and store the ads, as well as to make them available for analysis. In this paper, we present a new approach to collect wildlife trafficking data at scale. We propose a data collection pipeline that combines scoped crawlers for data discovery and acquisition with foundational models and machine learning classifiers to identify relevant ads. We describe a dataset we created using this pipeline which is, to the best of our knowledge, the largest of its kind: it contains almost a million ads obtained from 41 marketplaces, covering 235 species and 20 languages. The source code is publicly available at https://github.com/VIDA-NYU/wildlife_pipeline.

7/29/2024