A Study on Scaling Up Multilingual News Framing Analysis

2404.01481

Published 4/3/2024 by Syeda Sabrina Akter, Antonios Anastasopoulos
Abstract

Media framing is the study of strategically selecting and presenting specific aspects of political issues to shape public opinion. Despite its relevance to almost all societies around the world, research has been limited due to the lack of available datasets and other resources. This study explores the possibility of dataset creation through crowdsourcing, utilizing non-expert annotators to develop training corpora. We first extend framing analysis beyond English news to a multilingual context (12 typologically diverse languages) through automatic translation. We also present a novel benchmark in Bengali and Portuguese on the immigration and same-sex marriage domains. Additionally, we show that a system trained on our crowd-sourced dataset, combined with other existing ones, leads to a 5.32 percentage point increase from the baseline, showing that crowdsourcing is a viable option. Last, we study the performance of large language models (LLMs) for this task, finding that task-specific fine-tuning is a better approach than employing bigger non-specialized models.

Overview

  • This paper explores scaling up the analysis of news article framing across multiple languages.
  • It focuses on developing a dataset and methods for multilingual news framing analysis at scale.
  • The goal is to enable deeper understanding of how news stories are framed differently across languages and cultures.

Plain English Explanation

News articles don't just report facts - they also frame stories in particular ways. For example, an article about a political event might emphasize certain details or perspectives over others, shaping how readers understand the issue. This framing can vary significantly across different languages and cultural contexts.

The researchers in this study wanted to develop better tools to analyze news framing at a large scale, across many different languages. They tested whether non-expert, crowdsourced annotators can produce usable training data, extended framing analysis from English news to 12 typologically diverse languages through automatic translation, and built a new benchmark in Bengali and Portuguese. Data of this kind can be used to train machine learning models to automatically detect framing patterns.

Being able to analyze framing at scale in multiple languages is important because it can reveal biases and differences in how the same events are portrayed around the world. This could help media consumers be more aware of potential biases, and media organizations be more conscious of how they frame stories. Ultimately, this research aims to promote greater transparency and cross-cultural understanding through multilingual news analysis.

Technical Explanation

The key elements of this research are:

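  1. Dataset Creation: The researchers used crowdsourcing with non-expert annotators to build training corpora for framing analysis. They extended framing analysis beyond English news to 12 typologically diverse languages through automatic translation, and introduced a new benchmark in Bengali and Portuguese covering the immigration and same-sex marriage domains. Articles were annotated by human raters for the framing used, based on a taxonomy of 15 common framing dimensions.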

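  2. Framing Detection Models: Using the annotated data, the researchers trained machine learning models to automatically detect the framing dimensions present in news articles. A system trained on the crowd-sourced dataset combined with existing datasets improved on the baseline by 5.32 percentage points. The researchers also evaluated large language models (LLMs) and found that task-specific fine-tuning works better than using bigger non-specialized models.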

  3. Cross-lingual Framing Analysis: The trained models were used to analyze patterns in how news stories are framed across languages. The researchers identified common framing dimensions that tend to co-occur, as well as differences in framing between language groups.
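
As a rough illustration of the dataset-creation step, the sketch below merges several non-expert annotations per article into a single label by majority vote. This is an assumption for illustration only: the summary does not say how the authors aggregated crowd labels, and the function names, article IDs, and frame labels here are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None when first place is tied."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: leave the article for expert adjudication
    return counts[0][0]


def aggregate(annotations):
    """Map each article ID to its majority label.

    annotations: dict of article_id -> list of frame labels,
    one label per annotator (hypothetical input format).
    """
    return {aid: majority_vote(labels) for aid, labels in annotations.items()}


# Hypothetical crowd annotations for two articles.
annotations = {
    "article_1": ["Economic", "Economic", "Morality"],
    "article_2": ["Legality", "Morality"],
}
aggregated = aggregate(annotations)
```

Returning None on ties keeps ambiguous articles out of the training set instead of picking a label arbitrarily, which is one simple way to limit noise from non-expert annotators.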

The insights from this work can enable deeper, more systematic analysis of how media outlets around the world portray events and issues differently. This has important implications for media literacy, journalism ethics, and cross-cultural understanding.

Critical Analysis

The researchers acknowledge several limitations and areas for further work:

  • The dataset, while large, may not fully capture the diversity of news framing, especially for less common framing dimensions. Expanding the dataset further could improve model performance.
  • The framing taxonomy used was developed for English news, and may not perfectly translate to all languages. Refining the taxonomy for multilingual use could yield better annotations.
  • The study focused on broad, high-level framing patterns. Analyzing more granular, article-specific framing would require additional annotation efforts.
  • While the models demonstrated reasonable performance, there is likely room for improvement through more advanced neural architectures or multilingual techniques.
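
The point about less common framing dimensions can be made concrete with per-label scores. The sketch below computes per-label F1 and macro-averaged F1 in plain Python for a single-label setting. It is a minimal illustration under assumed inputs; the labels and predictions are invented, and the summary does not state which metric the paper reports.

```python
def per_label_f1(gold, pred):
    """Compute F1 separately for every label seen in gold or pred."""
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of per-label F1, so rare labels count equally."""
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical data: the rare "Morality" frame is never predicted.
gold = ["Economic", "Economic", "Economic", "Morality"]
pred = ["Economic", "Economic", "Economic", "Economic"]

scores = per_label_f1(gold, pred)
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
macro = macro_f1(scores)
```

In this toy case accuracy is 0.75 while the rare label scores 0, which pulls the macro average well below accuracy. A gap like this is one signal that more examples of the rare frames are needed.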

Overall, this research represents an important step towards scalable, multilingual analysis of news framing. Continued work in this area could lead to transformative tools for media analysis and greater cross-cultural understanding.

Conclusion

This paper presents a novel approach for scaling up the analysis of news framing across multiple languages. By developing a large, annotated dataset and training machine learning models to detect framing patterns, the researchers have laid the groundwork for more systematic, cross-cultural investigations of media bias and narrative construction.

The ability to analyze how the same events are portrayed differently around the world has significant implications. It can promote media literacy, inform journalistic ethics, and foster greater cross-cultural empathy and understanding. While further refinements are needed, this research represents an important step towards unlocking the potential of multilingual news analysis.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Detection

Valeria Pastorino, Jasivan A. Sivakumar, Nafise Sadat Moosavi

Previous studies on framing have relied on manual analysis or fine-tuning models with limited annotated datasets. However, pre-trained models, with their diverse training backgrounds, offer a promising alternative. This paper presents a comprehensive analysis of GPT-4, GPT-3.5 Turbo, and FLAN-T5 models in detecting framing in news headlines. We evaluated these models in various scenarios: zero-shot, few-shot with in-domain examples, cross-domain examples, and settings where models explain their predictions. Our results show that explainable predictions lead to more reliable outcomes. GPT-4 performed exceptionally well in few-shot settings but often misinterpreted emotional language as framing, highlighting a significant challenge. Additionally, the results suggest that consistent predictions across multiple models could help identify potential annotation inaccuracies in datasets. Finally, we propose a new small dataset for real-world evaluation on headlines from a diverse set of topics.

6/18/2024

A Multilingual Similarity Dataset for News Article Frame

Xi Chen, Mattia Samory, Scott Hale, David Jurgens, Przemyslaw A. Grabowicz

Understanding the writing frame of news articles is vital for addressing social issues, and thus has attracted notable attention in the fields of communication studies. Yet, assessing such news article frames remains a challenge due to the absence of a concrete and unified standard dataset that considers the comprehensive nuances within news content. To address this gap, we introduce an extended version of a large labeled news article dataset with 16,687 new labeled pairs. Leveraging the pairwise comparison of news articles, our method frees the work of manual identification of frame classes in traditional news frame analysis studies. Overall we introduce the most extensive cross-lingual news article similarity dataset available to date with 26,555 labeled news article pairs across 10 languages. Each data point has been meticulously annotated according to a codebook detailing eight critical aspects of news content, under a human-in-the-loop framework. Application examples demonstrate its potential in unearthing country communities within global news coverage, exposing media bias among news outlets, and quantifying the factors related to news creation. We envision that this news similarity dataset will broaden our understanding of the media ecosystem in terms of news coverage of events and perspectives across countries, locations, languages, and other social constructs. By doing so, it can catalyze advancements in social science research and applied methodologies, thereby exerting a profound impact on our society.

5/24/2024

Connecting the Dots in News Analysis: Bridging the Cross-Disciplinary Disparities in Media Bias and Framing

Gisela Vallejo, Timothy Baldwin, Lea Frermann

The manifestation and effect of bias in news reporting have been central topics in the social sciences for decades, and have received increasing attention in the NLP community recently. While NLP can help to scale up analyses or contribute automatic procedures to investigate the impact of biased news in society, we argue that methodologies that are currently dominant fall short of addressing the complex questions and effects addressed in theoretical media studies. In this survey paper, we review social science approaches and draw a comparison with typical task formulations, methods, and evaluation metrics used in the analysis of media bias in NLP. We discuss open questions and suggest possible directions to close identified gaps between theory and predictive models, and their evaluation. These include model transparency, considering document-external information, and cross-document reasoning rather than single-label assignment.

6/21/2024

Evaluating the Ability of Computationally Extracted Narrative Maps to Encode Media Framing

Sebastián Concha Macías, Brian Keith Norambuena

Narratives serve as fundamental frameworks in our understanding of the world and play a crucial role in collaborative sensemaking, providing a versatile foundation for sensemaking. Framing is a subtle yet potent mechanism that influences public perception through specific word choices, shaping interpretations of reported news events. Despite the recognized importance of narratives and framing, a significant gap exists in the literature with regard to the explicit consideration of framing within the context of computational extraction and representation. This article explores the capabilities of a specific narrative extraction and representation approach -- narrative maps -- to capture framing information from news data. The research addresses two key questions: (1) Does the narrative extraction method capture the framing distribution of the data set? (2) Does it produce a representation with consistent framing? Our results indicate that while the algorithm captures framing distributions, achieving consistent framing across various starting and ending events poses challenges. Our results highlight the potential of narrative maps to provide users with insights into the intricate framing dynamics within news narratives. However, we note that directly leveraging framing information in the computational narrative extraction process remains an open challenge.

5/7/2024