Verified authors shape X/Twitter discursive communities

2405.04896

Published 5/9/2024 by Stefano Guarino, Ayoub Mounim, Guido Caldarelli, Fabio Saracco

Abstract

Community detection algorithms try to extract a mesoscale structure from the available network data, generally avoiding any explicit assumption regarding the quantity and quality of information conveyed by specific sets of edges. In this paper, we show that the core of ideological/discursive communities on X/Twitter can be effectively identified by uncovering the most informative interactions in an authors-audience bipartite network through a maximum-entropy null model. The analysis is performed considering three X/Twitter datasets related to the main political events of 2022 in Italy, using as benchmarks four state-of-the-art algorithms - three descriptive, one inferential -, and manually annotating nearly 300 verified users based on their political affiliation. In terms of information content, the communities obtained with the entropy-based algorithm are comparable to those obtained with some of the benchmarks. However, such a methodology on the authors-audience bipartite network: uses just a small sample of the available data to identify the central users of each community; returns a neater partition of the user set in just a few, easy to interpret, communities; clusters well-known political figures in a way that better matches the political alliances when compared with the benchmarks. Our results provide an important insight into online debates, highlighting that online interaction networks are mostly shaped by the activity of a small set of users who enjoy public visibility even outside social media.

Overview

  • The paper explores a novel approach to identifying the core of ideological/discursive communities on social media platforms like Twitter.
  • It uses a maximum-entropy null model to uncover the most informative interactions in an authors-audience bipartite network, whereas community detection algorithms generally avoid any explicit assumption about the quantity and quality of information conveyed by specific sets of edges.
  • The analysis is performed on three Twitter datasets related to major political events in Italy in 2022, and the results are compared to four state-of-the-art community detection algorithms.

Plain English Explanation

Online discussions and debates often form distinct communities, where people with similar views or affiliations tend to interact more with each other. Community detection algorithms aim to identify these mesoscale structures from the available network data, generally without making assumptions about the importance of specific connections.

In this paper, the researchers show that the core of these ideological or discursive communities on Twitter can be effectively identified by focusing on the most informative interactions between authors and their audience. They use a maximum-entropy null model to uncover these key interactions in a bipartite network linking authors to their audience.

The analysis is performed on three datasets related to major political events in Italy in 2022, and the results are compared to four state-of-the-art community detection algorithms, three descriptive and one inferential. The researchers also manually annotate nearly 300 verified users based on their political affiliation, which serves as a reference for checking how well the detected communities match known political alliances.

The key findings are that the communities identified by the entropy-based approach are comparable in information content to those obtained with some of the benchmark algorithms. However, this method uses a smaller sample of the available data to identify the central users in each community, and it returns a cleaner partition of the user set into a few easy-to-interpret communities. Importantly, the political figures are clustered in a way that better matches their known alliances, compared to the other algorithms.

These results provide important insights into how online debates and discussions are shaped by the activity of a relatively small set of users who enjoy public visibility, even beyond their presence on social media platforms.

Technical Explanation

The paper presents a novel approach to community detection in social media networks, focusing on the authors-audience bipartite network to uncover the most informative interactions.
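
The idea of keeping only the most informative interactions can be illustrated with a minimal sketch. It assumes the simplest maximum-entropy null model, in which only the expected number of author-audience links is constrained, so every link has the same probability; this is not necessarily the model used in the paper. The function names, the toy accounts, and the Bonferroni correction are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def binomial_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def validated_author_links(audience_of, n_audience, alpha=0.05):
    """Return author pairs whose shared audience exceeds the null expectation.

    audience_of maps each author to the set of audience users who interacted
    with them; n_audience is the total number of audience users.
    """
    authors = sorted(audience_of)
    n_edges = sum(len(users) for users in audience_of.values())
    # With only the expected edge count constrained, the maximum-entropy
    # solution assigns the same probability to every author-audience link.
    p = n_edges / (len(authors) * n_audience)
    pairs = list(combinations(authors, 2))
    # Bonferroni correction for multiple testing (an illustrative choice).
    threshold = alpha / len(pairs)
    links = []
    for a, b in pairs:
        shared = len(audience_of[a] & audience_of[b])
        # Under the null, each audience user is linked to both authors
        # independently with probability p * p.
        p_value = binomial_sf(shared, n_audience, p * p)
        if p_value < threshold:
            links.append((a, b))
    return links


# Hypothetical toy data: two pairs of authors with identical audiences.
audience_of = {
    "author_a": set(range(0, 10)),
    "author_b": set(range(0, 10)),
    "author_c": set(range(20, 30)),
    "author_d": set(range(20, 30)),
}
links = validated_author_links(audience_of, n_audience=40)
print(links)
```

In this toy input, only the pairs that share their whole audience pass the test; pairs with no shared audience have a p-value of 1 and are discarded. A degree-preserving null model would be stricter for very popular authors, since a large shared audience is less surprising between two accounts that are both widely retweeted.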

The researchers use a maximum-entropy null model to identify the core of ideological/discursive communities on Twitter by singling out the most informative interactions in the network. This contrasts with standard community detection algorithms, which generally avoid any explicit assumption about the quantity and quality of information conveyed by specific sets of edges.
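
For reference, the general form of a maximum-entropy null model can be written down as follows. The summary does not state which constraints the authors impose, so the degree-constrained bipartite case shown at the end is an assumption; the symbols (G for a graph, C_k for the constrained quantities, k_i and h_α for author and audience degrees) are introduced here for illustration only.

```latex
% Maximize the Shannon entropy over the ensemble of graphs G
S = -\sum_{G} P(G)\,\ln P(G)
\quad \text{subject to} \quad
\sum_{G} P(G) = 1, \qquad
\sum_{G} P(G)\, C_k(G) = C_k^{*} \;\; \forall k .

% The solution is an exponential family with one multiplier per constraint
P(G) = \frac{e^{-\sum_k \theta_k C_k(G)}}{Z(\vec{\theta})}, \qquad
Z(\vec{\theta}) = \sum_{G} e^{-\sum_k \theta_k C_k(G)} .

% Assumed case: constraining the degrees k_i of authors and h_\alpha of
% audience users makes links independent, with probability
p_{i\alpha} = \frac{x_i\, y_\alpha}{1 + x_i\, y_\alpha}, \qquad
x_i = e^{-\theta_i}, \quad y_\alpha = e^{-\eta_\alpha},

% where the parameters are fixed by matching the observed degrees on average
\sum_{\alpha} p_{i\alpha} = k_i^{*}, \qquad
\sum_{i} p_{i\alpha} = h_\alpha^{*} .
```

An observed pattern, such as the number of audience users shared by two authors, is then considered informative when it is statistically unlikely under these link probabilities.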

The analysis is performed on three Twitter datasets related to major political events in Italy in 2022, using four state-of-the-art community detection algorithms as benchmarks: three descriptive and one inferential. The researchers also manually annotate nearly 300 verified users based on their political affiliation to serve as a ground truth for evaluating the accuracy of the community detection.
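
One common way to compare a detected partition with manually annotated labels is normalized mutual information (NMI). The sketch below is a plain-Python illustration under that assumption; the summary does not say which score the authors use, and the function name and toy labels are hypothetical.

```python
from collections import Counter
from math import log


def normalized_mutual_information(labels_true, labels_pred):
    """Return the NMI of two labelings, normalized by the mean entropy."""
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        return -sum((c / n) * log(c / n) for c in counts.values())

    # Mutual information from the joint and marginal label frequencies.
    mutual_info = sum(
        (c / n) * log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in joint.items()
    )
    mean_entropy = (entropy(count_true) + entropy(count_pred)) / 2
    # Two single-cluster labelings are trivially identical.
    return mutual_info / mean_entropy if mean_entropy > 0 else 1.0


# Hypothetical annotations and detected communities for four accounts.
annotated = ["left", "left", "right", "right"]
matching = [0, 0, 1, 1]   # same grouping, different label names
unrelated = [0, 1, 0, 1]  # grouping independent of the annotation

print(normalized_mutual_information(annotated, matching))
print(normalized_mutual_information(annotated, unrelated))
```

The score depends only on how accounts are grouped, not on the label names, so a partition that reproduces the annotated alliances scores close to 1 while an unrelated one scores close to 0.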

The results show that the communities obtained with the entropy-based algorithm are comparable in terms of information content to those obtained with some of the benchmark algorithms. However, this methodology has several key advantages:

  1. It uses just a small sample of the available data to identify the central users of each community.
  2. It returns a neater partition of the user set into a few, easy-to-interpret communities.
  3. It clusters well-known political figures in a way that better matches their known political alliances, compared to the benchmark algorithms.

These findings suggest that online interaction networks are primarily shaped by the activity of a small set of users who enjoy public visibility, even outside of social media platforms. The paper provides important insights into the dynamics of online political debates and discussions.

Critical Analysis

The paper presents a compelling approach to community detection in social media networks, with a clear focus on uncovering the most informative interactions rather than treating all connections as equally informative.

One potential limitation of the study is the use of manually annotated political affiliations as a benchmark for evaluating the accuracy of the community detection. While this provides a valuable ground truth, it may not capture the full complexity and nuance of political alignment, especially in the context of online discussions where individuals may express a range of views or affiliations.

Additionally, the paper does not explore the potential biases or limitations of the maximum-entropy null model used for the analysis. It would be valuable to understand how this approach might perform in comparison to other null models or community detection methods that make different assumptions about the underlying network structure.

Further research could also investigate the generalizability of the findings beyond the specific political events and datasets analyzed in this study. Examining the applicability of the entropy-based approach to other types of online discussions or social media platforms could provide additional insights into the broader patterns of community formation and evolution.

Overall, the paper makes a valuable contribution to the understanding of how online debates and discussions are shaped by the activity of a relatively small set of influential users. The novel application of the maximum-entropy null model to the authors-audience bipartite network represents an interesting methodological approach that could be further explored and refined in future research.

Conclusion

This paper presents a novel approach to identifying the core of ideological and discursive communities on social media platforms like Twitter. By using a maximum-entropy null model to uncover the most informative interactions in an authors-audience bipartite network, the researchers are able to effectively capture the central users and structure of these online communities.

The analysis, performed on three datasets related to major political events in Italy, shows that the communities identified by this entropy-based method are comparable in information content to those obtained with state-of-the-art community detection algorithms. However, the entropy-based approach has several advantages, including using a smaller sample of data, returning a neater partition of the user set, and clustering well-known political figures in a way that better matches their known alliances.

These findings provide important insights into the dynamics of online political debates and discussions, highlighting the disproportionate influence of a relatively small set of users who enjoy public visibility even outside of social media platforms. The novel application of maximum-entropy models to the authors-audience network represents a promising direction for future research on community detection and the structure of online social interactions.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Political Leaning Inference through Plurinational Scenarios

Joseba Fernandez de Landa, Rodrigo Agerri

Social media users express their political preferences via interaction with other users, by spontaneous declarations or by participation in communities within the network. This makes a social network such as Twitter a valuable data source to study computational science approaches to political leaning inference. In this work we focus on three diverse regions in Spain (Basque Country, Catalonia and Galicia) to explore various methods for multi-party categorization, required to analyze evolving and complex political landscapes, and compare it with binary left-right approaches. We use a two-step method involving unsupervised user representations obtained from the retweets and their subsequent use for political leaning detection. Comprehensive experimentation on a newly collected and curated dataset comprising labeled users and their interactions demonstrates the effectiveness of using Relational Embeddings as representation method for political ideology detection in both binary and multi-party frameworks, even with limited training data. Finally, data visualization illustrates the ability of the Relational Embeddings to capture intricate intra-group and inter-group political affinities.

6/13/2024

Unveiling Online Conspiracy Theorists: a Text-Based Approach and Characterization

Alessandra Recordare, Guglielmo Cola, Tiziano Fagni, Maurizio Tesconi

In today's digital landscape, the proliferation of conspiracy theories within the disinformation ecosystem of online platforms represents a growing concern. This paper delves into the complexities of this phenomenon. We conducted a comprehensive analysis of two distinct X (formerly known as Twitter) datasets: one comprising users with conspiracy theorizing patterns and another made of users lacking such tendencies and thus serving as a control group. The distinguishing factors between these two groups are explored across three dimensions: emotions, idioms, and linguistic features. Our findings reveal marked differences in the lexicon and language adopted by conspiracy theorists with respect to other users. We developed a machine learning classifier capable of identifying users who propagate conspiracy theories based on a rich set of 871 features. The results demonstrate high accuracy, with an average F1 score of 0.88. Moreover, this paper unveils the most discriminating characteristics that define conspiracy theory propagators.

5/22/2024

Knowledge Graph Representation for Political Information Sources

Tinatin Osmonova, Alexey Tikhonov, Ivan P. Yamshchikov

With the rise of computational social science, many scholars utilize data analysis and natural language processing tools to analyze social media, news articles, and other accessible data sources for examining political and social discourse. Particularly, the study of the emergence of echo-chambers due to the dissemination of specific information has become a topic of interest in mixed methods research areas. In this paper, we analyze data collected from two news portals, Breitbart News (BN) and New York Times (NYT) to prove the hypothesis that the formation of echo-chambers can be partially explained on the level of an individual information consumption rather than a collective topology of individuals' social networks. Our research findings are presented through knowledge graphs, utilizing a dataset spanning 11.5 years gathered from BN and NYT media portals. We demonstrate that the application of knowledge representation techniques to the aforementioned news streams shows, contrary to common assumptions, relative internal neutrality of both sources and a polarizing attitude towards a small fraction of entities. Additionally, we argue that such characteristics in information sources lead to fundamental disparities in audience worldviews, potentially acting as a catalyst for the formation of echo-chambers.

4/5/2024

Community Detection for Heterogeneous Multiple Social Networks

Ziqing Zhu, Guan Yuan, Tao Zhou, Jiuxin Cao

The community plays a crucial role in understanding user behavior and network characteristics in social networks. Some users can use multiple social networks at once for a variety of objectives. These users are called overlapping users who bridge different social networks. Detecting communities across multiple social networks is vital for interaction mining, information diffusion, and behavior migration analysis among networks. This paper presents a community detection method based on nonnegative matrix tri-factorization for multiple heterogeneous social networks, which formulates a common consensus matrix to represent the global fused community. Specifically, the proposed method involves creating adjacency matrices based on network structure and content similarity, followed by alignment matrices which distinguish overlapping users in different social networks. With the generated alignment matrices, the method could enhance the fusion degree of the global community by detecting overlapping user communities across networks. The effectiveness of the proposed method is evaluated with new metrics on Twitter, Instagram, and Tumblr datasets. The results of the experiments demonstrate its superior performance in terms of community quality and community fusion.

5/8/2024