Bans vs. Warning Labels: Examining Support for Community-wide Moderation Interventions

Read original: arXiv:2307.11880 - Published 9/10/2024 by Shagun Jhaver

Overview

  • This study explores how users perceive content moderation interventions on social media platforms like Facebook and Reddit.
  • It focuses on user perceptions of banning entire communities that frequently violate platform policies, versus adding warning labels to those communities.
  • The study examines how factors like presumed effects on others and support for free speech influence user approval of these interventions.
  • It also analyzes user concerns, such as the worry that norm-violating communities could reinforce inappropriate behaviors.

Plain English Explanation

Social media platforms often have to decide how to moderate online communities that frequently post hateful, violent, or sexually explicit content. This study looks at how users feel about two different approaches:

  1. Banning the whole community: Removing all posts from a community that violates platform policies.
  2. Warning labels: Showing an interstitial warning before people can access a community that violates policies.

The author wanted to understand how two factors influence whether users approve of these approaches: how strongly people believe such content affects others, and their beliefs about free speech.

The study also looked at what concerns users have, such as the risk that rule-breaking communities reinforce inappropriate behavior.

Technical Explanation

The author conducted a pre-registered survey of U.S. participants to explore bystander perceptions of content moderation for online communities that frequently feature hate speech, violent content, and sexually explicit material.

The survey tested two community-wide moderation interventions:

  1. Community bans: Removing all posts from a community.
  2. Community warning labels: Showing an interstitial warning before people can access a community.

The survey examined how third-person effects (the presumed effects of such content on other people) and support for free speech influenced user approval of these interventions.

Regression analyses showed that presumed effects on others were a significant predictor of support for both interventions, while free speech beliefs significantly influenced participants' inclination to use warning labels.
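
To make the regression step concrete, here is a minimal sketch of an ordinary least squares fit with two predictors. It is not the paper's analysis: the variable names, the 1-7 rating scales, and every data value are hypothetical, and the paper's actual model specification (outcome coding, controls) is not reproduced here. The toy outcome is constructed from known coefficients purely so the fit can be checked.

```python
def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + b2*x2 + ... by least squares (normal equations)."""
    X = [[1.0, *r] for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Build X^T X and X^T y.
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    # Solve (X^T X) b = X^T y by Gauss-Jordan elimination with partial pivoting.
    aug = [row + [v] for row, v in zip(xtx, xty)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Hypothetical respondents: (presumed effects on others, free speech support),
# each imagined as a 1-7 rating. These are NOT values from the survey.
predictors = [(1, 7), (2, 5), (3, 6), (4, 2), (5, 4), (6, 1), (7, 3)]

# Synthetic "support for the intervention" built from known coefficients,
# so the only purpose is to show that the fit recovers them.
support = [2.0 + 0.5 * pe - 0.1 * fs for pe, fs in predictors]

coefs = fit_ols(predictors, support)
print("intercept, presumed_effects, free_speech:", [round(c, 3) for c in coefs])
```

In a real analysis, the outcome would contain noise, and each coefficient would be reported with a standard error and significance test; a statistics package would normally be used rather than hand-rolled elimination.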

Qualitative analysis of open-ended responses found that community-wide bans were often seen as too heavy-handed, and users preferred sanctions proportional to the severity and type of infractions. Concerns were also raised about how norm-violating communities could reinforce inappropriate behaviors.

Critical Analysis

The study provides valuable insights into user perceptions of content moderation approaches, which can inform the design of more effective and acceptable policies. However, it has some limitations:

  • The survey was conducted in the U.S. only, so the findings may not generalize to other cultural contexts.
  • The study focused on high-level moderation approaches, but users may have more nuanced views on specific implementation details.
  • The presumed effects on others and free speech measures are self-reported, which may not fully capture these complex psychological factors.

Additional research could explore how user perceptions vary across different types of online communities and content, as well as investigate the long-term effects of moderation interventions on user behavior and community dynamics. Incorporating more behavioral data and user feedback could also provide a richer understanding of these issues.

Conclusion

This study sheds light on how users perceive different approaches to moderating online communities that frequently violate platform policies. It highlights the importance of considering factors like presumed effects on others and support for free speech when designing content moderation systems.

The findings suggest that users may prefer a more nuanced, proportional approach to moderation over blunt community-wide bans. Addressing concerns that norm-violating communities could reinforce inappropriate behaviors will also be crucial.

Overall, this research contributes to the ongoing discussion around online harms and the development of more effective and acceptable content moderation strategies for online platforms.




Related Papers


Bans vs. Warning Labels: Examining Support for Community-wide Moderation Interventions

Shagun Jhaver

Social media platforms like Facebook and Reddit host thousands of user-governed online communities. These platforms sanction communities that frequently violate platform policies; however, public perceptions of such sanctions remain unclear. In a pre-registered survey conducted in the US, I explore bystander perceptions of content moderation for communities that frequently feature hate speech, violent content, and sexually explicit content. Two community-wide moderation interventions are tested: (1) community bans, where all community posts are removed, and (2) community warning labels, where an interstitial warning label precedes access. I examine how third-person effects and support for free speech influence user approval of these interventions on any platform. My regression analyses show that presumed effects on others are a significant predictor of backing for both interventions, while free speech beliefs significantly influence participants' inclination for using warning labels. Analyzing the open-ended responses, I find that community-wide bans are often perceived as too coarse, and users instead value sanctions in proportion to the severity and type of infractions. I report on concerns that norm-violating communities could reinforce inappropriate behaviors and show how users' choice of sanctions is influenced by their perceived effectiveness. I discuss the implications of these results for HCI research on online harms and content moderation.

9/10/2024

Beyond Trial-and-Error: Predicting User Abandonment After a Moderation Intervention

Benedetta Tessa, Lorenzo Cima, Amaury Trujillo, Marco Avvenuti, Stefano Cresci

Current content moderation practices follow the trial-and-error approach, meaning that moderators apply sequences of interventions until they obtain the desired outcome. However, being able to preemptively estimate the effects of an intervention would allow moderators the unprecedented opportunity to plan their actions ahead of application. As a first step towards this goal, here we propose and tackle the novel task of predicting the effect of a moderation intervention. We study the reactions of 16,540 users to a massive ban of online communities on Reddit, training a set of binary classifiers to identify those users who would abandon the platform after the intervention - a problem of great practical relevance. We leverage a dataset of 13.8M posts to compute a large and diverse set of 142 features, which convey information about the activity, toxicity, relations, and writing style of the users. We obtain promising results, with the best-performing model achieving micro F1 = 0.800 and macro F1 = 0.676. Our model demonstrates robust generalizability when applied to users from previously unseen communities. Furthermore, we identify activity features as the most informative predictors, followed by relational and toxicity features, while writing style features exhibit limited utility. Our results demonstrate the feasibility of predicting the effects of a moderation intervention, paving the way for a new research direction in predictive content moderation aimed at empowering moderators with intelligent tools to plan ahead their actions.

4/30/2024

The Great Ban: Efficacy and Unintended Consequences of a Massive Deplatforming Operation on Reddit

Lorenzo Cima, Amaury Trujillo, Marco Avvenuti, Stefano Cresci

In the current landscape of online abuses and harms, effective content moderation is necessary to cultivate safe and inclusive online spaces. Yet, the effectiveness of many moderation interventions is still unclear. Here, we assess the effectiveness of The Great Ban, a massive deplatforming operation that affected nearly 2,000 communities on Reddit. By analyzing 16M comments posted by 17K users during 14 months, we provide nuanced results on the effects, both desired and otherwise, of the ban. Among our main findings is that 15.6% of the affected users left Reddit and that those who remained reduced their toxicity by 6.6% on average. The ban also caused 5% users to increase their toxicity by more than 70% of their pre-ban level. Overall, our multifaceted results provide new insights into the efficacy of deplatforming. As such, our findings can inform the development of future moderation interventions and the policing of online platforms.

5/29/2024

Community Guidelines Make this the Best Party on the Internet: An In-Depth Study of Online Platforms' Content Moderation Policies

Brennan Schaffner, Arjun Nitin Bhagoji, Siyuan Cheng, Jacqueline Mei, Jay L. Shen, Grace Wang, Marshini Chetty, Nick Feamster, Genevieve Lakier, Chenhao Tan

Moderating user-generated content on online platforms is crucial for balancing user safety and freedom of speech. Particularly in the United States, platforms are not subject to legal constraints prescribing permissible content. Each platform has thus developed bespoke content moderation policies, but there is little work towards a comparative understanding of these policies across platforms and topics. This paper presents the first systematic study of these policies from the 43 largest online platforms hosting user-generated content, focusing on policies around copyright infringement, harmful speech, and misleading content. We build a custom web-scraper to obtain policy text and develop a unified annotation scheme to analyze the text for the presence of critical components. We find significant structural and compositional variation in policies across topics and platforms, with some variation attributable to disparate legal groundings. We lay the groundwork for future studies of ever-evolving content moderation policies and their impact on users.


5/9/2024