Promoting Constructive Deliberation: Reframing for Receptiveness

Read original: arXiv:2405.15067 - Published 6/26/2024 by Gauri Kambhatla, Matthew Lease, Ashwin Rajadesingan

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Promoting Constructive Deliberation: Reframing for Receptiveness

Gauri Kambhatla, Matthew Lease, Ashwin Rajadesingan

To promote constructive discussion of controversial topics online, we propose automatic reframing of disagreeing responses to signal receptiveness to a preceding comment. Drawing on research from psychology, communications, and linguistics, we identify six strategies for reframing. We automatically reframe replies to comments according to each strategy, using a Reddit dataset. Through human-centered experiments, we find that the replies generated with our framework are perceived to be significantly more receptive than the original replies and a generic receptiveness baseline. We illustrate how transforming receptiveness, a particular social science construct, into a computational framework, can make LLM generations more aligned with human perceptions. We analyze and discuss the implications of our results, and highlight how a tool based on our framework might be used for more teachable and creative content moderation.

6/26/2024

Change My Frame: Reframing in the Wild in r/ChangeMyView

Arturo Martínez Peguero, Taro Watanabe

Recent work in reframing, within the scope of text style transfer, has so far made use of out-of-context, task-prompted utterances in order to produce neutralizing or optimistic reframes. Our work aims to generalize reframing based on the subreddit r/ChangeMyView (CMV). We build a dataset that leverages the CMV community's interactions and conventions to identify high-value, community-recognized utterances that produce changes of perspective. With this data, we widen the scope of the direction of reframing, since the changes in perspective do not occur only in neutral or positive directions. We fine-tune transformer-based models, make use of a modern LLM to refine our dataset, and explore challenges in dataset creation and evaluation around this type of reframing.

7/4/2024

Rethinking harmless refusals when fine-tuning foundation models

Florin Pop, Judd Rosenblatt, Diogo Schwerz de Lucena, Michael Vaiana

In this paper, we investigate the degree to which fine-tuning in Large Language Models (LLMs) effectively mitigates versus merely conceals undesirable behavior. Through the lens of semi-realistic role-playing exercises designed to elicit such behaviors, we explore the response dynamics of LLMs post fine-tuning interventions. Our methodology involves prompting models for Chain-of-Thought (CoT) reasoning and analyzing the coherence between the reasoning traces and the resultant outputs. Notably, we identify a pervasive phenomenon we term reason-based deception, where models either stop producing reasoning traces or produce seemingly ethical reasoning traces that belie the unethical nature of their final outputs. We further examine the efficacy of response strategies (polite refusal versus explicit rebuttal) in curbing the occurrence of undesired behavior in subsequent outputs of multi-turn interactions. Our findings reveal that explicit rebuttals significantly outperform polite refusals in preventing the continuation of undesired outputs and nearly eliminate reason-based deception, challenging current practices in model fine-tuning. Accordingly, the two key contributions of this paper are (1) defining and studying reason-based deception, a new type of hidden behavior, and (2) demonstrating that rebuttals provide a more robust response model to harmful requests than refusals, thereby highlighting the need to reconsider the response strategies in fine-tuning approaches.

7/1/2024

Re-Ranking News Comments by Constructiveness and Curiosity Significantly Increases Perceived Respect, Trustworthiness, and Interest

Emily Saltz, Zaria Jalan, Tin Acosta

Online commenting platforms have commonly developed systems to address online harms by removing and down-ranking content. An alternative, under-explored approach is to focus on up-ranking content to proactively prioritize prosocial commentary and set better conversational norms. We present a study with 460 English-speaking US-based news readers to understand the effects of re-ranking comments by constructiveness, curiosity, and personal stories on a variety of outcomes related to willingness to participate and engage, as well as perceived credibility and polarization in a comment section. In our rich-media survey experiment, participants across these four ranking conditions and a control group reviewed prototypes of comment sections of a Politics op-ed and Dining article. We found that outcomes varied significantly by article type. Up-ranking curiosity and constructiveness improved a number of measures for the Politics article, including perceived Respect, Trustworthiness, and Interestingness of the comment section. Constructiveness also increased perceptions that the comments were favorable to Republicans, with no condition worsening perceptions of partisans. Additionally, in the Dining article, personal stories and constructiveness rankings significantly improved the perceived informativeness of the comments. Overall, these findings indicate that incorporating prosocial qualities of speech into ranking could be a promising approach to promote healthier, less polarized dialogue in online comment sections.

4/17/2024