The Future of Open Human Feedback

Read original: arXiv:2408.16961 - Published 9/5/2024 by Shachar Don-Yehiya, Ben Burtenshaw, Ramon Fernandez Astudillo, Cailean Osborne, Mimansa Jaiswal, Tzu-Sheng Kuo, Wenting Zhao, Idan Shenfeld, Andi Peng, Mikhail Yurochkin and 10 others

Overview

This paper explores the future of open human feedback for AI systems.
It discusses the benefits and challenges of incorporating open-ended feedback from humans into the training and deployment of AI models.
The paper covers key considerations around planning and purpose, technical implementation, and critical analysis of open human feedback for AI.

Plain English Explanation

What is the purpose of this paper?

The paper explores how AI systems can effectively incorporate open-ended feedback from humans. The topic matters as AI becomes more prevalent in everyday life: understanding how to harness human input can help keep these systems aligned with human values and preferences.

Why is open human feedback important for AI?

Open feedback from humans can provide valuable insights that help shape the development and application of AI. Humans can offer perspectives that machine learning algorithms may miss, highlighting potential biases, safety concerns, or desirable features. Incorporating this feedback can make AI systems more robust, trustworthy, and beneficial to society.

What are some key challenges with open human feedback for AI?

Integrating open-ended human feedback into AI development and deployment is not a simple task. The paper discusses challenges around scalability, consistency, incentives, and potential misuse of feedback systems. Careful design and implementation are required to harness the benefits of open feedback while mitigating these risks.

Technical Explanation

Feedback Collection and Aggregation

The paper examines different approaches for collecting and aggregating human feedback, such as crowdsourcing, expert panels, and opt-in user feedback. It explores the tradeoffs between scale, quality, and representativeness of the feedback data.
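
To make the aggregation step concrete, here is a minimal sketch of majority-vote aggregation over feedback from several annotators. It is an illustration only, not a method taken from the paper; the record layout, the function name `aggregate_votes`, and the sample labels are assumptions.

```python
from collections import Counter, defaultdict

def aggregate_votes(records):
    """Majority-vote aggregation of per-item labels (hypothetical helper).

    records: iterable of (item_id, annotator_id, label) tuples.
    Returns {item_id: (winning_label, agreement_fraction)}.
    Ties go to the label that was seen first for that item.
    """
    by_item = defaultdict(list)
    for item_id, _annotator_id, label in records:
        by_item[item_id].append(label)

    result = {}
    for item_id, labels in by_item.items():
        winner, top_count = Counter(labels).most_common(1)[0]
        # Agreement fraction is a rough quality signal for the item.
        result[item_id] = (winner, top_count / len(labels))
    return result

# Hypothetical sample data: three annotators on one response, one on another.
records = [
    ("resp-1", "a1", "good"),
    ("resp-1", "a2", "good"),
    ("resp-1", "a3", "bad"),
    ("resp-2", "a1", "bad"),
]
aggregated = aggregate_votes(records)
```

The agreement fraction exposes the scale-versus-quality tradeoff directly: items with a single vote or a narrow majority can be routed to more annotators before use.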

Feedback Integration into AI Training

The researchers discuss methods for incorporating open feedback into the training of AI models, including techniques like reward modeling, human-in-the-loop fine-tuning, and multi-task learning. The goal is to ensure the feedback is effectively translated into model improvements.
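
Reward modeling is often trained on pairwise preferences. The sketch below shows one common formulation, the Bradley-Terry pairwise loss, as a rough illustration; the summary does not say which formulation the paper has in mind, and the function name and scalar rewards are assumptions.

```python
import math

def bradley_terry_loss(reward_chosen, reward_rejected):
    """Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).

    Written as softplus(-margin) and split by sign so that large
    margins do not overflow math.exp.
    """
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# The loss is small when the human-preferred response scores higher,
# and large when the reward model ranks the pair the wrong way round.
loss_correct_order = bradley_terry_loss(2.0, 0.0)
loss_wrong_order = bradley_terry_loss(0.0, 2.0)
```

In a real training setup the two rewards would come from a learned model scoring each response, and the loss would be averaged over a batch of human-labeled pairs.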

Feedback-Driven Model Deployment

The paper also examines how open feedback can be used to monitor and adapt AI systems during real-world deployment. This includes using feedback to identify problematic outputs, make targeted model updates, and foster ongoing dialogue between humans and AI.
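
One simple way to use deployment feedback for monitoring is to track the share of negative feedback per topic and flag topics that exceed a threshold. The following is a sketch under assumptions: the topic labels, the thresholds, and the function name `flag_problem_topics` are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def flag_problem_topics(events, min_count=5, max_negative_rate=0.3):
    """Return {topic: negative_rate} for topics with enough feedback
    events and a negative-feedback rate above the threshold.

    events: iterable of (topic, is_negative) pairs.
    """
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for topic, is_negative in events:
        totals[topic] += 1
        if is_negative:
            negatives[topic] += 1

    flagged = {}
    for topic, total in totals.items():
        rate = negatives[topic] / total
        # Require a minimum sample size so a couple of complaints
        # do not trigger an alert on their own.
        if total >= min_count and rate > max_negative_rate:
            flagged[topic] = rate
    return flagged

# Hypothetical feedback log.
events = (
    [("medical", True)] * 4
    + [("medical", False)] * 2
    + [("cooking", False)] * 6
    + [("legal", True)] * 2
)
flagged = flag_problem_topics(events)
```

Flagged topics are candidates for targeted review or model updates; the minimum-count guard trades detection speed for fewer false alarms.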

Critical Analysis

Scalability and Consistency Challenges

The paper acknowledges that scaling open feedback systems to handle large volumes of input while maintaining consistency and quality is a significant challenge. Techniques for automated feedback processing and curation will be essential.
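
Consistency between feedback providers can be checked automatically with an agreement statistic. The sketch below computes Cohen's kappa for two annotators as one possible check; it is an illustration chosen here, not a technique attributed to the paper, and the sample labels are made up.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (observed - expected) / (1 - expected), where expected is
    the agreement predicted by chance from each annotator's label rates.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two annotators on four responses.
annotator_a = ["good", "good", "bad", "bad"]
annotator_b = ["good", "bad", "bad", "bad"]
kappa = cohens_kappa(annotator_a, annotator_b)
```

A low kappa on a sample of items suggests that guidelines are ambiguous or that some feedback needs curation before it is used for training.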

Incentives and Potential Misuse

The authors also discuss concerns around incentivizing productive feedback, as well as the risk of feedback systems being exploited or gamed. Safeguards and governance frameworks will be needed to ensure the integrity of open feedback for AI.

Representativeness and Equity

An important consideration raised in the paper is ensuring that open feedback channels are accessible and representative of diverse perspectives. There is a risk of feedback being skewed towards certain demographics or interest groups.
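
One standard way to reduce skew is to reweight feedback so that each group's share matches a target share. Here is a minimal post-stratification sketch; the group labels, the target shares, and the function name are assumptions for illustration, and the paper is not claimed to propose this particular remedy.

```python
from collections import Counter

def poststratification_weights(groups, target_shares):
    """Per-group weight = target share / observed share.

    groups: one group label per feedback item.
    target_shares: {group: desired fraction of the pool}.
    """
    counts = Counter(groups)
    total = len(groups)
    weights = {}
    for group, count in counts.items():
        observed_share = count / total
        weights[group] = target_shares[group] / observed_share
    return weights

# Hypothetical pool: 80% of feedback is in English, 20% in Spanish,
# while the target is an even split.
groups = ["en"] * 8 + ["es"] * 2
weights = poststratification_weights(groups, {"en": 0.5, "es": 0.5})
```

Reweighting only corrects for groups that are present and labeled; it cannot recover perspectives that are missing from the feedback pool altogether.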

Conclusion

The paper highlights the significant potential of open human feedback to shape the development and deployment of AI systems in positive ways. However, it also underscores the technical and social challenges that must be addressed to realize this potential. Ongoing research and careful system design will be critical to enabling effective open feedback loops for AI.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

The Future of Open Human Feedback

Shachar Don-Yehiya, Ben Burtenshaw, Ramon Fernandez Astudillo, Cailean Osborne, Mimansa Jaiswal, Tzu-Sheng Kuo, Wenting Zhao, Idan Shenfeld, Andi Peng, Mikhail Yurochkin, Atoosa Kasirzadeh, Yangsibo Huang, Tatsunori Hashimoto, Yacine Jernite, Daniel Vila-Suero, Omri Abend, Jennifer Ding, Sara Hooker, Hannah Rose Kirk, Leshem Choshen

Human feedback on conversations with large language models (LLMs) is central to how these systems learn about the world, improve their capabilities, and are steered toward desirable and safe behaviors. However, this feedback is mostly collected by frontier AI labs and kept behind closed doors. In this work, we bring together interdisciplinary experts to assess the opportunities and challenges to realizing an open ecosystem of human feedback for AI. We first look for successful practices in peer production, open source, and citizen science communities. We then characterize the main challenges for open human feedback. For each, we survey current approaches and offer recommendations. We end by envisioning the components needed to underpin a sustainable and open human feedback ecosystem. At the center of this ecosystem are mutually beneficial feedback loops between users and specialized models, incentivizing a diverse community of stakeholders, both model trainers and feedback providers, to support a general open feedback pool.

9/5/2024

Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models

Ned Cooper, Alexandra Zafiroglu

Large language models (LLMs) are now accessible to anyone with a computer, a web browser, and an internet connection via browser-based interfaces, shifting the dynamics of participation in AI development. This paper examines the affordances of interactive feedback features in ChatGPT's interface, analysing how they shape user input and participation in LLM iteration. Drawing on a survey of ChatGPT users and applying the mechanisms and conditions framework of affordances, we demonstrate that these features encourage simple, frequent, and performance-focused feedback while discouraging collective input and discussions among users. We argue that this feedback format significantly constrains user participation, reinforcing power imbalances between users, the public, and companies developing LLMs. Our analysis contributes to the growing body of literature on participatory AI by critically examining the limitations of existing feedback processes and proposing directions for their redesign. To enable more meaningful public participation in AI development, we advocate for a shift away from processes focused on aligning model outputs with specific user preferences. Instead, we emphasise the need for processes that facilitate dialogue between companies and diverse 'publics' about the purpose and applications of LLMs. This approach requires attention to the ongoing work of infrastructuring - creating and sustaining the social, technical, and institutional structures necessary to address matters of concern to groups impacted by AI development and deployment.

8/28/2024

Learning from Naturally Occurring Feedback

Shachar Don-Yehiya, Leshem Choshen, Omri Abend

Human feedback data is a critical component in developing language models. However, collecting this feedback is costly and ultimately not scalable. We propose a scalable method for extracting feedback that users naturally include when interacting with chat models, and leveraging it for model training. We are further motivated by previous work that showed there are also qualitative advantages to using naturalistic (rather than auto-generated) feedback, such as fewer hallucinations and biases. We manually annotated conversation data to confirm the presence of naturally occurring feedback in a standard corpus, finding that as much as 30% of the chats include explicit feedback. We apply our method to over 1M conversations to obtain hundreds of thousands of feedback samples. Training with the extracted feedback shows significant performance improvements over baseline models, demonstrating the efficacy of our approach in enhancing model alignment to human preferences.

7/16/2024

UltraFeedback: Boosting Language Models with Scaled AI Feedback

Ganqu Cui, Lifan Yuan, Ning Ding, Guanming Yao, Bingxiang He, Wei Zhu, Yuan Ni, Guotong Xie, Ruobing Xie, Yankai Lin, Zhiyuan Liu, Maosong Sun

Learning from human feedback has become a pivot technique in aligning large language models (LLMs) with human preferences. However, acquiring vast and premium human feedback is bottlenecked by time, labor, and human capability, resulting in small sizes or limited topics of current datasets. This further hinders feedback learning as well as alignment research within the open-source community. To address this issue, we explore how to go beyond human feedback and collect high-quality AI feedback automatically for a scalable alternative. Specifically, we identify scale and diversity as the key factors for feedback data to take effect. Accordingly, we first broaden instructions and responses in both amount and breadth to encompass a wider range of user-assistant interactions. Then, we meticulously apply a series of techniques to mitigate annotation biases for more reliable AI feedback. We finally present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset, which contains over 1 million GPT-4 feedback for 250k user-assistant conversations from various aspects. Built upon UltraFeedback, we align a LLaMA-based model by best-of-n sampling and reinforcement learning, demonstrating its exceptional performance on chat benchmarks. Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models, serving as a solid foundation for future feedback learning research. Our data and models are available at https://github.com/thunlp/UltraFeedback.

7/17/2024