Leveraging Human Revisions for Improving Text-to-Layout Models

2405.13026

Published 5/24/2024 by Amber Xie, Chin-Yi Cheng, Forrest Huang, Yang Li

Abstract

Learning from human feedback has shown success in aligning large, pretrained models with human values. Prior works have mostly focused on learning from high-level labels, such as preferences between pairs of model outputs. On the other hand, many domains could benefit from more involved, detailed feedback, such as revisions, explanations, and reasoning of human users. Our work proposes using nuanced feedback in the form of human revisions for stronger alignment. In this paper, we ask expert designers to fix layouts generated from a generative layout model that is pretrained on a large-scale dataset of mobile screens. Then, we train a reward model based on how human designers revise these generated layouts. With the learned reward model, we optimize our model with reinforcement learning from human feedback (RLHF). Our method, Revision-Aware Reward Models, allows a generative text-to-layout model to produce more modern, designer-aligned layouts, showing the potential for utilizing human revisions and stronger forms of feedback in improving generative models.

Overview

  • This paper proposes a novel method called "Revision-Aware Reward Models" (RARM) for aligning large, pretrained generative models with human values and design preferences.
  • Instead of relying only on high-level labels such as pairwise preferences, RARM learns from more detailed human feedback in the form of revisions to the model's outputs.
  • The authors demonstrate the effectiveness of RARM by training a generative layout model to produce mobile screen designs that are more aligned with the aesthetic preferences of human designers.

Plain English Explanation

The researchers wanted to find a better way to align large AI models with human values and preferences. Prior methods had mostly focused on getting high-level feedback, like preferences between different model outputs. However, many areas could benefit from more detailed feedback, such as revisions, explanations, and reasoning from human users.

In this paper, the authors proposed a new approach called "Revision-Aware Reward Models" (RARM). The idea is to train the AI model based on how human experts revise and fix the outputs it generates. By learning from these detailed revisions, the model can produce outputs that are more closely aligned with human design preferences.

The researchers applied RARM to train a generative layout model that can produce mobile screen designs. They had expert designers fix the layouts generated by the original model, and then used those revisions to train a new reward model. This reward model was then used to optimize the layout generation model, resulting in designs that were more modern and better aligned with the preferences of human designers.

The key insight is that leveraging nuanced human feedback, such as revisions, can lead to stronger alignment between AI models and human values than high-level preferences alone. This work demonstrates the potential of more involved forms of human feedback for improving generative AI systems.

Technical Explanation

The authors propose "Revision-Aware Reward Models" (RARM), a method for aligning large, pretrained generative models with human values and preferences. Unlike previous approaches that mainly used high-level labels such as pairwise preferences, RARM learns from human revisions of model outputs, a more detailed form of feedback.
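The contrast between the two kinds of feedback shows up in the training signal each gives a reward model. The sketch below is illustrative only. The pairwise loss is the standard Bradley-Terry style objective commonly used with preference labels. The function `revision_regression_loss` and the idea of a scalar "revision effort" target are assumptions made for this example, not details confirmed by the summary.

```python
import math


def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(score_preferred - score_rejected)).

    Written in a numerically stable form so large score gaps do not overflow.
    """
    diff = score_preferred - score_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def revision_regression_loss(predicted_effort: float, observed_effort: float) -> float:
    """Squared error against a scalar revision effort (hypothetical target)."""
    return (predicted_effort - observed_effort) ** 2


# A preference label only says which of two outputs is better.
print(pairwise_preference_loss(score_preferred=1.2, score_rejected=0.4))

# A revision can, under this assumption, say how much one output had to change.
print(revision_regression_loss(predicted_effort=0.25, observed_effort=0.40))
```

The pairwise loss depends only on the gap between two scores, whereas the regression loss ties a single output to an absolute target, which is one way a revision could carry more information than a binary preference.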

To demonstrate the effectiveness of RARM, the researchers applied it to the task of training a generative layout model to produce mobile screen designs. First, they pretrained a layout generation model on a large-scale dataset of mobile screens. Then, they had expert designers review and revise the layouts generated by this model.
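One way to picture the collected data is as pairs of a generated layout and its designer-revised version. The sketch below is a minimal illustration under stated assumptions: the `RevisionRecord` type, the normalized box format, and the symmetric nearest-neighbour distance used in `revision_magnitude` are all hypothetical choices for this example, and the summary does not say how the authors represent layouts or measure revisions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box format: (x, y, width, height), each normalized to [0, 1].
Box = Tuple[float, float, float, float]


@dataclass
class RevisionRecord:
    prompt: str            # text description the layout was generated from
    generated: List[Box]   # layout produced by the pretrained model
    revised: List[Box]     # the same layout after a designer fixed it


def box_distance(a: Box, b: Box) -> float:
    """L1 distance between two boxes."""
    return sum(abs(p - q) for p, q in zip(a, b))


def _one_way(src: List[Box], dst: List[Box]) -> float:
    """Mean distance from each box in src to its nearest box in dst."""
    return sum(min(box_distance(s, d) for d in dst) for s in src) / len(src)


def revision_magnitude(record: RevisionRecord) -> float:
    """Symmetric nearest-neighbour distance between generated and revised layouts.

    Assumed proxy for how much the designer changed; 0.0 means no change.
    """
    if not record.generated or not record.revised:
        raise ValueError("both layouts must contain at least one box")
    return _one_way(record.generated, record.revised) + _one_way(
        record.revised, record.generated
    )


example = RevisionRecord(
    prompt="login screen with two text fields and a button",
    generated=[(0.1, 0.2, 0.8, 0.1), (0.1, 0.35, 0.8, 0.1), (0.3, 0.55, 0.4, 0.1)],
    revised=[(0.1, 0.2, 0.8, 0.1), (0.1, 0.32, 0.8, 0.1), (0.1, 0.5, 0.8, 0.1)],
)
print(revision_magnitude(example))
```

A scalar like this could serve as a training target for a reward model, with smaller revisions indicating layouts that needed less fixing.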

Using these human revisions, the researchers trained a reward model that could predict how well a given layout would be received by the designers. Finally, they used this reward model to optimize the layout generation model through reinforcement learning from human feedback (RLHF).
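The optimization step can be illustrated with a toy example. The sketch below is not the authors' training setup: it assumes a policy that picks among three candidate layouts, uses made-up reward-model outputs (`predicted_revision`), and applies an exact policy-gradient update to a softmax policy in place of a full RLHF algorithm.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(
    logits: List[float], rewards: List[float], lr: float = 0.5
) -> List[float]:
    """One gradient-ascent step on expected reward for a softmax policy.

    For a softmax policy, d E[r] / d logit_i = p_i * (r_i - E[r]).
    """
    probs = softmax(logits)
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    return [
        z + lr * p * (r - expected_reward)
        for z, p, r in zip(logits, probs, rewards)
    ]


# Hypothetical reward-model outputs: predicted amount of designer revision
# needed for each of three candidate layouts (lower is better).
predicted_revision = [0.8, 0.1, 0.4]
rewards = [-d for d in predicted_revision]  # less revision means higher reward

logits = [0.0, 0.0, 0.0]  # start from a uniform policy
for _ in range(200):
    logits = policy_gradient_step(logits, rewards)

probs = softmax(logits)
print([round(p, 3) for p in probs])
```

After training, most of the probability mass sits on the second candidate, the one the stand-in reward model predicts would need the least revision. A real system would sample layouts from a large generative model and usually add a penalty that keeps the fine-tuned model close to the pretrained one.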

The results showed that the RARM-trained layout model generated designs that were more modern and better aligned with the aesthetic preferences of the human designers than those of the original pretrained model. This highlights the potential of more involved forms of human feedback, such as revisions, for improving the alignment of generative AI systems with human values.

Critical Analysis

The researchers acknowledge that their work is an initial step towards leveraging nuanced human feedback for model alignment, and there are several areas for further exploration. For example, the study was limited to a specific domain (mobile screen design) and relied on feedback from a relatively small number of expert designers. It would be valuable to investigate the generalization of RARM to other domains and a broader set of human evaluators.

Additionally, the paper does not provide a detailed analysis of the types of revisions made by the human designers or the reasoning behind their decisions. A deeper understanding of the revision process and the underlying human values and preferences could lead to even more effective alignment strategies.

Another potential limitation is the reliance on reinforcement learning from human feedback (RLHF), which has been the subject of ongoing debates and concerns regarding its scalability, robustness, and safety implications. Further research is needed to address these challenges and ensure the responsible development of RARM-like approaches.

Conclusion

This paper presents "Revision-Aware Reward Models" (RARM), an approach for aligning large, pretrained generative models with human values and preferences. By learning from detailed human feedback in the form of revisions, RARM lets a text-to-layout model produce outputs that are more closely aligned with the aesthetic preferences of human designers.

The researchers demonstrated the effectiveness of RARM in the context of a generative layout model for mobile screens, but the potential applications extend to a wide range of domains where AI systems interact with humans. As the field of AI continues to advance, exploring more nuanced forms of human feedback will be crucial for developing systems that are truly aligned with human values and can be safely deployed in real-world settings.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Revision Matters: Generative Design Guided by Revision Edits

Tao Li, Chin-Yi Cheng, Amber Xie, Gang Li, Yang Li

Layout design, such as user interface or graphical layout in general, is fundamentally an iterative revision process. Through revising a design repeatedly, the designer converges on an ideal layout. In this paper, we investigate how revision edits from human designers can benefit a multimodal generative model. To do so, we curate an expert dataset that traces how human designers iteratively edit and improve a layout generation with a prompted language goal. Based on such data, we explore various supervised fine-tuning task setups on top of a Gemini multimodal backbone, a large multimodal model. Our results show that human revision plays a critical role in iterative layout refinement. While being noisy, expert revision edits lead our model to a surprisingly strong design FID score of ~10, which is close to human performance (~6). In contrast, self-revisions that fully rely on the model's own judgement lead to an echo chamber that prevents iterative improvement and sometimes leads to generative degradation. Fortunately, we found that providing human guidance at an early stage plays a critical role in the final generation. In such a human-in-the-loop scenario, our work paves the way for iterative design revision based on pre-trained large multimodal models.

6/28/2024

The Real, the Better: Aligning Large Language Models with Online Human Behaviors

Guanying Jiang, Lingyong Yan, Haibo Shi, Dawei Yin

Large language model alignment is widely used and studied to prevent LLMs from producing unhelpful and harmful responses. However, the lengthy training process and predefined preference bias hinder adaptation to diverse online human preferences. To this end, this paper proposes an alignment framework, called Reinforcement Learning with Human Behavior (RLHB), to align LLMs by directly leveraging real online human behaviors. Following the generative adversarial framework, the generator is trained to respond in line with expected human behavior, while the discriminator tries to verify whether the triplets of query, response, and human behavior come from real online environments. Behavior modeling in natural-language form and the multi-model joint training mechanism enable an active and sustainable online alignment. Experimental results confirm the effectiveness of our proposed methods by both human and automatic evaluations.

5/2/2024

Aligning language models with human preferences

Tomasz Korbak

Language models (LMs) trained on vast quantities of text data can acquire sophisticated skills such as generating summaries, answering questions or generating code. However, they also manifest behaviors that violate human preferences, e.g., they can generate offensive content, falsehoods or perpetuate social biases. In this thesis, I explore several approaches to aligning LMs with human preferences. First, I argue that aligning LMs can be seen as Bayesian inference: conditioning a prior (base, pretrained LM) on evidence about human preferences (Chapter 2). Conditioning on human preferences can be implemented in numerous ways. In Chapter 3, I investigate the relation between two approaches to finetuning pretrained LMs using feedback given by a scoring function: reinforcement learning from human feedback (RLHF) and distribution matching. I show that RLHF can be seen as a special case of distribution matching but distribution matching is strictly more general. In Chapter 4, I show how to extend distribution matching to conditional language models. Finally, in Chapter 5 I explore a different route: conditioning an LM on human preferences already during pretraining. I show that involving human feedback from the very start tends to be more effective than using it only during supervised finetuning. Overall, these results highlight the room for alignment techniques different from and complementary to RLHF.

4/19/2024

Towards Understanding the Influence of Reward Margin on Preference Model Performance

Bowen Qin, Duanyu Feng, Xi Yang

Reinforcement Learning from Human Feedback (RLHF) is a widely used framework for the training of language models. However, the process of using RLHF to develop a language model that is well-aligned presents challenges, especially when it comes to optimizing the reward model. Our research has found that existing reward models, when trained using the traditional ranking objective based on human preference data, often struggle to effectively distinguish between responses that are more or less favorable in real-world scenarios. To bridge this gap, our study introduces a novel method to estimate the preference differences without the need for detailed, exhaustive labels from human annotators. Our experimental results provide empirical evidence that incorporating margin values into the training process significantly improves the effectiveness of reward models. This comparative analysis not only demonstrates the superiority of our approach in terms of reward prediction accuracy but also highlights its effectiveness in practical applications.

4/9/2024