Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models

2406.12416

Published 6/28/2024 by Hongbang Yuan, Yubo Chen, Pengfei Cao, Zhuoran Jin, Kang Liu, Jun Zhao

Abstract

Large language models (LLMs) have achieved remarkable success but still tend to generate factually erroneous responses, a phenomenon known as hallucination. A recent trend is to use preference learning to fine-tune models to align with factuality. However, existing work primarily evaluates fine-tuned models on in-domain (ID) datasets and the factuality on out-of-domain (OOD) datasets remains underexplored. In this paper, we conduct a comprehensive evaluation of the factuality of different models tuned by various preference learning algorithms and demonstrate that their performance on OOD datasets either increases minimally or decreases. Subsequently, we reveal that the main cause of model's failure to uphold factuality under a distribution shift is under-alignment, rather than over-alignment, by analyzing the token distribution shift of the models before and after tuning. Finally, we propose APEFT (Atomic Preference Enhanced Factuality Tuning), a framework that enhances model's awareness of factuality at the granularity of individual facts. Extensive experiments demonstrate that APEFT improves model performance by an average of 3.45% on both ID and OOD datasets, which is highly effective.
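
The abstract mentions analyzing the token distribution shift of a model before and after tuning, but does not say how the shift is measured. As a rough illustration only, the sketch below compares next-token distributions position by position using two assumed proxies: the mean KL divergence between tuned and base distributions, and the fraction of positions whose most likely token changes. The function names, the choice of metrics, and the toy logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def token_shift_report(base_logits, tuned_logits):
    """Compare next-token distributions at each position.

    Both arguments are lists of per-position logit vectors
    (hypothetical stand-ins for real model outputs).
    Returns (mean KL(tuned || base), fraction of positions whose
    top-1 token differs between the two models).
    """
    if not base_logits or len(base_logits) != len(tuned_logits):
        raise ValueError("expected two non-empty, equal-length sequences")
    kls, changed = [], 0
    for b, t in zip(base_logits, tuned_logits):
        p_base, p_tuned = softmax(b), softmax(t)
        kls.append(kl_divergence(p_tuned, p_base))
        if p_base.index(max(p_base)) != p_tuned.index(max(p_tuned)):
            changed += 1
    n = len(kls)
    return sum(kls) / n, changed / n


# Toy example: position 0 is unchanged, position 1 flips its top token.
base = [[2.0, 1.0, 0.0], [0.5, 0.5, 3.0]]
tuned = [[2.0, 1.0, 0.0], [3.0, 0.5, 0.5]]
mean_kl, changed_fraction = token_shift_report(base, tuned)
```

Under these assumptions, a small mean KL and a low changed fraction would suggest the tuned model stays close to the base model, while large values would suggest tuning has altered its token-level behavior substantially. How such numbers map onto the paper's notions of under-alignment and over-alignment is not something the abstract specifies.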

Overview

The paper provides instructions for submitting papers to the proceedings of *ACL conferences, which are academic venues focused on natural language processing and computational linguistics.
The instructions cover important details like formatting guidelines, submission deadlines, and review processes to help authors prepare their work for publication.
Following these guidelines ensures papers are presented in a consistent, high-quality manner across the conference proceedings.

Plain English Explanation

The Instructions for *ACL Proceedings document outlines the requirements and procedures for publishing research papers in the proceedings of conferences organized by the Association for Computational Linguistics (ACL) and related organizations. These conferences are premier venues for presenting advancements in natural language processing, machine translation, dialogue systems, and other areas of computational linguistics.

The instructions cover essential details that authors need to know, such as the proper formatting for paper submissions, important deadlines, and the peer review process their work will go through. Following these guidelines ensures a consistent look and feel across all the papers included in the final conference proceedings, which are widely read by researchers and practitioners in the field.

While the instructions may seem technical, they are designed to help authors navigate the submission process smoothly and increase their chances of having their work accepted and published. Understanding these guidelines is a crucial step for researchers who want to share their findings with the broader computational linguistics community.

Technical Explanation

The Instructions for *ACL Proceedings document provides detailed guidance for authors submitting papers to conferences organized by the Association for Computational Linguistics (ACL) and related organizations. These conferences are premier venues for researchers to present their latest advancements in natural language processing, machine translation, dialogue systems, and other areas of computational linguistics.

The instructions cover a range of important details, including:

Formatting requirements for paper submissions, such as page limits, font sizes, and citation styles
Deadlines for different stages of the submission process, including initial paper submission, camera-ready revisions, and registration
The peer review process papers will go through, including the role of area chairs and reviewers in evaluating submissions
Guidelines for creating high-quality figures, tables, and other visual elements to include in papers
Best practices for protecting author anonymity during the double-blind review process

By following these guidelines, authors can ensure their papers are presented in a consistent format that aligns with the standards of the targeted conference. This helps create a cohesive proceedings volume in which attendees can easily navigate and digest the latest research.

Critical Analysis

The Instructions for *ACL Proceedings provide a comprehensive and well-structured set of guidelines for authors, but there are a few potential limitations worth considering:

The instructions may be overly prescriptive in certain formatting requirements, potentially limiting authors' creativity or preventing the inclusion of unconventional research presentations.
The double-blind review process, while common in academic publishing, can introduce biases that disadvantage certain authors, such as those from underrepresented groups. Exploring more equitable review approaches may be an area for future discussion.
The instructions do not address the growing importance of open science practices, such as data and code sharing, which are becoming increasingly expected in computational linguistics and other fields.

Overall, the instructions serve an important function in maintaining high standards and consistency across ACL conference proceedings. However, as the field evolves, there may be opportunities to revisit some of the guidelines to ensure they continue to support the diverse and innovative research being conducted in computational linguistics.

Conclusion

The Instructions for *ACL Proceedings are a critical resource for authors looking to publish their research in the proceedings of conferences organized by the Association for Computational Linguistics (ACL) and related organizations. By outlining the formatting requirements, submission deadlines, and review processes, these instructions help ensure a consistent, high-quality presentation of the latest advancements in natural language processing, machine translation, and other areas of computational linguistics.

While the instructions may seem technical, they play a vital role in facilitating the effective dissemination of research findings within the broader computational linguistics community. By following these guidelines, authors can increase their chances of having their work accepted and included in the prestigious ACL conference proceedings, which are widely read and influential in the field.

As the landscape of computational linguistics continues to evolve, there may be opportunities to revisit some aspects of the instructions to ensure they remain responsive to the needs and practices of the research community. However, the core purpose of the instructions – to promote rigorous, high-quality research presentations – remains essential for the continued advancement of the field.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

FLAME: Factuality-Aware Alignment for Large Language Models

Sheng-Chieh Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen

Alignment is a standard procedure to fine-tune pre-trained large language models (LLMs) to follow natural language instructions and serve as helpful AI assistants. We have observed, however, that the conventional alignment process fails to enhance the factual accuracy of LLMs, and often leads to the generation of more false facts (i.e. hallucination). In this paper, we study how to make the LLM alignment process more factual, by first identifying factors that lead to hallucination in both alignment steps: supervised fine-tuning (SFT) and reinforcement learning (RL). In particular, we find that training the LLM on new knowledge or unfamiliar texts can encourage hallucination. This makes SFT less factual as it trains on human labeled data that may be novel to the LLM. Furthermore, reward functions used in standard RL can also encourage hallucination, because it guides the LLM to provide more helpful responses on a diverse set of instructions, often preferring longer and more detailed responses. Based on these observations, we propose factuality-aware alignment, comprised of factuality-aware SFT and factuality-aware RL through direct preference optimization. Experiments show that our proposed factuality-aware alignment guides LLMs to output more factual responses while maintaining instruction-following capability.

5/3/2024

cs.CL cs.AI

Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation

Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng

Despite showing increasingly human-like abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e. hallucinations, even when they hold relevant knowledge. To address these hallucinations, current approaches typically necessitate high-quality human factuality annotations. In this work, we explore Self-Alignment for Factuality, where we leverage the self-evaluation capability of an LLM to provide training signals that steer the model towards factuality. Specifically, we incorporate Self-Eval, a self-evaluation component, to prompt an LLM to validate the factuality of its own generated responses solely based on its internal knowledge. Additionally, we design Self-Knowledge Tuning (SK-Tuning) to augment the LLM's self-evaluation ability by improving the model's confidence estimation and calibration. We then utilize these self-annotated responses to fine-tune the model via Direct Preference Optimization algorithm. We show that the proposed self-alignment approach substantially enhances factual accuracy over Llama family models across three key knowledge-intensive tasks on TruthfulQA and BioGEN.

6/12/2024

cs.CL cs.AI

Understanding Finetuning for Factual Knowledge Extraction

Gaurav Ghosal, Tatsunori Hashimoto, Aditi Raghunathan

In this work, we study the impact of QA fine-tuning data on downstream factuality. We show that fine-tuning on lesser-known facts that are poorly stored during pretraining yields significantly worse factuality than fine-tuning on well-known facts, even when all facts are seen during pretraining. We prove this phenomenon theoretically, showing that training on lesser-known facts can lead the model to ignore subject entity names and instead output a generic plausible response even when the relevant factual knowledge is encoded in the model. On three question answering benchmarks (PopQA, Entity Questions, and MMLU) and two language models (Llama-2-7B and Mistral-7B), we find that (i) finetuning on a completely factual but lesser-known subset of the data deteriorates downstream factuality (5-10%) and (ii) finetuning on a subset of better-known examples matches or outperforms finetuning on the entire dataset. Ultimately, our results shed light on the interaction between pretrained knowledge and finetuning data and demonstrate the importance of taking into account how facts are stored in the pretrained model when fine-tuning for knowledge-intensive tasks.

6/24/2024

cs.CL cs.LG

Towards a Holistic Evaluation of LLMs on Factual Knowledge Recall

Jiaqing Yuan, Lin Pan, Chung-Wei Hang, Jiang Guo, Jiarong Jiang, Bonan Min, Patrick Ng, Zhiguo Wang

Large language models (LLMs) have shown remarkable performance on a variety of NLP tasks, and are being rapidly adopted in a wide range of use cases. It is therefore of vital importance to holistically evaluate the factuality of their generated outputs, as hallucinations remain a challenging issue. In this work, we focus on assessing LLMs' ability to recall factual knowledge learned from pretraining, and the factors that affect this ability. To that end, we construct FACT-BENCH, a representative benchmark covering 20 domains, 134 property types, 3 answer types, and different knowledge popularity levels. We benchmark 31 models from 10 model families and provide a holistic assessment of their strengths and weaknesses. We observe that instruction-tuning hurts knowledge recall, as pretraining-only models consistently outperform their instruction-tuned counterparts, and positive effects of model scaling, as larger models outperform smaller ones for all model families. However, the best performance from GPT-4 still represents a large gap with the upper-bound. We additionally study the role of in-context exemplars using counterfactual demonstrations, which lead to significant degradation of factual knowledge recall for large models. By further decoupling model known and unknown knowledge, we find the degradation is attributed to exemplars that contradict a model's known knowledge, as well as the number of such exemplars. Lastly, we fine-tune LLaMA-7B in different settings of known and unknown knowledge. In particular, fine-tuning on a model's known knowledge is beneficial, and consistently outperforms fine-tuning on unknown and mixed knowledge. We will make our benchmark publicly available.

4/26/2024

cs.CL cs.AI cs.LG