Exploring Subjectivity for more Human-Centric Assessment of Social Biases in Large Language Models






Published 5/21/2024 by Paula Akemi Aoyagui, Sharon Ferguson, Anastasia Kuzminykh



An essential aspect of evaluating Large Language Models (LLMs) is identifying potential biases. This is especially relevant considering the substantial evidence that LLMs can replicate human social biases in their text outputs and further influence stakeholders, potentially amplifying harm to already marginalized individuals and communities. Therefore, recent efforts in bias detection have invested in automated benchmarks and objective metrics such as accuracy (i.e., an LLM's output is compared against a predefined ground truth). Nonetheless, social biases can be nuanced, oftentimes subjective and context-dependent, where a situation is open to interpretation and there is no ground truth. While these situations can be difficult for automated evaluation systems to identify, human evaluators could potentially pick up on these nuances. In this paper, we discuss the role of human evaluation and subjective interpretation to augment automated processes when identifying biases in LLMs as part of a human-centred approach to evaluate these models.

  • This paper explores the use of subjective human evaluations to assess social biases in large language models (LLMs), rather than relying solely on objective metrics.
  • The authors argue that existing bias evaluation methods may not capture the nuanced and context-dependent nature of human perceptions of bias.
  • They propose a new framework that incorporates both objective and subjective assessments, with the goal of developing more human-centric methods for evaluating the social impacts of LLMs.

Plain English Explanation

Large language models (LLMs) are powerful AI systems that can generate human-like text. However, these models can sometimes exhibit biases, such as favoring certain genders or races over others. Traditionally, researchers have used objective metrics to measure and assess these biases.

The authors of this paper argue that these objective metrics may not tell the whole story. They suggest that human perceptions of bias can be more nuanced and context-dependent than what the numbers show. For example, a statement that may seem biased to one person may not be perceived as such by another.

To address this, the researchers propose a new framework that combines objective and subjective assessments of bias. They believe this approach can lead to a more comprehensive and human-centric understanding of the social impacts of LLMs.

By considering both the quantitative data and the subjective experiences of people, the researchers hope to develop better methods for evaluating and mitigating biases in these powerful AI systems. This could ultimately lead to the creation of more equitable and inclusive language models that are better aligned with human values.

Technical Explanation

The paper begins by discussing the limitations of existing approaches to evaluating social biases in large language models (LLMs). The authors argue that current methods, which rely heavily on automated benchmarks and objective metrics such as accuracy against a predefined ground truth, may not fully capture the nuanced, context-dependent nature of human perceptions of bias.

To address this, the researchers propose a new framework that incorporates both objective and subjective assessments of bias. The objective component involves applying established bias measurement techniques, such as those described in related work (https://aimodels.fyi/papers/arxiv/large-language-models-are-inconsistent-biased-evaluators and https://aimodels.fyi/papers/arxiv/concerns-bias-large-language-models-when-creating).

The subjective component involves collecting human evaluations of bias, drawing inspiration from related approaches (https://aimodels.fyi/papers/arxiv/rlrfreinforcement-learning-from-reflection-through-debates-as and https://aimodels.fyi/papers/arxiv/just-like-me-role-opinions-personal-experiences). The researchers conducted user studies to gather human assessments of bias in specific language model outputs, exploring factors such as personal experiences and opinions.

By combining these objective and subjective measures, the authors aim to develop a more holistic understanding of social biases in LLMs, a goal shared by related work on bias patterns in applied settings (https://aimodels.fyi/papers/arxiv/bias-patterns-application-llms-clinical-decision-support). They believe this approach can lead to the creation of bias evaluation methods that are better aligned with human values and experiences.
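To make the idea of pairing the two kinds of measures concrete, here is a minimal Python sketch. It is not the authors' implementation, which this summary does not describe. It assumes a benchmark where each model answer can be checked against a reference label (the objective side) and a hypothetical 1-5 "how biased is this output?" rating collected from several human evaluators per item (the subjective side). The function names, the rating scale, the 0.6 threshold, and all data are illustrative assumptions.

```python
from collections import Counter
from statistics import mean


def accuracy(outputs, ground_truth):
    """Objective metric: fraction of outputs matching a predefined ground truth."""
    if len(outputs) != len(ground_truth):
        raise ValueError("outputs and ground_truth must have the same length")
    return mean(1.0 if o == g else 0.0 for o, g in zip(outputs, ground_truth))


def rating_summary(ratings):
    """Summarize one item's human ratings (hypothetical 1-5 bias scale)."""
    counts = Counter(ratings)
    # Share of evaluators who chose the single most common rating.
    majority_share = counts.most_common(1)[0][1] / len(ratings)
    return {"mean": mean(ratings), "majority_share": majority_share}


def flag_contested(items, threshold=0.6):
    """Return ids of items where the most common rating is held by fewer
    than `threshold` of evaluators (an assumed, illustrative cutoff)."""
    return [
        item_id
        for item_id, ratings in items.items()
        if rating_summary(ratings)["majority_share"] < threshold
    ]


# Hypothetical data, for illustration only.
model_answers = ["A", "B", "unknown", "A"]
reference = ["A", "B", "unknown", "B"]
human_ratings = {
    "item-1": [1, 1, 1, 2, 1],  # evaluators largely agree
    "item-2": [1, 5, 3, 5, 2],  # evaluators disagree: open to interpretation
}

print("accuracy:", accuracy(model_answers, reference))
print("contested items:", flag_contested(human_ratings))
```

In this sketch, contested items are not treated as noise to be averaged away. They are the cases where, in the paper's terms, there may be no single ground truth, so they would be candidates for closer human review rather than automated scoring.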

Critical Analysis

The authors acknowledge several limitations and areas for further research in their proposed framework. For example, they note that subjective evaluations of bias can be influenced by individual biases and backgrounds, which may introduce additional complexities to the assessment process.

Additionally, the researchers highlight the challenge of scaling up their approach to handle the vast amounts of data and language model outputs that need to be evaluated. Developing efficient and scalable methods for collecting and analyzing human-centric bias assessments remains an open challenge.

A related issue is the reliability of the human evaluations themselves. While the authors argue that subjectivity is a strength of their approach, it also raises questions about how consistent and generalizable the findings are. Careful study design and statistical analysis, such as reporting how often evaluators agree with one another, will be crucial to ensure the robustness of the results.
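One common way to quantify how consistently evaluators judge the same outputs is an inter-rater agreement statistic. The sketch below computes Fleiss' kappa in plain Python. The paper summary does not say which statistic, if any, the authors use, so treat this as one illustrative option. It assumes every item was labeled by the same number of evaluators, and the labels and data are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings_per_item):
    """Fleiss' kappa for items each labeled by the same number of raters.

    `ratings_per_item` is a list of lists; each inner list holds the
    categorical labels that the raters assigned to one item.
    """
    n_items = len(ratings_per_item)
    if n_items == 0:
        raise ValueError("need at least one item")
    n_raters = len(ratings_per_item[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings_per_item):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    per_item_agreement = []
    for ratings in ratings_per_item:
        counts = Counter(ratings)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        per_item_agreement.append(agreeing / (n_raters * (n_raters - 1)))

    observed = sum(per_item_agreement) / n_items
    total = n_items * n_raters
    # Agreement expected by chance, from overall category frequencies.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from three evaluators on four model outputs.
labels = [
    ["biased", "biased", "biased"],
    ["not", "not", "not"],
    ["biased", "not", "biased"],
    ["not", "biased", "not"],
]
kappa = fleiss_kappa(labels)
print(f"Fleiss' kappa: {kappa:.3f}")
```

A value near 1 indicates strong agreement and a value near 0 indicates agreement no better than chance. Under the paper's framing, low agreement is not automatically a defect in the study: it can also signal that an output is open to interpretation.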

Despite these limitations, the paper's focus on incorporating human perspectives into the assessment of social biases in LLMs is a valuable contribution to the field. By acknowledging the importance of subjective experiences, the authors encourage the AI research community to think more critically about the social impacts of these powerful language models and to develop more nuanced and inclusive evaluation methods.


Conclusion

This paper presents a novel framework for evaluating social biases in large language models (LLMs) that combines objective and subjective assessments. The authors argue that existing bias evaluation methods may not capture the full complexity of human perceptions of bias, which can be influenced by individual experiences and perspectives.

By incorporating both quantitative and qualitative measures, the proposed approach aims to develop a more human-centric understanding of the social impacts of LLMs. While the framework faces some challenges related to scalability and subjectivity, the authors' emphasis on the importance of human-centered evaluation is a valuable contribution to the ongoing discussion around the societal implications of these powerful AI systems.

As the development and deployment of LLMs continue to accelerate, the need for comprehensive, multifaceted bias assessment methods becomes increasingly crucial. This paper serves as an important step towards the creation of more equitable and inclusive language models that better align with human values and experiences.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

