Large Language Models (LLMs) as Agents for Augmented Democracy

2405.03452

Published 5/8/2024 by Jairo Gudiño-Rosero, Umberto Grandi, César A. Hidalgo

Abstract

We explore the capabilities of an augmented democracy system built on off-the-shelf LLMs fine-tuned on data summarizing individual preferences across 67 policy proposals collected during the 2022 Brazilian presidential elections. We use a train-test cross-validation setup to estimate the accuracy with which the LLMs predict both a subject's individual political choices and the aggregate preferences of the full sample of participants. At the individual level, the accuracy of the out-of-sample predictions lies in the range 69%-76%, and the models are significantly better at predicting the preferences of liberal and college-educated participants. At the population level, we aggregate preferences using an adaptation of the Borda score and compare the ranking of policy proposals obtained from a probabilistic sample of participants and from data augmented using LLMs. We find that the augmented data predicts the preferences of the full population of participants better than probabilistic samples alone when these represent less than 30% to 40% of the total population. These results indicate that LLMs are potentially useful for the construction of systems of augmented democracy.

Overview

This paper explores the potential of using large language models (LLMs) to augment democratic decision-making processes.
The researchers fine-tuned off-the-shelf LLMs on data summarizing individual preferences across 67 policy proposals from the 2022 Brazilian presidential election.
They used cross-validation to estimate the accuracy of the LLMs in predicting individual political choices and the aggregate preferences of the full participant sample.
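The cross-validation idea can be illustrated with a minimal sketch. The summary does not say how the folds were drawn or which model was fine-tuned, so everything here is an assumption: `k_fold_indices`, `cross_validated_accuracy`, and `fit_majority` are hypothetical names, contiguous folds are an arbitrary choice, and a majority-label predictor stands in for the fine-tuned LLM.

```python
def k_fold_indices(n_items, k):
    """Split range(n_items) into k contiguous folds of near-equal size."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_accuracy(examples, labels, fit, k=5):
    """Average held-out accuracy over k folds (assumes k <= len(examples)).

    `fit(train_x, train_y)` must return a function that maps one example
    to a predicted label. In the paper this role is played by fine-tuning.
    """
    accuracies = []
    for test_idx in k_fold_indices(len(examples), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(examples) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        predict = fit(train_x, train_y)
        correct = sum(predict(examples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


def fit_majority(train_x, train_y):
    """Hypothetical stand-in for a model: always predict the most common label."""
    majority = max(sorted(set(train_y)), key=train_y.count)
    return lambda example: majority


# Toy labels invented for illustration (1 = supports, 0 = opposes).
toy_examples = list(range(10))
toy_labels = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
toy_accuracy = cross_validated_accuracy(toy_examples, toy_labels, fit_majority, k=5)
```

Each example is scored only by a model that never saw it during fitting, which is what makes the reported accuracy an out-of-sample estimate.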

Plain English Explanation

The researchers wanted to see whether large language models (LLMs) could help improve the democratic process. They took existing LLMs and fine-tuned them on data summarizing how individual people felt about 67 policy proposals, collected during the 2022 Brazilian presidential election.

By training the LLMs on this data, the researchers hoped the models would be able to accurately predict how individual people would vote on these issues, as well as predict the overall preferences of the whole population. This could potentially be useful for building systems of augmented democracy that leverage LLMs to model beliefs and preferences.

The key findings were:

The LLMs were able to predict individual voting choices with 69-76% accuracy, and were better at predicting the preferences of liberal and college-educated participants.
When preferences were aggregated across the whole population, the LLM-augmented data predicted the overall preferences better than a random sample alone, provided the sample covered less than 30% to 40% of the total population.

This suggests that LLMs could be useful for building systems that improve democratic decision-making by supplementing a limited sample of real responses with LLM-predicted preferences.

Technical Explanation

The researchers used a train-test cross-validation approach to evaluate the performance of the LLMs in two key areas:

Predicting individual political choices: The LLMs predicted individual preferences on the 67 policy proposals with an out-of-sample accuracy between 69% and 76%. The models were significantly better at predicting the choices of liberal and college-educated participants.
Predicting aggregate population preferences: The researchers aggregated the individual preferences using an adaptation of the Borda score. They found that the LLM-augmented data predicted the overall ranking of policy proposals better than a probabilistic sample of the population alone, provided the sample represented less than 30% to 40% of the total population.
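To make the aggregation step concrete, here is a minimal sketch of the classic Borda count. The summary does not describe the paper's adaptation, so this is the textbook version, not the authors' method: in a ranking of m proposals, the top choice earns m - 1 points and the last earns 0. The function names and the toy rankings are hypothetical.

```python
from collections import defaultdict


def borda_scores(rankings):
    """Classic Borda count over complete rankings (best proposal first).

    In a ranking of m proposals, the proposal at 0-based position i
    earns m - 1 - i points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, proposal in enumerate(ranking):
            scores[proposal] += m - 1 - position
    return dict(scores)


def borda_ranking(rankings):
    """Order proposals by Borda score, highest first.

    Ties are broken alphabetically so the output is deterministic.
    """
    scores = borda_scores(rankings)
    return sorted(scores, key=lambda p: (-scores[p], p))


# Toy rankings from three hypothetical participants.
toy_rankings = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]
toy_scores = borda_scores(toy_rankings)    # A: 2+1+2, B: 1+2+0, C: 0+0+1
toy_order = borda_ranking(toy_rankings)
```

The resulting order is what would be compared between a sample-only ranking and an LLM-augmented ranking.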

These results suggest that LLMs could be a useful tool for constructing systems of augmented democracy that leverage the predictive power of language models to better understand and represent the preferences of the broader population.
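The comparison between a sample alone and LLM-augmented data can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: stances are coded as 0 or 1, proposals are ranked by total support as a simple stand-in for the adapted Borda score, agreement with the full-population ranking is measured with Kendall's tau, and the "predicted" responses are hand-written toy values rather than model output.

```python
from itertools import combinations


def rank_by_support(votes):
    """Order proposals by total support; votes maps participant -> {proposal: 0 or 1}."""
    totals = {}
    for choices in votes.values():
        for proposal, choice in choices.items():
            totals[proposal] = totals.get(proposal, 0) + choice
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(totals, key=lambda p: (-totals[p], p))


def kendall_tau(order_a, order_b):
    """Kendall rank correlation between two tie-free orderings of the same items."""
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}
    concordant = discordant = 0
    for x, y in combinations(order_a, 2):
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Toy ground truth for five hypothetical participants (not data from the paper).
full = {
    "p1": {"A": 0, "B": 1, "C": 1},
    "p2": {"A": 1, "B": 0, "C": 0},
    "p3": {"A": 1, "B": 1, "C": 0},
    "p4": {"A": 1, "B": 0, "C": 0},
    "p5": {"A": 1, "B": 1, "C": 0},
}

# A 20% sample: only p1's real answers are observed.
sample = {"p1": full["p1"]}

# Hand-written stand-in for model predictions on the unsampled participants;
# p4's answer on C is deliberately wrong to mimic imperfect accuracy.
predicted = {
    "p2": {"A": 1, "B": 0, "C": 0},
    "p3": {"A": 1, "B": 1, "C": 0},
    "p4": {"A": 1, "B": 0, "C": 1},
    "p5": {"A": 1, "B": 1, "C": 0},
}
augmented = {**sample, **predicted}

full_order = rank_by_support(full)
sample_order = rank_by_support(sample)
augmented_order = rank_by_support(augmented)

tau_sample = kendall_tau(full_order, sample_order)
tau_augmented = kendall_tau(full_order, augmented_order)
```

In this toy case the augmented ranking happens to match the full ranking while the single-participant sample does not. That outcome comes from the hand-picked values and says nothing about the 30% to 40% threshold reported in the paper.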

Critical Analysis

The researchers acknowledge several limitations and caveats in their work:

The dataset was relatively small, covering only 67 policy proposals from a single election. Expanding the scope and scale of the data could help validate the findings.
The LLMs were trained on self-reported survey data, which may not fully reflect real-world voting behavior. Incorporating additional data sources could improve the models' predictive accuracy.
The researchers did not explore the potential biases or fairness issues that could arise from using LLMs to model political preferences. This is an important area for further research.

Additionally, one could question whether using LLMs to augment democratic processes is truly desirable, even if the models demonstrate strong predictive performance. There are valid concerns about the transparency and accountability of such systems, as well as the potential for them to be misused or to reinforce existing power structures.

Conclusion

Overall, this research provides a proof of concept for using large language models to support democratic decision-making. The LLMs reached 69% to 76% accuracy on individual choices and improved population-level rankings when the real sample was small, which suggests that such models could be a useful tool for building more inclusive and representative democratic systems.

However, the researchers acknowledge the need for further exploration of the ethical and societal implications of using LLMs in this context. Careful consideration must be given to ensuring that such systems are transparent, fair, and aligned with the fundamental principles of democracy.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

LLM Voting: Human Choices and AI Collective Decision Making

Joshua C. Yang, Damian Dailisan, Marcin Korecki, Carina I. Hausladen, Dirk Helbing

This paper investigates the voting behaviors of Large Language Models (LLMs), specifically GPT-4 and LLaMA-2, their biases, and how they align with human voting patterns. Our methodology involved using a dataset from a human voting experiment to establish a baseline for human preferences and a corresponding experiment with LLM agents. We observed that the methods used for voting input and the presentation of choices influence LLM voting behavior. We discovered that varying the persona can reduce some of these biases and enhance alignment with human choices. While the Chain-of-Thought approach did not improve prediction accuracy, it has potential for AI explainability in the voting process. We also identified a trade-off between preference diversity and alignment accuracy in LLMs, influenced by different temperature settings. Our findings indicate that LLMs may lead to less diverse collective outcomes and biased assumptions when used in voting scenarios, emphasizing the importance of cautious integration of LLMs into democratic processes.

5/16/2024

cs.CL cs.AI cs.CY cs.LG

AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction

Junsol Kim, Byungkyu Lee

Large language models (LLMs) that produce human-like responses have begun to revolutionize research practices in the social sciences. We develop a novel methodological framework that fine-tunes LLMs with repeated cross-sectional surveys to incorporate the meaning of survey questions, individual beliefs, and temporal contexts for opinion prediction. We introduce two new emerging applications of the AI-augmented survey: retrodiction (i.e., predict year-level missing responses) and unasked opinion prediction (i.e., predict entirely missing responses). Among 3,110 binarized opinions from 68,846 Americans in the General Social Survey from 1972 to 2021, our models based on Alpaca-7b excel in retrodiction (AUC = 0.86 for personal opinion prediction, $\rho$ = 0.98 for public opinion prediction). These remarkable prediction capabilities allow us to fill in missing trends with high confidence and pinpoint when public attitudes changed, such as the rising support for same-sex marriage. On the other hand, our fine-tuned Alpaca-7b models show modest success in unasked opinion prediction (AUC = 0.73, $\rho$ = 0.67). We discuss practical constraints and ethical concerns regarding individual autonomy and privacy when using LLMs for opinion prediction. Our study demonstrates that LLMs and surveys can mutually enhance each other's capabilities: LLMs can broaden survey potential, while surveys can improve the alignment of LLMs.

4/9/2024

cs.CL cs.AI cs.LG

LLM-Augmented Agent-Based Modelling for Social Simulations: Challenges and Opportunities

Onder Gurcan

As large language models (LLMs) continue to make significant strides, their better integration into agent-based simulations offers a transformational potential for understanding complex social systems. However, such integration is not trivial and poses numerous challenges. Based on this observation, in this paper, we explore architectures and methods to systematically develop LLM-augmented social simulations and discuss potential research directions in this field. We conclude that integrating LLMs with agent-based simulations offers a powerful toolset for researchers and scientists, allowing for more nuanced, realistic, and comprehensive models of complex systems and human behaviours.

5/14/2024

cs.AI

The Political Preferences of LLMs

David Rozado

I report here a comprehensive analysis about the political preferences embedded in Large Language Models (LLMs). Namely, I administer 11 political orientation tests, designed to identify the political preferences of the test taker, to 24 state-of-the-art conversational LLMs, both closed and open source. When probed with questions/statements with political connotations, most conversational LLMs tend to generate responses that are diagnosed by most political test instruments as manifesting preferences for left-of-center viewpoints. This does not appear to be the case for five additional base (i.e. foundation) models upon which LLMs optimized for conversation with humans are built. However, the weak performance of the base models at coherently answering the tests' questions makes this subset of results inconclusive. Finally, I demonstrate that LLMs can be steered towards specific locations in the political spectrum through Supervised Fine-Tuning (SFT) with only modest amounts of politically aligned data, suggesting SFT's potential to embed political orientation in LLMs. With LLMs beginning to partially displace traditional information sources like search engines and Wikipedia, the societal implications of political biases embedded in LLMs are substantial.

6/4/2024

cs.CY cs.AI cs.CL