Algorithmic Collective Action in Machine Learning

Read original: arXiv:2302.04262 - Published 8/9/2024 by Moritz Hardt, Eric Mazumdar, Celestine Mendler-Dunner, Tijana Zrnic

Overview

The paper proposes a theoretical model of how a "collective" of individuals can influence a platform's machine learning algorithms to achieve a shared goal.
The researchers investigate this in three different learning settings: nonparametric optimal learning, parametric risk minimization, and gradient-based optimization.
The paper also reports experiments on a skill classification task built from tens of thousands of resumes from a gig platform for freelancers, using a BERT-like language model; the results align with the theoretical predictions.
The key finding is that a collective making up even a very small fraction of a platform's users can exert significant control over its learning algorithm.

Plain English Explanation

The paper examines how groups of people can work together to influence the machine learning algorithms used by digital platforms. The researchers created a simple theoretical model to describe this process.

In their model, a "collective" pools the data of its participating members and instructs each member how to modify their own data. The goal is to steer the platform's learning algorithm toward an outcome the collective wants.

The researchers looked at three different types of learning algorithms the platform might use: nonparametric optimal learning, parametric risk minimization, and gradient-based optimization. For each setting, they developed coordinated strategies the collective could use and analyzed how the collective's size affects its chances of success.

To complement the theory, the researchers also ran experiments on a skill classification task that uses resumes from a gig platform for freelancers, training a BERT-like language model. Their experimental observations lined up closely with the predictions of the theoretical model.

The key takeaway is that even a relatively small collective of people can significantly influence a platform's learning algorithms to achieve their shared goals. This suggests that algorithmic collectives could become an important force on digital platforms going forward.

Technical Explanation

The paper proposes a theoretical framework for understanding algorithmic collective action on digital platforms that use machine learning. The core idea is that a "collective" of individuals can pool their data and coordinate strategies to manipulate the platform's learning algorithms in service of their collective goals.
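The pooling-and-modification setup described above can be written compactly. The following is a sketch of one such formalization. The symbols are notation assumed here and do not appear in this summary: \alpha is the fraction of the population that joins the collective, \mathcal{P}_0 is the distribution of unmodified data, and h is the strategy that maps a participant's data point to its modified version.

```latex
\mathcal{P} \;=\; \alpha\,\mathcal{P}^{*} \;+\; (1-\alpha)\,\mathcal{P}_{0},
\qquad
\mathcal{P}^{*} \;=\; \text{distribution of } h(z) \text{ for } z \sim \mathcal{P}_{0}
```

Under this reading, the platform trains on the mixture \mathcal{P} rather than on \mathcal{P}_0, and the collective's influence enters only through \alpha and the choice of h.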

The researchers developed a simple model of this process and investigated its implications in three fundamental learning settings:

Nonparametric optimal learning: The platform is assumed to learn the optimal predictor for the data distribution it observes, without restricting the model to a fixed parametric form.
Parametric risk minimization: The platform fits a model from a parametric family by minimizing a predefined loss over the observed data.
Gradient-based optimization: The platform trains its model with gradient-based updates, such as gradient descent.
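To make the first setting concrete, here is a minimal toy simulation in Python. It is a sketch under stated assumptions, not the paper's experiment: the base population, the TRIGGER and TARGET values, the eps parameter, and all function names are hypothetical, and a "most frequent label per feature value" rule stands in for an optimal nonparametric predictor. Collective members overwrite their data with a planted feature value and the label they want associated with it.

```python
import random
from collections import Counter, defaultdict

TRIGGER = 99   # hypothetical feature value the collective plants
TARGET = 1     # hypothetical label the collective wants attached to it


def sample_base(rng, eps=0.01):
    """Draw one (feature, label) pair from a made-up base population.

    With probability eps the trigger value occurs naturally with label 0;
    otherwise the feature is in 0..9 and the label is its parity.
    """
    if rng.random() < eps:
        return TRIGGER, 0
    x = rng.randrange(10)
    return x, x % 2


def observed_data(n, alpha, rng):
    """Training set seen by the platform; a fraction alpha plants the signal."""
    data = []
    for _ in range(n):
        example = sample_base(rng)
        if rng.random() < alpha:           # this individual is in the collective
            example = (TRIGGER, TARGET)    # strategy: overwrite feature and label
        data.append(example)
    return data


def fit_most_likely_label(data):
    """For each feature value, predict its most frequent label in the data."""
    counts = defaultdict(Counter)
    for x, y in data:
        counts[x][y] += 1
    return {x: c.most_common(1)[0][0] for x, c in counts.items()}


def collective_succeeds(alpha, n=20000, seed=0):
    """True if the learned rule maps the trigger to the target label."""
    rng = random.Random(seed)
    model = fit_most_likely_label(observed_data(n, alpha, rng))
    return model.get(TRIGGER) == TARGET


if __name__ == "__main__":
    for alpha in (0.001, 0.005, 0.02, 0.05):
        print(alpha, collective_succeeds(alpha))
```

In this toy, the collective wins once its planted examples outnumber the natural occurrences of the trigger, roughly when alpha exceeds eps * (1 - alpha). The rarer the trigger is in the base population, the smaller the collective can be.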

For each setting, the researchers devised coordinated strategies the collective could use and characterized natural measures of success as a function of the collective's size.
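The dependence on collective size can be illustrated with a deliberately simple risk-minimization example. This is a hedged sketch, not the paper's strategy: it assumes the platform estimates a single scalar by minimizing squared loss (so its estimate is the sample mean), that every non-member reports the same base value, and that all function and parameter names are hypothetical.

```python
from statistics import fmean


def required_report(target, base_mean, alpha):
    """Value each collective member must report so the mean equals `target`.

    Solves (1 - alpha) * base_mean + alpha * v = target for v.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    return (target - (1 - alpha) * base_mean) / alpha


def platform_estimate(values):
    """Squared-loss risk minimizer over a scalar parameter: the sample mean."""
    return fmean(values)


def simulate(n, alpha, base_value=0.0, target=1.0):
    """Build n reports, k of them from the collective, and fit the platform."""
    k = round(alpha * n)                       # number of collective members
    v = required_report(target, base_value, k / n)
    reports = [base_value] * (n - k) + [v] * k
    return v, platform_estimate(reports)


if __name__ == "__main__":
    for alpha in (0.5, 0.1, 0.01):
        v, estimate = simulate(1000, alpha)
        print(f"alpha={alpha}: each member reports {v:.1f}, estimate={estimate:.3f}")
```

In this toy, the collective can always hit the target, but the required report grows like 1/alpha as the collective shrinks, so a smaller collective must alter its data more drastically to achieve the same effect.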

To validate their theoretical predictions, the researchers also conducted systematic experiments on a skill classification task involving tens of thousands of resumes from a gig platform for freelancers, using a BERT-like language model. Across more than 2,000 model training runs, they found a strong correspondence between their empirical observations and the theoretical results.

Taken together, the theory and experiments support the conclusion that a collective of very small fractional size can exert significant control over a platform's learning algorithm. This suggests that algorithmic collectives could become an important force on digital platforms.

Critical Analysis

The paper presents a well-designed theoretical framework and complementary empirical evaluation. However, there are a few potential limitations and areas for further research:

The theoretical model makes some simplifying assumptions, such as assuming the collective has perfect knowledge of the platform's learning algorithm. In reality, collectives may have incomplete or uncertain information about the algorithm.
The experimental setup uses a relatively narrow task (skills classification) and a specific model architecture (BERT-like). It would be valuable to explore a wider range of tasks and model types to validate the generalizability of the findings.
The paper does not delve into the ethical implications of algorithmic collectives and their potential to be used for manipulative or harmful purposes. Further research is needed to understand the societal impacts of this phenomenon.

Overall, the paper makes an important contribution to our understanding of how collectives can influence AI systems and opens up new avenues for research on the interplay between collective action and machine learning.

Conclusion

This paper presents a novel theoretical and empirical investigation of how "algorithmic collectives" can influence the machine learning algorithms used by digital platforms. The key finding is that even small groups of people can significantly impact a platform's learning algorithms to achieve their collective goals.

The theoretical model and experimental results suggest that algorithmic collectives could become an increasingly important force on digital platforms in the future. This raises important questions about the ethical implications and societal impacts of this phenomenon that warrant further research.

By shedding light on the potential power of algorithmic collectives, this paper contributes to our understanding of the complex interplay between collective action and the increasing use of machine learning in digital platforms and services.

This summary was produced with help from an AI and may contain inaccuracies; consult the original paper to verify details.

Related Papers

Algorithmic Collective Action in Machine Learning

Moritz Hardt, Eric Mazumdar, Celestine Mendler-Dunner, Tijana Zrnic

We initiate a principled study of algorithmic collective action on digital platforms that deploy machine learning algorithms. We propose a simple theoretical model of a collective interacting with a firm's learning algorithm. The collective pools the data of participating individuals and executes an algorithmic strategy by instructing participants how to modify their own data to achieve a collective goal. We investigate the consequences of this model in three fundamental learning-theoretic settings: the case of a nonparametric optimal learning algorithm, a parametric risk minimizer, and gradient-based optimization. In each setting, we come up with coordinated algorithmic strategies and characterize natural success criteria as a function of the collective's size. Complementing our theory, we conduct systematic experiments on a skill classification task involving tens of thousands of resumes from a gig platform for freelancers. Through more than two thousand model training runs of a BERT-like language model, we see a striking correspondence emerge between our empirical observations and the predictions made by our theory. Taken together, our theory and experiments broadly support the conclusion that algorithmic collectives of exceedingly small fractional size can exert significant control over a platform's learning algorithm.

8/9/2024

The Role of Learning Algorithms in Collective Action

Omri Ben-Dov, Jake Fawkes, Samira Samadi, Amartya Sanyal

Collective action in machine learning is the study of the control that a coordinated group can have over machine learning algorithms. While previous research has concentrated on assessing the impact of collectives against Bayes (sub-)optimal classifiers, this perspective is limited in that it does not account for the choice of learning algorithm. Since classifiers seldom behave like Bayes classifiers and are influenced by the choice of learning algorithms along with their inherent biases, in this work we initiate the study of how the choice of the learning algorithm plays a role in the success of a collective in practical settings. Specifically, we focus on distributionally robust optimization (DRO), popular for improving a worst group error, and on the ubiquitous stochastic gradient descent (SGD), due to its inductive bias for simpler functions. Our empirical results, supported by a theoretical foundation, show that the effective size and success of the collective are highly dependent on properties of the learning algorithm. This highlights the necessity of taking the learning algorithm into account when studying the impact of collective action in machine learning.

6/5/2024

Algorithmic Collective Action in Recommender Systems: Promoting Songs by Reordering Playlists

Joachim Baumann, Celestine Mendler-Dunner

We investigate algorithmic collective action in transformer-based recommender systems. Our use case is a collective of fans aiming to promote the visibility of an artist by strategically placing one of their songs in the existing playlists they control. The success of the collective is measured by the increase in test-time recommendations of the targeted song. We introduce two easily implementable strategies towards this goal and test their efficacy on a publicly available recommender system model released by a major music streaming platform. Our findings reveal that even small collectives (controlling less than 0.01% of the training data) can achieve up to 25x amplification of recommendations by strategically choosing the position at which to insert the song. We then focus on investigating the externalities of the strategy. We find that the performance loss for the platform is negligible, and the recommendations of other songs are largely preserved, minimally impairing the user experience of participants. Moreover, the costs are evenly distributed among other artists. Taken together, our findings demonstrate how collective action strategies can be effective while not necessarily being adversarial, raising new questions around incentives, social dynamics, and equilibria in recommender systems.

4/9/2024

Evolving AI Collectives to Enhance Human Diversity and Enable Self-Regulation

Shiyang Lai, Yujin Potter, Junsol Kim, Richard Zhuang, Dawn Song, James Evans

Large language model behavior is shaped by the language of those with whom they interact. This capacity and their increasing prevalence online portend that they will intentionally or unintentionally program one another and form emergent AI subjectivities, relationships, and collectives. Here, we call upon the research community to investigate these societies of interacting artificial intelligences to increase their rewards and reduce their risks for human society and the health of online environments. We use a small community of models and their evolving outputs to illustrate how such emergent, decentralized AI collectives can spontaneously expand the bounds of human diversity and reduce the risk of toxic, anti-social behavior online. Finally, we discuss opportunities for AI cross-moderation and address ethical issues and design challenges associated with creating and maintaining free-formed AI collectives.

6/21/2024