Algorithmic Transparency and Participation through the Handoff Lens: Lessons Learned from the U.S. Census Bureau's Adoption of Differential Privacy

arXiv:2405.19187

Published 5/30/2024 by Amina A. Abdu, Lauren M. Chambers, Deirdre K. Mulligan, Abigail Z. Jacobs

Abstract

Emerging discussions on the responsible government use of algorithmic technologies propose transparency and public participation as key mechanisms for preserving accountability and trust. But in practice, the adoption and use of any technology shifts the social, organizational, and political context in which it is embedded. Therefore translating transparency and participation efforts into meaningful, effective accountability must take into account these shifts. We adopt two theoretical frames, Mulligan and Nissenbaum's handoff model and Star and Griesemer's boundary objects, to reveal such shifts during the U.S. Census Bureau's adoption of differential privacy (DP) in its updated disclosure avoidance system (DAS) for the 2020 census. This update preserved (and arguably strengthened) the confidentiality protections that the Bureau is mandated to uphold, and the Bureau engaged in a range of activities to facilitate public understanding of and participation in the system design process. Using publicly available documents concerning the Census' implementation of DP, this case study seeks to expand our understanding of how technical shifts implicate values, how such shifts can afford (or fail to afford) greater transparency and participation in system design, and the importance of localized expertise throughout. We present three lessons from this case study toward grounding understandings of algorithmic transparency and participation: (1) efforts towards transparency and participation in algorithmic governance must center values and policy decisions, not just technical design decisions; (2) the handoff model is a useful tool for revealing how such values may be cloaked beneath technical decisions; and (3) boundary objects alone cannot bridge distant communities without trusted experts traveling alongside to broker their adoption.

Overview

This paper examines the U.S. Census Bureau's adoption of differential privacy, a technique to protect individual privacy in data analysis, and the challenges it faced around algorithmic transparency and public participation.
The authors use Mulligan and Nissenbaum's "handoff" model, together with Star and Griesemer's concept of boundary objects, to analyze how the Census Bureau navigated the transition from traditional statistical disclosure limitation methods to differential privacy, and the lessons learned.
Key topics include balancing privacy guarantees, stakeholder engagement, and communicating the tradeoffs involved in adopting differential privacy.

Plain English Explanation

The U.S. Census Bureau is responsible for collecting and reporting critical demographic data about the American population. For the 2020 census, it began using a privacy protection technique called differential privacy to help safeguard the privacy of individuals in the census data.

Differential privacy works by intentionally adding a controlled amount of random "noise" to the statistics computed from the data, making it much harder for anyone to identify specific individuals. This helps protect people's privacy, but it also changes the published numbers in ways that can affect the accuracy and usefulness of the census results.
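To make the idea of added noise concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a single count. It is not the Census Bureau's disclosure avoidance system, whose design is considerably more involved; the function names, the example population of 120, and the epsilon values are all assumptions chosen for illustration.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws follows a Laplace(0, 1)
    # distribution; multiplying by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    # Hypothetical helper: release a count with epsilon-differential privacy.
    # Adding or removing one person changes a count by at most 1, so the
    # sensitivity defaults to 1.
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)          # fixed seed so the illustration is repeatable
block_population = 120          # made-up value, not real census data
for eps in (0.1, 1.0, 10.0):
    print(eps, round(noisy_count(block_population, eps, rng), 1))
```

Smaller values of epsilon mean stronger privacy and noisier output; larger values mean the released number stays closer to the true count. That is the privacy-accuracy tradeoff in miniature.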

The transition to differential privacy has been challenging for the Census Bureau. The Bureau has had to figure out how to balance the need for privacy with the need for accurate, transparent data that the public can trust. This paper looks at the lessons from that process, using a "handoff" framework to understand how the Bureau navigated the tradeoffs and engaged stakeholders.

Some key insights include the importance of clearly communicating the privacy-accuracy tradeoffs, the value of involving diverse stakeholders in the decision-making process, and the need for ongoing monitoring and adjustment as the Census Bureau continues to adopt these new privacy techniques.

Technical Explanation

The paper examines the U.S. Census Bureau's adoption of differential privacy, a privacy-preserving data analysis technique, and the associated challenges around algorithmic transparency and public participation.

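The authors use a "handoff" lens, drawn from Mulligan and Nissenbaum's handoff model, to analyze how the Census Bureau navigated the transition from traditional statistical disclosure limitation methods to differential privacy. They pair it with Star and Griesemer's concept of boundary objects. Together, the two frames are used to show how a technical change shifts the surrounding social, organizational, and political context, and how values and policy choices can end up hidden beneath technical design decisions.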

Key elements of the technical explanation include:

The Census Bureau's motivations for adopting differential privacy, including increasing privacy guarantees and responding to external pressures
The process of implementing differential privacy, including defining privacy loss budgets and evaluating accuracy trade-offs
Challenges in communicating the technical details and implications of differential privacy to diverse stakeholders, including policymakers, researchers, and the general public
Efforts to engage stakeholders and incorporate feedback into the differential privacy design and deployment
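As a rough illustration of what defining a privacy loss budget and weighing accuracy can involve, the sketch below splits a total epsilon across several groups of statistics and reports the expected error for each under a plain Laplace mechanism. It assumes simple sequential composition, where per-query epsilons add up to the total; the Bureau's actual accounting and noise distribution may differ. The geographic levels, weights, and total budget are hypothetical.

```python
def split_budget(total_epsilon, weights):
    # Hypothetical helper: divide a total privacy loss budget in proportion
    # to the given weights. Under sequential composition the shares sum to
    # the total budget.
    total_weight = sum(weights.values())
    return {name: total_epsilon * w / total_weight for name, w in weights.items()}

def expected_abs_error(epsilon, sensitivity=1.0):
    # For Laplace noise with scale b = sensitivity / epsilon, the expected
    # absolute error equals b.
    return sensitivity / epsilon

# Made-up weights: a larger weight gives that level a bigger share of the budget.
weights = {"state": 1, "county": 2, "block": 1}
shares = split_budget(total_epsilon=4.0, weights=weights)

for level, eps in shares.items():
    print(level, "epsilon =", eps, "expected |error| =", expected_abs_error(eps))
```

Shifting weight toward one level reduces its expected error but raises the error everywhere else. Deciding who bears that error is a value judgment as much as a technical one, which is the kind of decision the paper argues transparency efforts should surface.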

The paper also discusses the Census Bureau's approach to monitoring and adjusting the differential privacy implementation over time, as well as ongoing challenges around transparency and public trust.

Critical Analysis

The paper provides a valuable case study of the practical challenges involved in deploying differential privacy at scale within a critical government institution. The authors acknowledge several limitations and areas for further research, including:

The need for more nuanced models of stakeholder engagement and participation, beyond the binary of "transparent" versus "opaque" algorithms
The difficulty of communicating complex technical concepts like differential privacy to diverse audiences with varying levels of statistical and computational literacy
The potential for differential privacy to introduce new biases or skew the representativeness of census data, which requires ongoing monitoring and adjustment

Additionally, the paper does not delve deeply into some of the broader societal implications and ethical considerations of deploying differential privacy in the census, such as the potential impact on marginalized communities or the tradeoffs between individual privacy and collective benefits.

Further research could explore these areas in more depth, as well as investigating alternative privacy-preserving data analysis techniques and their relative merits and drawbacks.

Conclusion

This paper provides valuable insights into the practical challenges of moving a critical government data system, the U.S. census, to an advanced privacy-preserving technique such as differential privacy. The authors' use of the "handoff" lens highlights the importance of stakeholder engagement, algorithmic transparency, and ongoing monitoring and adjustment as organizations navigate these complex technical and sociopolitical issues.

The lessons learned from the Census Bureau's experience offer important guidance for other government agencies, researchers, and policymakers who are grappling with the need to balance individual privacy, data accuracy, and public trust in an era of increasing data collection and computational power. Continued dialogue and collaboration will be essential as these issues evolve.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Centering Policy and Practice: Research Gaps around Usable Differential Privacy

Rachel Cummings, Jayshree Sarathy

As a mathematically rigorous framework that has amassed a rich theoretical literature, differential privacy is considered by many experts to be the gold standard for privacy-preserving data analysis. Others argue that while differential privacy is a clean formulation in theory, it poses significant challenges in practice. Both perspectives are, in our view, valid and important. To bridge the gaps between differential privacy's promises and its real-world usability, researchers and practitioners must work together to advance policy and practice of this technology. In this paper, we outline pressing open questions towards building usable differential privacy and offer recommendations for the field, such as developing risk frameworks to align with user needs, tailoring communications for different stakeholders, modeling the impact of privacy-loss parameters, investing in effective user interfaces, and facilitating algorithmic and procedural audits of differential privacy systems.

6/19/2024

cs.CR cs.CY cs.HC

ATTAXONOMY: Unpacking Differential Privacy Guarantees Against Practical Adversaries

Rachel Cummings, Shlomi Hod, Jayshree Sarathy, Marika Swanberg

Differential Privacy (DP) is a mathematical framework that is increasingly deployed to mitigate privacy risks associated with machine learning and statistical analyses. Despite the growing adoption of DP, its technical privacy parameters do not lend themselves to an intelligible description of the real-world privacy risks associated with that deployment: the guarantee that most naturally follows from the DP definition is protection against membership inference by an adversary who knows all but one data record and has unlimited auxiliary knowledge. In many settings, this adversary is far too strong to inform how to set real-world privacy parameters. One approach for contextualizing privacy parameters is via defining and measuring the success of technical attacks, but doing so requires a systematic categorization of the relevant attack space. In this work, we offer a detailed taxonomy of attacks, showing the various dimensions of attacks and highlighting that many real-world settings have been understudied. Our taxonomy provides a roadmap for analyzing real-world deployments and developing theoretical bounds for more informative privacy attacks. We operationalize our taxonomy by using it to analyze a real-world case study, the Israeli Ministry of Health's recent release of a birth dataset using DP, showing how the taxonomy enables fine-grained threat modeling and provides insight towards making informed privacy parameter choices. Finally, we leverage the taxonomy towards defining a more realistic attack than previously considered in the literature, namely a distributional reconstruction attack: we generalize Balle et al.'s notion of reconstruction robustness to a less-informed adversary with distributional uncertainty, and extend the worst-case guarantees of DP to this average-case setting.

5/6/2024

cs.CR cs.CY

Unified Locational Differential Privacy Framework

Aman Priyanshu, Yash Maurya, Suriya Ganesh, Vy Tran

Aggregating statistics over geographical regions is important for many applications, such as analyzing income, election results, and disease spread. However, the sensitive nature of this data necessitates strong privacy protections to safeguard individuals. In this work, we present a unified locational differential privacy (DP) framework to enable private aggregation of various data types, including one-hot encoded, boolean, float, and integer arrays, over geographical regions. Our framework employs local DP mechanisms such as randomized response, the exponential mechanism, and the Gaussian mechanism. We evaluate our approach on four datasets representing significant location data aggregation scenarios. Results demonstrate the utility of our framework in providing formal DP guarantees while enabling geographical data analysis.

5/8/2024

cs.AI cs.CY

A Systematic and Formal Study of the Impact of Local Differential Privacy on Fairness: Preliminary Results

Karima Makhlouf, Tamara Stefanovic, Heber H. Arcolezi, Catuscia Palamidessi

Machine learning (ML) algorithms rely primarily on the availability of training data, and, depending on the domain, these data may include sensitive information about the data providers, thus leading to significant privacy issues. Differential privacy (DP) is the predominant solution for privacy-preserving ML, and the local model of DP is the preferred choice when the server or the data collector are not trusted. Recent experimental studies have shown that local DP can impact ML prediction for different subgroups of individuals, thus affecting fair decision-making. However, the results are conflicting in the sense that some studies show a positive impact of privacy on fairness while others show a negative one. In this work, we conduct a systematic and formal study of the effect of local DP on fairness. Specifically, we perform a quantitative study of how the fairness of the decisions made by the ML model changes under local DP for different levels of privacy and data distributions. In particular, we provide bounds in terms of the joint distributions and the privacy level, delimiting the extent to which local DP can impact the fairness of the model. We characterize the cases in which privacy reduces discrimination and those with the opposite effect. We validate our theoretical findings on synthetic and real-world datasets. Our results are preliminary in the sense that, for now, we study only the case of one sensitive attribute, and only statistical disparity, conditional statistical disparity, and equal opportunity difference.

5/24/2024

cs.LG cs.CR