Related papers: The 2020 Census Disclosure Avoidance System TopDow…
The 2020 Decennial Census will be released with a new disclosure avoidance system in place, putting differential privacy in the spotlight for a wide range of data users. We consider several key applications of Census data in redistricting,…
In "The 2020 Census Disclosure Avoidance System TopDown Algorithm," Abowd et al. (2022) describe the concepts and methods used by the Disclosure Avoidance System (DAS) to produce formally private output in support of the 2020 Census…
The US Census Bureau Disclosure Avoidance System (DAS) balances confidentiality and utility requirements for the decennial US Census (Abowd et al., 2022). The DAS was used in the 2020 Census to produce demographic datasets critically used…
To protect the confidentiality of the 2020 Census, the U.S. Census Bureau adopted a statistical disclosure limitation framework based on the principles of differential privacy. A key component was the TopDown Algorithm, which applied…
This paper extends $\texttt{InfTDA}$, a mechanism proposed in (Boninsegna, Silvestri, PETS 2025) for mobility datasets with origin and destination trips, in a general setting. The algorithm presented in this paper works for any dataset of…
Protecting an individual's privacy when releasing their data is inherently an exercise in relativity, regardless of how privacy is qualified or quantified. This is because we can only limit the gain in information about an individual…
This article describes the disclosure avoidance algorithm that the U.S. Census Bureau used to protect the Detailed Demographic and Housing Characteristics File A (Detailed DHC-A) of the 2020 Census. The tabulations contain statistics…
In 2018, the US Census Bureau designed a new data reconstruction and re-identification attack and tested it against their 2010 data release. The specific attack executed by the Bureau allows an attacker to infer the race and ethnicity of…
The United States Census Bureau faces a difficult trade-off between the accuracy of Census statistics and the protection of individual information. We conduct the first independent evaluation of bias and noise induced by the Bureau's two…
This article describes the disclosure avoidance algorithm that the U.S. Census Bureau used to protect the 2020 Census Supplemental Demographic and Housing Characteristics File (S-DHC). The tabulations contain statistics of counts of U.S.…
Differential privacy (DP) is increasingly used to protect the release of hierarchical, tabular population data, such as census data. A common approach for implementing DP in this setting is to release noisy responses to a predefined set of…
To meet its dual burdens of providing useful statistics and ensuring privacy of individual respondents, the US Census Bureau has for decades introduced some form of "noise" into published statistics. Initially, they used a method known as…
The US Census Bureau plans to protect the privacy of 2020 Census respondents through its Disclosure Avoidance System (DAS), which attempts to achieve differential privacy guarantees by adding noise to the Census microdata. By applying…
This article describes SafeTab-H, a disclosure avoidance algorithm applied to the release of the U.S. Census Bureau's Detailed Demographic and Housing Characteristics File B (Detailed DHC-B) as part of the 2020 Census. The tabulations…
This article describes a proposed differentially private (DP) algorithms that the US Census Bureau is considering to release the Detailed Demographic and Housing Characteristics (DHC) Race & Ethnicity tabulations as part of the 2020 Census.…
In early 2021, the US Census Bureau will begin releasing statistical tables based on the decennial census conducted in 2020. Because of significant changes in the data landscape, the Census Bureau is changing its approach to disclosure…
Disclosure avoidance (DA) systems are used to safeguard the confidentiality of data while allowing it to be analyzed and disseminated for analytic purposes. These methods, e.g., cell suppression, swapping, and k-anonymity, are commonly…
Data from the Decennial Census is published only after applying a disclosure avoidance system (DAS). Data users were shaken by the adoption of differential privacy in the 2020 DAS, a radical departure from past methods. The goal of this…
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for discovery of hidden semantic architecture of text datasets, and plays a fundamental role in many machine learning applications. However, like many other machine…
In 2017, the United States Census Bureau announced that because of high disclosure risk in the methodology (data swapping) used to produce tabular data for the 2010 census, a different protection mechanism based on differential privacy…