Toward Privacy in Public Databases

Chawla, Shuchi; Dwork, Cynthia; McSherry, Frank; Smith, Adam; Wee, Hoeteck

doi:10.1007/978-3-540-30576-7_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3378)

Included in the following conference series:

Theory of Cryptography Conference

Abstract

We initiate a theoretical study of the census problem. Informally, in a census individual respondents give private information to a trusted party (the census bureau), who publishes a sanitized version of the data. There are two fundamentally conflicting requirements: privacy for the respondents and utility of the sanitized data. Unlike in the study of secure function evaluation, in which privacy is preserved to the extent possible given a specific functionality goal, in the census problem privacy is paramount; intuitively, things that cannot be learned “safely” should not be learned at all.

An important contribution of this work is a definition of privacy (and privacy compromise) for statistical databases, together with a method for describing and comparing the privacy offered by specific sanitization techniques. We obtain several privacy results using two different sanitization techniques, and then show how to combine them via cross training. We also obtain two utility results involving clustering.

A full version of this paper may be found on the World Wide Web at http://research.microsoft.com/research/sv/DatabasePrivacy/.

Author information

Authors and Affiliations

Carnegie Mellon University: Shuchi Chawla
Microsoft Research SVC: Cynthia Dwork, Frank McSherry
Weizmann Institute of Science: Adam Smith
University of California, Berkeley: Hoeteck Wee

Editor information

Editors and Affiliations

Rutgers University, New Brunswick, NJ, USA: Joe Kilian

About this paper

Cite this paper

Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H. (2005). Toward Privacy in Public Databases. In: Kilian, J. (ed.) Theory of Cryptography. TCC 2005. Lecture Notes in Computer Science, vol 3378. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30576-7_20

DOI: https://doi.org/10.1007/978-3-540-30576-7_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24573-5
Online ISBN: 978-3-540-30576-7
eBook Packages: Computer Science, Computer Science (R0)
