Abstract
The collection of digital information by governments, corporations, and individuals has created tremendous opportunities for knowledge- and information-based decision making. Driven by mutual benefits, or by regulations that require certain data to be published, there is a demand for the exchange and publication of data among various parties. Data in its original form, however, typically contains sensitive information about individuals, and publishing such data will violate individual privacy. The current practice in data publishing relies mainly on policies and guidelines as to what types of data can be published and on agreements on the use of published data. This approach alone may lead to excessive data distortion or insufficient protection. Privacy-preserving data publishing (PPDP) provides methods and tools for publishing useful information while preserving data privacy. Recently, PPDP has received considerable attention in research communities, and many approaches have been proposed for different data publishing scenarios. In this survey, we will systematically summarize and evaluate different approaches to PPDP, study the challenges in practical data publishing, clarify the differences and requirements that distinguish PPDP from other related problems, and propose future research directions.