ABSTRACT
Privacy-preserving data processing has become an important topic because advances in hardware technology have led to the widespread proliferation of demographic and sensitive data. A rudimentary way to preserve privacy is simply to hide the information in the sensitive fields picked by a user. However, such a method is far from satisfactory in its ability to prevent adversarial data mining. Real data records are not randomly distributed, so some fields in the records may be correlated with one another. If the correlation is sufficiently high, an adversary may be able to predict some of the sensitive fields from the other fields. In this paper, we study the problem of privacy preservation against adversarial data mining: hiding a minimal set of entries so that the privacy of the sensitive fields is satisfactorily preserved. In other words, even with data mining, an adversary cannot accurately recover the hidden data entries. We model the problem concisely and develop an efficient heuristic algorithm that finds good solutions in practice. An extensive performance study on both synthetic and real data sets examines the effectiveness of our approach.
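The leakage described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's heuristic algorithm: the toy records, the field names, and the helper functions `hide` and `adversary_guess` are all hypothetical, and the adversary is assumed to use a simple majority vote over records that share the same visible field value.

```python
from collections import Counter

# Hypothetical toy records: (job, income). "income" is the sensitive field.
records = [
    ("surgeon", "high"), ("surgeon", "high"), ("surgeon", "high"),
    ("clerk", "low"), ("clerk", "low"), ("clerk", "high"),
]

HIDDEN = None  # marker for an entry that has been blanked out


def hide(rows, indices, column):
    """Return a copy of rows with `column` blanked out in the given row indices."""
    out = [list(r) for r in rows]
    for i in indices:
        out[i][column] = HIDDEN
    return [tuple(r) for r in out]


def adversary_guess(rows, target, known_col=0, hidden_col=1):
    """Guess the hidden entry of row `target` by majority vote.

    Votes come from the other rows that share the target's visible
    `known_col` value and still expose `hidden_col`.
    Returns (guess, confidence), or (None, 0.0) if nothing can be inferred.
    """
    key = rows[target][known_col]
    if key is HIDDEN:
        return None, 0.0
    votes = Counter(
        r[hidden_col]
        for i, r in enumerate(rows)
        if i != target and r[known_col] == key and r[hidden_col] is not HIDDEN
    )
    if not votes:
        return None, 0.0
    value, count = votes.most_common(1)[0]
    return value, count / sum(votes.values())


# Hiding only the sensitive entry leaves the correlated field exposed.
naive = hide(records, [0], column=1)
print(adversary_guess(naive, 0))

# Hiding the correlated entry as well removes the basis for the guess.
stronger = hide(naive, [0], column=0)
print(adversary_guess(stronger, 0))
```

In this toy setting, blanking the income entry alone does not help, because the visible job field still predicts it. Hiding one additional entry defeats this particular adversary, which is the kind of trade-off the minimal-hiding problem is concerned with.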