DOI: 10.1145/3531146.3533073
research-article

Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash

Published: 20 June 2022

ABSTRACT

Apple recently revealed its deep perceptual hashing system, NeuralHash, which is designed to detect child sexual abuse material (CSAM) on user devices before files are uploaded to its iCloud service. Public criticism quickly arose regarding the protection of user privacy and the system’s reliability. In this paper, we present the first comprehensive empirical analysis of deep perceptual hashing based on NeuralHash. Specifically, we show that current deep perceptual hashing may not be robust. An adversary can manipulate the hash values by applying slight changes to images, either induced by gradient-based approaches or simply by performing standard image transformations, thereby forcing or preventing hash collisions. Such attacks allow malicious actors to exploit the detection system easily: from hiding abusive material to framing innocent users, everything is possible. Moreover, the hash values still permit inferences about the data stored on user devices. In our view, based on our results, deep perceptual hashing in its current form is generally not ready for robust client-side scanning and should not be used from a privacy perspective.
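
To make the notions of hash values and hash collisions under image transformations concrete, the following is a minimal sketch that uses a classical average hash as a stand-in. It is not NeuralHash and not the method evaluated in the paper; the function names `average_hash` and `hamming_distance`, the 8x8 hash size, and the synthetic test image are all assumptions made for illustration. The sketch only shows how one might compare hash bits before and after a standard transformation.

```python
import numpy as np


def average_hash(image: np.ndarray, hash_size: int = 8) -> np.ndarray:
    """Toy perceptual hash (assumption: a classical average hash, NOT NeuralHash).

    The grayscale image is averaged over a hash_size x hash_size grid of blocks,
    and each block contributes one bit: True if its mean exceeds the global
    mean of all block means.
    """
    height, width = image.shape
    block_h, block_w = height // hash_size, width // hash_size
    cropped = image[: block_h * hash_size, : block_w * hash_size]
    blocks = cropped.reshape(hash_size, block_h, hash_size, block_w).mean(axis=(1, 3))
    return (blocks > blocks.mean()).flatten()


def hamming_distance(hash_a: np.ndarray, hash_b: np.ndarray) -> int:
    """Number of differing bits; 0 means the two hashes collide."""
    return int(np.count_nonzero(hash_a != hash_b))


# Synthetic grayscale image with integer intensities (hypothetical test data).
rng = np.random.default_rng(0)
image = rng.integers(0, 200, size=(64, 64)).astype(np.float64)

# Two standard transformations: a uniform brightness shift and a horizontal flip.
brighter = image + 10.0
flipped = image[:, ::-1]

original_hash = average_hash(image)
# A uniform shift moves every block mean and the threshold equally,
# so this toy hash is unchanged and the two images collide.
print("brightness shift:", hamming_distance(original_hash, average_hash(brighter)))
# A flip mirrors the bit grid, so the hash generally changes.
print("horizontal flip: ", hamming_distance(original_hash, average_hash(flipped)))
```

In this toy setting, a transformation that leaves the hash unchanged corresponds to a preserved match, while one that alters bits corresponds to evading a match. Whether and how strongly a deep perceptual hash such as NeuralHash reacts to a given transformation is an empirical question that this sketch does not answer.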

Published in

ACM Other conferences
FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
June 2022
2351 pages
ISBN: 9781450393522
DOI: 10.1145/3531146

Copyright © 2022 ACM

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited
