ABSTRACT
A key goal of the fair-ML community is to develop machine-learning-based systems that, once introduced into a social context, can achieve social and legal outcomes such as fairness, justice, and due process. Bedrock concepts in computer science, such as abstraction and modular design, are used to define notions of fairness and discrimination, to produce fairness-aware learning algorithms, and to intervene at different stages of a decision-making pipeline to produce "fair" outcomes. In this paper, however, we contend that these concepts render technical interventions ineffective, inaccurate, and sometimes dangerously misguided when they enter the societal context that surrounds decision-making systems. We outline this mismatch with five "traps" that fair-ML work can fall into even as it attempts to be more context-aware than traditional data science. We draw on studies of sociotechnical systems in Science and Technology Studies to explain why such traps occur and how to avoid them. Finally, we suggest ways in which technical designers can mitigate the traps by refocusing design on process rather than solutions, and by drawing abstraction boundaries to include social actors rather than purely technical ones.
Index Terms
- Fairness and Abstraction in Sociotechnical Systems