ABSTRACT
Text-based conversational systems, also referred to as chatbots, have grown widely popular. Current natural language understanding technologies are not yet ready to tackle the complexities of conversational interactions. Breakdowns are common, leading to negative user experiences. Guided by communication theories, we explore user preferences for eight repair strategies, including ones that are common in commercially deployed chatbots (e.g., confirmation, providing options), as well as novel strategies that explain characteristics of the underlying machine learning algorithms. We conducted a scenario-based study with Mechanical Turk workers (N=203) to compare these repair strategies. We found that providing options and explanations were generally favored, as they manifest initiative from the chatbot and are actionable to recover from breakdowns. Through detailed analysis of participants' responses, we provide a nuanced understanding of the strengths and weaknesses of each repair strategy.
CHI 2019, May 4--9, 2019, Glasgow, Scotland, UK. Z. Ashktorab et al.
Index Terms
- Resilient Chatbots: Repair Strategy Preferences for Conversational Breakdowns