Words Matter: How Terminology Affects User Responses to AI System Failures
Managing AI Expectations Through Language
Trust in automated systems and algorithms has been a major focus of human-computer interaction research. Prior studies have demonstrated that people are prone to "algorithm aversion": they lose trust in algorithmic systems more quickly than in human advisors after seeing them make mistakes (Dietvorst et al., 2015). This effect persists even when the algorithm is known to perform better overall than a human (Dietvorst et al., 2018).
However, as artificial intelligence (AI) systems become more advanced and ubiquitous, they may elicit responses more akin to those we show human collaborators. The specific terminology used to describe these systems also appears to shape user perceptions and expectations (see my previous article on this). Yet the impact of labelling a system "AI" versus using other terms remains under-explored in the context of rebuilding trust after failures. Does the terminology we use to refer to an intelligent system affect the way users react to it after a mistake? Is there a way to minimise "algorithm aversion"?
A new study by Candrian and Scherer investigated how using the label "AI" versus "algorithmic system" affects user trust, satisfaction, and acceptance when errors occur. Their research provides more evidence that terminology significantly impacts attributions and behavioural responses.
Key Findings
Across three online experiments, the researchers tested user reactions to forecasting errors from a system described as either "AI" or "algorithmic."
In the first study, the system predicted student GPAs. The second study varied task complexity by having the system forecast either stock prices or cryptocurrency values. The third replicated the stock-prediction task but also manipulated social accounts: half the participants received an explanation for the error.
After identical simulated mistakes, trust declined more sharply and recovered more slowly when the system was labelled "AI" rather than "algorithmic". Attributions of lower system control, whether through greater task difficulty or through social accounts explaining the error, reduced these differences.
The researchers measured how much people trusted the system over multiple interactions, how quickly trust was rebuilt once it had been lost, and how satisfied people were with the system. They found that when a system was labelled "AI" rather than just an "algorithm", people perceived it as more complex, more adaptable, and less predictable. As a result, trust in the AI system was slower to build initially, was lost more severely when the system made mistakes, and took longer to regain than trust in a basic algorithmic system. Overall, people were less satisfied with the AI system and felt less stable trust towards it than towards the algorithmic system.
The researchers used attribution theory to explain this. Attribution theory holds that people try to understand why someone or something acts the way it does by attributing causes and motivations to it. The "AI" label led people to attribute more complex motivations, greater abilities, and more unpredictability to the system than the plainer "algorithm" label did. This resulted in less stable trust in the AI system.
What Are the Implications?
These findings have important practical and ethical implications for the design of automated systems and communication regarding their capabilities and limitations.
Results suggest that strategic terminology choices significantly impact user trust, satisfaction, and adoption.
Something as simple as calling a system "AI" rather than "algorithmic" changes how resilient user trust is after failures. Companies need to manage expectations carefully when marketing AI or algorithmic systems, to avoid overpromising capabilities. Setting proper expectations through transparent language, demonstrating limitations, and not overstating abilities in advertising is crucial, not just for user adoption but also for ethical and responsible AI practice. Overclaiming "AI" abilities can not only mislead people (Bini, 2018) but also have dangerous consequences, as demonstrated by high-profile cases such as the fatal accidents involving Tesla's Autopilot or the biases found in hiring algorithms.
Transparency and explainability are also essential for maintaining user trust, especially when errors occur. A lack of explanation for why a system made a certain decision or mistake erodes trust over time. Companies should strive to make AI systems more interpretable to users, both in terms of their overall capabilities and in terms of why individual errors happened. Even simple explanations can help preserve trust and satisfaction, as Candrian and Scherer's study shows. However, developing interpretable "glass box" AI methods remains an ongoing research challenge, as many advanced systems rely on complex "black box" approaches. Finding the right balance is key, and making users aware of the complex nature of the task can also lessen their negative reactions.
For UX professionals in particular, carefully considering the terminology used in interfaces and guides is important, as labels shape user perceptions and satisfaction. We should aim for transparency in explaining system capabilities, incorporate explainability features for decisions and errors, and give users ways to monitor performance over time. Advocating for auditing practices during development and conducting user testing are also key. On the research side, evaluating the factors that erode trust in AI systems and developing new methods to increase interpretability should be priorities when building a new system.
These findings further stress the importance of conducting research before deploying AI products.
Testing systems extensively, auditing performance post-deployment, designing in ongoing quality checks, and building in the ability to evaluate error rates are all part of accountable AI system development. These practices help maintain user trust and satisfaction.