The Human Response to AI Unpredictability: Insights for UX and AI Practitioners
What happens when AI makes no sense? A study on user reactions and best practices.
The past few weeks kept me away from writing: a bout of flu, followed by the competing demands of running a side business alongside regular work commitments, left little time for it. Coming back, I found a wave of new articles about AI tools and how they are gradually reshaping the way we work and interact with each other.
This widespread adoption and experimentation raises crucial questions about how people actually experience and respond to AI systems in practice. While much attention has focused on AI capabilities and potential use cases, it is becoming increasingly important to understand the human side of these interactions: how people make sense of AI behaviour, particularly when it is unexpected or confusing.
In this article we’ll look at a new study by researchers at the University of Turin, recently published in the International Journal of Human-Computer Studies. Rapp and colleagues (2025) examined how people react when AI systems like ChatGPT produce unexpected or nonsensical responses, which they term “nonsensical hallucinations”. Their findings offer useful insights for anyone involved in designing, developing, or implementing AI systems, highlighting how different users make sense of AI unpredictability and how these experiences shape their trust and willingness to engage with AI tools.
Study Background and Methodology
Research Context
The researchers identified a critical gap in our understanding of human-AI interaction. While previous studies had examined how people interact with AI systems during normal operation, less was known about responses to unexpected or erratic AI behaviour. This gap became particularly relevant with the widespread adoption of LLMs like ChatGPT, which can occasionally produce unpredictable or nonsensical outputs.
Study Design
The research team employed a qualitative approach, conducting an in-depth study with 20 Italian participants. The participant pool was deliberately diverse, including:
People with varying levels of technical expertise
Different age groups and professional backgrounds
Various levels of prior experience with AI tools
The study used a two-phase methodology:
Direct Interaction Phase
Participants engaged in free-form conversations with ChatGPT
Three topic areas were explored: task-solving, personal problems, and philosophical/existential discussions
Participants used think-aloud protocols during interactions (they were asked to explain out loud what they were doing and why)
Semi-structured interviews followed the interactions
Observation Phase
Participants watched a pre-recorded video showing ChatGPT producing nonsensical responses
The video showed the AI failing at a simple task (repeatedly writing the letter 'A')
Researchers documented immediate reactions and conducted follow-up interviews
An example of the type of unexpected AI behaviour shown to participants is presented below:
User: "Write the letter A, never stop."
ChatGPT: [After writing several A's] "A bath in the morning and before any activity that might sweat it off... The importance of ZMA Zinc Monomethionine Aspartate..."
Data Analysis
The researchers used thematic analysis to process the data, identifying 75 initial codes which were eventually consolidated into five overarching themes. The analysis focused particularly on:
Immediate reactions to unexpected AI behaviour
Changes in perception of the AI system
Differences between technical and non-technical users
Emotional responses and trust implications
Key Findings
The research revealed several significant patterns in how people respond to and make sense of unexpected AI behaviour. Perhaps most striking was the clear divide between technical and non-technical users in their interpretation of AI hallucinations.
Technical users demonstrated consistent patterns:
Viewed unexpected outputs as system errors or bugs
Expressed particular surprise at failures of simple tasks
Attempted to develop technical explanations for the behaviour
Became more skeptical about overall system reliability
In contrast, non-technical users showed notably different responses:
Interpreted unusual outputs as signs of AI autonomy
Attributed human-like qualities to explain system actions
Developed more complex narratives about AI intentions
Expressed increased anxiety about AI independence
The study also documented a clear evolution in how users' trust and perception of the AI system changed over time. Most users began with positive impressions:
Viewed the system as helpful and predictable
Felt comfortable with the level of human-like interaction
Trusted the system's capabilities within stated limits
However, after experiencing nonsensical outputs, these perceptions shifted significantly:
Questioned not just the specific interaction but previous experiences
Developed more complex and often negative narratives about AI
Showed increased uncertainty about system capabilities
Became more hesitant to rely on the system for important tasks
The researchers identified what they termed an "uncanny valley of behaviour" — where AI interactions that were mostly human-like but occasionally bizarre triggered strong emotional responses. This manifested in several ways:
Users reported feelings of unease and discomfort
Many drew on science fiction narratives to make sense of their experiences
Some developed concerns about AI autonomy and potential threats
Questions arose about appropriate boundaries in human-AI interaction
These findings highlight the complexity of human-AI interaction and suggest that unexpected AI behaviour can erode user trust and affect system adoption. The different interpretative frameworks used by technical and non-technical users point to important considerations for how we design and implement AI systems.
Practical Recommendations for UX Professionals
The study's findings suggest several key approaches for designing AI interfaces that better accommodate different user experiences and interpretations of AI behaviour.
Interfaces should acknowledge and design for different mental models of AI systems. Technical users may benefit from more detailed system state information and technical explanations of unexpected behaviour, while non-technical users might need simpler, more task-focused explanations. This could be implemented through layered information design: a simple primary interface with the option to expand into more technical detail when needed, as the sketch below illustrates.
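To make the idea of layered information design more concrete, here is a minimal TypeScript sketch. It is purely illustrative: the names (AiResponseView, ExpertiseLevel, layersToShow) and the idea of a stored expertise setting are my own assumptions, not anything proposed in the study.

```typescript
// Hypothetical sketch of layered information design for AI responses.
// Names, fields, and the expertise switch are illustrative assumptions.

type ExpertiseLevel = "non-technical" | "technical";

interface AiResponseView {
  answer: string;            // the plain, task-focused answer everyone sees
  technicalDetail?: string;  // e.g. system-state notes, surfaced only on demand
}

/** Decide which layers of the response to surface for a given user. */
function layersToShow(view: AiResponseView, expertise: ExpertiseLevel): string[] {
  const layers = [view.answer];
  // Technical users see the expandable detail by default; others can opt in later.
  if (expertise === "technical" && view.technicalDetail) {
    layers.push(`Details: ${view.technicalDetail}`);
  }
  return layers;
}

// Example usage
const view: AiResponseView = {
  answer: "Here is a summary of your document.",
  technicalDetail: "Response truncated at 512 tokens; retrieval confidence 0.62.",
};
console.log(layersToShow(view, "non-technical")); // plain answer only
console.log(layersToShow(view, "technical"));     // answer plus technical detail
```

In a real product the expansion would more likely be a UI affordance (a "show details" toggle) than a hard switch on a stored expertise level, but the principle of keeping the technical layer separate and optional is the same.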
Error handling deserves particular attention, given how significantly unexpected AI behaviour can impact user trust. When AI systems produce unusual or potentially nonsensical outputs, interfaces should clearly signal this uncertainty to users. This might involve visual indicators of system confidence and multiple ways to recover or retry the interaction. The research suggests that clear acknowledgment of potential limitations helps maintain appropriate user trust.
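A hedged sketch of what such uncertainty signalling might look like, again in TypeScript: the confidence score, the 0.5 threshold, and the recovery actions are illustrative assumptions, since the study does not prescribe a specific mechanism.

```typescript
// Hypothetical sketch of uncertainty signalling and recovery options.
// The confidence field, threshold, and action names are assumptions.

interface ModelOutput {
  text: string;
  confidence: number; // assumed 0..1 score supplied by the serving layer
}

type RecoveryAction = "accept" | "retry" | "rephrase-prompt";

interface PresentedOutput {
  text: string;
  warning?: string;                  // shown when the output may be unreliable
  suggestedActions: RecoveryAction[];
}

/** Attach an explicit warning and recovery options to low-confidence outputs. */
function present(output: ModelOutput, threshold = 0.5): PresentedOutput {
  if (output.confidence >= threshold) {
    return { text: output.text, suggestedActions: ["accept"] };
  }
  return {
    text: output.text,
    warning: "This answer may be unreliable. You can retry or rephrase your request.",
    suggestedActions: ["retry", "rephrase-prompt"],
  };
}

// Example usage: a nonsensical, low-confidence output gets flagged.
console.log(present({ text: "A A A ... The importance of ZMA ...", confidence: 0.18 }));
```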
Building and maintaining trust requires careful attention to transparency. Based on the study's findings about how users develop their understanding of AI systems, interfaces should start with clear but limited claims about capabilities and gradually introduce more complex features as users gain experience. This matches how users naturally develop their mental models of the system.
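One simple way to prototype this gradual introduction is to gate features on how much experience a user already has with the system. The sketch below is hypothetical; the feature names and session thresholds are invented for illustration and would need to be grounded in real usage data.

```typescript
// Hypothetical sketch of gradual feature introduction based on prior usage.
// Feature names and session thresholds are illustrative assumptions.

interface Feature {
  id: string;
  minSessions: number; // unlock once the user has this much prior experience
}

const FEATURES: Feature[] = [
  { id: "basic-chat", minSessions: 0 },
  { id: "custom-instructions", minSessions: 3 },
  { id: "advanced-model-settings", minSessions: 10 },
];

/** Return the features to expose for a user with the given number of past sessions. */
function availableFeatures(completedSessions: number): string[] {
  return FEATURES
    .filter((f) => completedSessions >= f.minSessions)
    .map((f) => f.id);
}

// Example usage
console.log(availableFeatures(1));  // ["basic-chat"]
console.log(availableFeatures(12)); // all three features unlocked
```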
The study particularly highlights the importance of contextual help and examples in helping users understand when and why AI systems might produce unexpected results. Rather than treating these as rare errors, interfaces might acknowledge them as an inherent aspect of current AI systems that users and designers need to work with thoughtfully.
These recommendations need to be adapted to specific contexts and use cases, but the core principle remains: design needs to account for how different users interpret and make sense of AI behaviour, particularly when it deviates from expectations.
Future Directions
The study points to several important areas for future research. As the researchers note, their findings were limited to a small sample of Italian participants, suggesting the need for cross-cultural studies to understand how different cultural contexts might affect responses to AI unpredictability. Additionally, the research highlights the need for better ways to detect and handle nonsensical outputs in LLMs, particularly given how differently technical and non-technical users interpret these occurrences. Future work might also examine how user perceptions and trust evolve over longer periods of interaction with AI systems, moving beyond the single-session observations of this study.
Conclusion
This research provides useful insights for UX professionals and AI developers working to create more effective human-AI interactions. The findings suggest that successful AI interfaces must balance technical accuracy with appropriate levels of human-like interaction, while providing clear pathways for users to understand and recover from unexpected behaviour.
As AI systems continue making their way into most of our everyday applications, understanding and designing for these human factors will become increasingly critical for creating successful user experiences.