Beyond Numbers: How to Properly Evaluate Qualitative UX Research
Crabtree's Framework for Evaluating Human-Centered Research
Picture this: You've spent three weeks conducting qualitative research for a finance app redesign. You carefully recruited 12 participants, conducted in-depth interviews, and identified patterns around financial anxiety and decision paralysis. You're excited to present your findings when the inevitable happens:
"But are these results statistically significant?"
"Just 12 people? How can we make decisions that affect thousands of users based on conversations with just 12 people?"
As UX professionals, we regularly face stakeholders who evaluate our qualitative research using criteria designed for quantitative methods. This misalignment undermines the unique value qualitative research brings to product development.
The Clash of Research Paradigms in UX Practice
In his recent paper "H is for human and how (not) to evaluate qualitative research in HCI", Crabtree (2025) addresses this familiar challenge. Stakeholders often evaluate qualitative research through what he calls a "positivistic" lens that prioritises measurement and metrics (e.g., sample sizes, statistical significance, and generalisability).
This clash stems from two fundamentally different approaches to understanding human behaviour:
Positivism attempts to explain human conduct through causal relationships, using mathematics to find "objective truth" and law-like patterns. This is what stakeholders often expect when they ask for "hard data."
Interpretivism seeks to understand human conduct by interpreting its meaning within cultural and social contexts. It recognises that human behaviour is fundamentally meaningful to the people engaged in it.
Most qualitative UX research is interpretivist by nature. We're not trying to discover universal laws but to understand how people make sense of their interactions with a particular technology, product, or service.
Why "How Many Users" Is the Wrong Question
When stakeholders question findings from "just 8 interviews," they misunderstand a fundamental aspect of qualitative research. The value doesn't come from the number of participants but from the depth of understanding derived from those interactions.
Crabtree draws on sociologist Harvey Sacks to explain why even a small qualitative sample provides valuable insights. Individual participants are embedded in culture, and interviews reveal patterns that extend beyond individuals.
For example, when five of your eight finance app participants demonstrate anxiety about linking accounts to a budgeting feature, they're revealing cultural patterns around financial privacy and security. These patterns are valid regardless of whether they apply to 65% or 72% of your user base.
Five Better Ways to Evaluate Qualitative Research
So how should we evaluate qualitative research in UX? Drawing from Crabtree's framework, but adapting it for product development, we can focus on the following:
1. Methodological Transparency
Rather than asking "how many users did you talk to?", a more appropriate question is "how did you select participants and conduct your research to address potential biases?"
A researcher might explain: "We recruited participants who abandoned the goal-setting process. We conducted interviews in their homes to observe their natural environment and included both novice and experienced users."
This transparency helps stakeholders understand the research context rather than fixating on sample size.
2. Evidence-Based Insights
Crabtree uses the term "apodicity" to describe the self-evident nature of findings that are properly supported by data. In practice, this means showing a clear chain from observations to insights.
Compare these statements: "Users find the checkout confusing" versus "Four participants hesitated at the confirmation screen, with one participant stating 'I have no idea if my payment went through' while repeatedly checking for confirmation. Three users attempted to complete the purchase multiple times."
Which one is more convincing? The second statement makes the finding self-evident through specific evidence.
3. Conceptual Understanding
Crabtree also describes "sensitising concepts", analytical frameworks that help us understand patterns in human behaviour. This is perhaps qualitative research's most valuable contribution.
Instead of asking "is this representative of all users?", a better question is "does this help us understand user behaviour in a new or deeper way?"
A study with AI assistant users, for example, might reveal "capability testing", where users deliberately probe the system's boundaries. This concept explains user frustration and trust issues in ways usage statistics cannot.
4. Design Applicability
In UX practice, Crabtree's "analytic reach and utility" translates to whether research can inform design decisions effectively.
Rather than questioning if findings apply to all users, ask whether they provide actionable design direction. For example, in a fintech project, understanding that users mentally separate accounts for different purposes directly informs how we design account-linking features.
5. Business Relevance
For practical UX work, Crabtree's "relevance to the field" can become relevance to business goals.
Stakeholders should consider whether research addresses important business problems. Understanding why users abandon goal-setting directly impacts conversion, retention, and feature adoption regardless of the exact percentage affected.
Bridging the Gap with Stakeholders
As UX professionals, we can also help stakeholders understand how to appropriately evaluate qualitative insights. Some ways to do this are discussed below:
Different Types of Validity
When stakeholders question whether small-sample research is "valid," they're thinking about statistical validity. Help them understand there are different types:
"We're not claiming statistical validity (that these patterns appear in exactly 72% of users). We're establishing conceptual/construct validity (that these patterns exist and explain user behaviour). Think of it like discovering a new medical condition. The important first step is identifying that the condition exists, not determining exactly how many people have it."
Connecting to Quantitative Data
Position qualitative research as the "why" behind analytics:
"Our analytics show 72% of users abandon at the account linking step. These interviews reveal why: users have concerns about security and mental models about account separation. Understanding these patterns helps us design solutions that address root causes."
The Early Signal Approach
Position qualitative research as an early warning system:
"When 5 out of 8 diverse users struggle with the same issue, it's like smoke indicating fire. We don't need to measure the fire's exact size to know we should address it."
Discovery vs. Validation
Help stakeholders understand different research roles:
"Qualitative research is optimised for discovery — finding patterns and understanding user behaviour. It's not designed for statistical validation. Once we understand these patterns, we can use quantitative methods to test our solutions at scale."
The Power of Combined Approaches
Crabtree isn't arguing against quantitative methods. He's arguing against inappropriately applying quantitative standards to qualitative research. The strongest insights often combine approaches:
Quantitative shows what's happening: 72% abandon the goal-setting flow at account connection.
Qualitative reveals why: Users worry about security, are confused about account selection, and fear they can't reverse connections.
The powerful combination: "Our drop-off problem stems from specific trust concerns and mental model mismatches. By redesigning to address these specific issues, we can reduce the 72% abandonment rate."
A Cultural Shift in Research Evaluation
Perhaps Crabtree's most important insight is that the default "scientific" mindset many stakeholders bring to research evaluation isn't actually scientific. It's a philosophical position treating human behaviour as governed by the same laws as natural phenomena.
As UX practitioners, we can help organisations understand that studying humans requires different approaches than studying atoms. Human behaviour is meaningful, contextual, and interpretive, and our research methods must reflect that reality.
This isn't about lowering standards. Good qualitative research is incredibly rigorous requiring careful analysis, thoughtful interpretation, and clear evidence — as someone who started as a quantitative researcher, I can attest to that! However, standards for judging this rigour differ from statistical significance or sample representativeness.
If you're facing resistance to qualitative insights, here are some approaches you can try:
Frame expectations: "Today I'll share insights that reveal thinking patterns and mental models rather than percentages."
Make methodology transparent: Discuss participant selection and analysis methods.
Show, don't just tell: Use rich evidence including quotes, video clips, and observations.
Connect to business outcomes: Link insights directly to business metrics.
Propose testing/validation strategies: "Now that we understand why users abandon, we can design solutions and A/B test them." A sketch of what that validation could look like follows below.
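To make that last suggestion concrete, here is a minimal sketch of the kind of follow-up validation you might propose, using a standard two-proportion z-test (the traffic and abandonment numbers are hypothetical):

```python
from statistics import NormalDist

def two_proportion_z_test(x_a: int, n_a: int, x_b: int, n_b: int) -> float:
    """One-sided p-value that variant B's abandonment rate is lower than A's."""
    p_a, p_b = x_a / n_a, x_b / n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_a - p_b) / se
    return 1 - NormalDist().cdf(z)

# Hypothetical A/B result: old flow abandons 720 of 1000 sessions,
# the redesigned flow 640 of 1000.
p_value = two_proportion_z_test(720, 1000, 640, 1000)
print(f"p = {p_value:.5f}")  # well below 0.05: the improvement is unlikely to be noise
```

This is where statistical significance genuinely belongs: validating a solution at scale, after qualitative work has told you what to build.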
Conclusion: Embracing the Human in HCI
The "H" in HCI stands for Human, and understanding humans requires approaches that respect meaning and context. By evaluating qualitative research on its own terms, we leverage its unique insights properly.
Next time someone asks if your findings from eight interviews are "statistically significant," use it as an opportunity to educate them about qualitative value. By advocating for proper evaluation of qualitative research, you're championing a deeper understanding of the humans we design for, ultimately leading to more meaningful, impactful products.
Good insights! But I have a question that I have been wrestling with, and your article is timely. Interpretivism is a subjectivist epistemology that is situated in the social and cultural context of the individual, as you point out.
However, you also say "For example, when five of your eight finance app participants demonstrate anxiety about linking accounts to a budgeting feature, they're revealing cultural patterns around financial privacy and security. These patterns are valid regardless of whether they apply to 65% or 72% of your user base."
It seems to me these 'cultural patterns' are objectivist, not subjectivist, since they represent a predictable, repeatable pattern in the user population (even if we can't quantify it). So, wouldn't that make this way of performing qualitative research still fall within the positivist paradigm? Where would you draw the line between qualitative research that is positivist with a small sample size and research that is truly subjective and interpretivist?