Card sorting: how many participants do you really need?

How many users should you recruit? 5? 15? 20? 30?

Jun 07, 2022


Card sorting is a very popular technique in UX research. According to the Nielsen Norman Group, “card sorting is a UX research method in which study participants group individual labels written on notecards according to criteria that make sense to them. This method uncovers how the target audience’s domain knowledge is structured, and it serves to create an information architecture that matches users’ expectations.”

It was originally developed by psychologists as a method to study how people organise and categorise their knowledge, and it was known as the “multiple sorting procedure” (MSP). It was mostly used by environmental psychologists [1] in the late 70s, and the method was very similar to what most of us do now: researchers presented participants with cards bearing labels that represented concepts (either abstract or concrete) and asked them to sort the cards into categories whose members were similar in some way. Once they had finished sorting, participants were asked to give each category a name that represented the characteristic the concepts had in common. Initially, card sorting was paper-based, but now most researchers administer it on a computer (or online); no differences have been found between paper and computer-based versions.

There are three main types of card sorting:

Open card sort: In an open card sort, there are no preset categories. Participants are given a list of cards and are asked to create their own categories and label them. Open sorting is a method of generative research and it’s used to uncover the user mental model (e.g., how do users classify concepts and what terms do they use to describe them?).
Closed card sort: In a closed card sort, participants are given a predetermined set of categories that are already labeled and are asked to place the cards into those categories. Closed card sorting does not reveal how users conceptualize a set of topics and is considered an evaluative method; it reveals the degree to which the participants agree with the pre-existing categorization.
Hybrid card sort: In a hybrid card sort, participants are given a set of categories but they are also asked to create categories of their own. This method is used when we already have some established categories but want to see whether other categories make more sense to them.

How many participants do you need?

The main quantitative measure we get in a card sorting study is a set of similarity scores, which measure how often participants grouped each pair of items together. More specifically, if all users sorted two cards into the same category, then the two items represented by the cards would have 100% similarity. If half of the participants sorted the two cards into the same category and the other half placed them in different categories, the cards would have 50% similarity.
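To make the similarity score concrete, here is a minimal Python sketch of one way to compute it. The function name `similarity_scores`, the card labels, and the data layout (each participant's sort as a list of sets of card labels) are assumptions made for illustration; card sorting tools may compute or present the scores differently.

```python
from itertools import combinations

def similarity_scores(sorts):
    """Map each card pair to the fraction of participants who grouped the pair together."""
    cards = sorted({card for sort in sorts for group in sort for card in group})
    counts = {pair: 0 for pair in combinations(cards, 2)}
    for sort in sorts:
        for group in sort:
            # Every pair of cards inside the same group counts as one agreement.
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return {pair: n / len(sorts) for pair, n in counts.items()}

# Hypothetical data: four participants, three cards.
sorts = [
    [{"Shipping", "Returns"}, {"Login"}],
    [{"Shipping", "Returns"}, {"Login"}],
    [{"Shipping"}, {"Returns", "Login"}],
    [{"Shipping"}, {"Returns", "Login"}],
]
scores = similarity_scores(sorts)
print(scores[("Returns", "Shipping")])  # 0.5: half of the participants grouped these two together
```

In this made-up example, "Login" and "Shipping" are never placed together, so their similarity is 0.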

Only a few studies have attempted to examine what the ideal sample size for a card sorting study is. In 2004, Tullis and Wood tested 168 users and then simulated the outcome of running card sorting studies with smaller sample sizes (e.g., 10, 20, 30, etc.). They then assessed the outcome of the smaller card sorting studies by looking at how well their similarity scores correlated with the scores obtained from testing the large user group. A correlation shows the strength of the relationship between two sets of scores and can be between -1 and 1, with 1 indicating that the two datasets are perfectly correlated and 0 that there’s no relationship.
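The subsampling idea can be sketched in Python as follows. This is an illustrative simulation on synthetic data, not Tullis and Wood's actual procedure or dataset: the participant model (`simulate_participant`), the 40% noise level, the 20 cards, the 4 groups, and the 50 trials per sample size are all assumptions, so the correlations it prints will not reproduce the published figures.

```python
import random
from itertools import combinations
from math import sqrt

def similarity_vector(sorts, cards):
    """Fraction of participants who grouped each card pair together, in a fixed pair order."""
    return [sum(s[a] == s[b] for s in sorts) / len(sorts)
            for a, b in combinations(cards, 2)]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def simulate_participant(cards, true_group, n_groups, noise, rng):
    """Hypothetical participant: puts each card in its 'true' group,
    but picks a random group instead with probability `noise`."""
    return {c: rng.randrange(n_groups) if rng.random() < noise else true_group[c]
            for c in cards}

rng = random.Random(0)  # fixed seed so the run is repeatable
cards = [f"card{i}" for i in range(20)]
true_group = {c: i % 4 for i, c in enumerate(cards)}

# Synthetic "full study" of 168 participants (each sort maps card -> group id).
full = [simulate_participant(cards, true_group, 4, 0.4, rng) for _ in range(168)]
full_scores = similarity_vector(full, cards)

def mean_correlation(n, trials=50):
    """Average correlation between random n-person subsamples and the full sample."""
    rs = [pearson(similarity_vector(rng.sample(full, n), cards), full_scores)
          for _ in range(trials)]
    return sum(rs) / len(rs)

for n in (5, 15, 30, 60):
    print(n, round(mean_correlation(n), 3))
```

With real study data, `full` would be replaced by the recorded sorts; the rest of the sketch stays the same.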

Tullis and Wood found that 20-30 participants are enough for most card sorting studies and provide results similar to those obtained with hundreds of participants; the correlation between the similarity scores obtained with 30 participants and those from the full study with 168 was 0.95. Jakob Nielsen looked at the data from Tullis & Wood and suggested that testing 15 users should be enough for most studies, as it gives a correlation of 0.90. Doubling the number of participants to obtain a correlation of 0.95 isn’t usually worth the extra cost and the effort required to analyse the data.

Correlation coefficients for various sample sizes, with error bars from Tullis & Wood (2004)

A more recent study further supported the previous findings, showing that in most cases 10-15 participants are a good sample size for a card sorting study.

To summarise:

A sample of 15 participants should be sufficient for most projects, especially if you have limited resources.
If you’re working on a project with a bigger splash zone (one that could affect multiple areas of the business), recruit more users (20-30).

[1] Another type of card sorting, used by cognitive psychologists in clinical settings, is the Wisconsin Card Sorting Task (WCST), which was introduced in 1946.

UX Psychology
