Jeff Sauro • August 13, 2013

Some people think that if you have a small sample size you can't use statistics. Put simply, this is wrong, but it's a common misconception.

There are appropriate statistical methods to deal with small sample sizes.

Although one researcher's "small" is another's large, when I refer to small sample sizes I mean studies that have typically between 5 and 30 users total—a size very common in usability studies.

But user research isn't the only field that deals with small sample sizes. Studies involving fMRIs, which cost a lot to operate, have limited sample sizes as well, as do studies using laboratory animals.

While there are equations that allow us to properly handle small "n" studies, it's important to know that there are limitations to these smaller sample studies: you are limited to seeing big differences or big "effects."

To put it another way,

Just as with statistics, just because you don't have a large sample size doesn't mean you cannot use statistics. Again, the key limitation is that you are limited to detecting large differences between designs or measures.

Fortunately, in user-experience research we are often most concerned about these big differences—differences users are likely to notice, such as changes in the navigation structure or the improvement of a search results page.

Here are the procedures which we've tested for common, small-sample user research, and we will cover them all at the UX Boot Camp in Denver next month.

For example, if you wanted to know if users would read a sheet that said "Read this first" when installing a printer, and six out of eight users didn't read the sheet in an installation study, you'd know that at least 40% of all users would likely skip it as well, a substantial proportion.
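The 40% figure can be checked with a short calculation. The sketch below assumes a 95% adjusted-Wald (Agresti-Coull) interval, a common choice for small-sample binary data; the text above does not name the method, so treat that choice, the function name `adjusted_wald_interval`, and the default `z=1.96` as assumptions made for illustration.

```python
import math

def adjusted_wald_interval(successes, n, z=1.96):
    """Adjusted-Wald (Agresti-Coull) interval for a proportion.

    Assumption: z=1.96 corresponds to a two-sided 95% confidence level.
    """
    # Add z^2/2 successes and z^2 trials before computing the proportion.
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Six of eight users did not read the sheet.
low, high = adjusted_wald_interval(6, 8)
print(f"{low:.2f} to {high:.2f}")  # prints "0.40 to 0.94"
```

Under these assumptions the lower bound is about 40%, which matches the figure in the example, and the interval is wide because only eight users were observed.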

There are three approaches to computing confidence intervals based on whether your data is binary, task-time or continuous.

Confidence interval around a binary measure

For the best overall average for small sample sizes, we have two recommendations for task-time and completion rates, and a more general recommendation for all sample sizes for rating scales.

We experimented with several estimators with small sample sizes and found the Laplace estimator and the simple proportion (referred to as the Maximum Likelihood Estimator) generally work well for the usability test data we examined. When you want the best estimate, the calculator will generate it based on our findings.
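To make the two estimators concrete, here is a minimal sketch. The Maximum Likelihood Estimator is the observed proportion x/n, and the Laplace estimator adds one success and one failure, giving (x + 1)/(n + 2). The function names are hypothetical, and the sketch does not reproduce the rule the calculator uses to pick between them.

```python
def mle_estimate(successes, n):
    """Simple proportion (Maximum Likelihood Estimator): x / n."""
    return successes / n

def laplace_estimate(successes, n):
    """Laplace estimator: (x + 1) / (n + 2)."""
    return (successes + 1) / (n + 2)

# Illustrative small-sample completion counts (successes, users tested).
for successes, n in [(5, 5), (6, 8), (0, 5)]:
    mle = mle_estimate(successes, n)
    laplace = laplace_estimate(successes, n)
    print(f"{successes}/{n}: MLE = {mle:.3f}, Laplace = {laplace:.3f}")
```

The two estimates differ most at the extremes: with 5 of 5 users succeeding, the simple proportion is 100%, while the Laplace estimate is about 86%, which is often a more plausible guess for the population completion rate.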

