10 Essentials of Measuring Usability
Jeff Sauro • March 8, 2016
Observing just a few users interact with a product or website can tell you a wealth of information about what's working and not working.
But to loosely quote Lord Kelvin, when we can measure something and express it in numbers, we understand and manage it better.
Measuring usability allows us to better understand how changes in usability affect customer satisfaction and loyalty.
Usability can and should be measured on mobile apps, enterprise accounting software, early stage prototypes, or mature websites.
While devices and users will differ, here are ten core concepts to understand when measuring usability that are likely to remain constant. We'll also cover these at the Rome and Denver UX Boot Camps.
1. There's no magic usability thermometer
While Lord Kelvin provides inspiration for measuring to better understand usability, unfortunately there is no usability thermometer to rely on. You can't measure usability directly; instead you measure the outcomes of good and bad experiences.
2. There is an international standard
Usability is formally defined in a few places, but most prominently in the ISO 9241 pt. 11 definition as a combination of effectiveness, efficiency, and satisfaction in the context of use.
3. Have real users attempt real tasks
There is some value in pulling a random person off the street to tell you what he thinks of a design. It's likely better than nothing if your interface is for general, walk-up use. However, usability is best measured by having people who will actually use the product, app, or website attempt to do tasks they would actually do.
4. Use multiple measures
The best way to measure usability is to use multiple metrics. The metrics should correspond to the ISO definition of usability. These are typically collected at the task and study level. At the task level, collect completion rates (effectiveness), time on task (efficiency), and perceived ease using a questionnaire like the SEQ (satisfaction). These can be combined into a Single Usability Measure (SUM), which is easier to report on. At the study level, use a questionnaire like the SUS, which has desirable psychometric properties.
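As a rough illustration of the task-level and study-level measures above, here is a minimal Python sketch. The sample data are hypothetical, the function names are made up for this example, and using the median for time on task is an assumption rather than something this article prescribes. The SUS scoring follows the standard published procedure (ten items answered 1 to 5, scaled to 0 to 100). It does not compute SUM, which needs a standardization step not described here.

```python
from statistics import mean, median


def completion_rate(outcomes):
    """Effectiveness: share of attempts that succeeded (1 = success, 0 = failure)."""
    return sum(outcomes) / len(outcomes)


def sus_score(responses):
    """Score one participant's 10 SUS responses (each 1-5) on the 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        if item_number % 2 == 1:
            total += response - 1
        else:
            total += 5 - response
    return total * 2.5


# Hypothetical results for one task (not real study data).
outcomes = [1, 1, 0, 1, 1, 0, 1, 1]                # task success per participant
times_sec = [42.0, 55.5, 38.2, 61.0, 47.3, 90.1]   # successful attempts only
seq_ratings = [6, 7, 3, 6, 5, 2, 6, 7]             # 7-point perceived-ease ratings

task_summary = {
    "completion_rate": completion_rate(outcomes),   # effectiveness
    "median_time_sec": median(times_sec),           # efficiency (median is an assumption)
    "mean_seq": mean(seq_ratings),                  # satisfaction
}
print(task_summary)
print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 1]))
```

Reporting the three task-level numbers side by side keeps effectiveness, efficiency, and satisfaction visible even if you later roll them up into a single score.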
5. Remember attitudes and actions
You can think of usability as a combination of attitudes (what people think about an interface) and actions (how people interact with an interface). Both are needed to effectively quantify usability.
6. Attitude does not always equal action
People are notoriously fickle, difficult to measure, and often contradict themselves. While we've found that measures of perceived ease and performance metrics correlate, we do see cases where preference does not equal performance. It's not uncommon to see participants perform worse on a design and yet prefer it when asked. When this happens, we tend to rely on performance because users are rarely tasked with picking among alternatives in real life.
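One way to check how closely attitudes track actions in your own data is to correlate a perceived-ease rating with a performance metric, and to flag cases where the preferred design is not the better-performing one. The sketch below is only an illustration: the numbers are hypothetical, and the helper names (`pearson_r`, `preference_matches_performance`) are invented for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (spread_x * spread_y)


def preference_matches_performance(completion_by_design, preferred_design):
    """True if the preferred design also has the highest completion rate."""
    best = max(completion_by_design, key=completion_by_design.get)
    return best == preferred_design


# Hypothetical per-participant data: perceived ease (1-7) and time on task (seconds).
ease = [7, 6, 5, 3, 2]
time_sec = [30, 35, 50, 80, 95]
print(round(pearson_r(ease, time_sec), 2))  # negative: easier ratings go with faster times

# Hypothetical comparison: participants did better on A but said they preferred B.
print(preference_matches_performance({"A": 0.85, "B": 0.70}, preferred_design="B"))
```

A mismatch flagged this way is a prompt to look closer at the data, not a verdict on either design.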
7. Initial judgments of usability are tenuous
While there is evidence that initial judgments about facial expressions, dating partners, and even students' teacher evaluations are accurate, we've found that judgments of a website's usability made in the first five seconds aren't as reliable. The first click, however, is reasonably predictive of success.
8. Surveys as retrospective measures
It's not always feasible to conduct a task-based usability study. Using well-calibrated surveys to measure attitudes about usability can provide at least half the equation. While they aren't terribly helpful in diagnosing problems, they are an inexpensive and efficient way to benchmark attitudes across many products and understand the key drivers of attitude.
9. Context matters
Usability metrics are not invariant. They will change depending on the tasks, users, and methods you use. Retrospective surveys tend to generate higher attitude metrics than those collected during a usability test. Use caution when comparing data from surveys or usability tests conducted with different contexts and users.
10. Experience matters
All things being equal, more experienced users tend to rate things more usable than inexperienced users. You should carefully collect and analyze the experience level of your participants and be sure it's not just prior experience that's driving your decisions.
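A simple first step in that analysis is to split your scores by experience level before comparing products or designs. The sketch below assumes each participant record is an (experience level, SUS score) pair. The labels and scores are hypothetical, and `mean_by_group` is a name invented for this example.

```python
from collections import defaultdict
from statistics import mean


def mean_by_group(records):
    """Average score per experience level from (level, score) pairs."""
    groups = defaultdict(list)
    for level, score in records:
        groups[level].append(score)
    return {level: mean(scores) for level, scores in groups.items()}


# Hypothetical SUS scores tagged with self-reported experience.
records = [
    ("new", 55.0), ("new", 62.5), ("new", 60.0),
    ("experienced", 77.5), ("experienced", 82.5), ("experienced", 80.0),
]

by_experience = mean_by_group(records)
for level, avg in by_experience.items():
    print(f"{level}: {avg:.1f}")
```

If two products differ in their mix of new and experienced participants, comparing within each experience level gives a fairer picture than comparing the overall averages.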