Jeff Sauro • May 30, 2012

There is no usability thermometer to tell you how easy a website or software application is to use. Instead, we rely on the outcomes of good and bad experiences, which provide evidence for the construct of usability.

Combining multiple usability metrics into a single usability metric (SUM) is something we proposed seven years ago [PDF] and wrote about in Chapter 9 of Quantifying the User Experience.

Here are 10 things to know about single measures of usability.

- Usability is the intersection of effectiveness, efficiency and satisfaction (ISO 9241 pt 11). One of the best measures of usability is a combination of metrics that describes each of these aspects.

- The most common usability metrics are completion rates and errors (effectiveness), task times (efficiency) and task-level satisfaction (satisfaction). These metrics tend to have a moderate correlation [PDF] with each other of r = .3 to .5. The correlation is strong enough to suggest an overlap (e.g., users who commit more errors tend to take longer), but it isn't strong enough that one metric can substitute for another.
- By averaging together standardized versions of completion rates, task times, task-level satisfaction and errors you generate a Single Usability Metric (SUM), which summarizes the majority of the information in all four measures. By averaging, you weight each metric equally. Despite many discussions about which metric "counts" more, our analysis found that a simple average is the least subjective and reflects the data best (from a principal components analysis [PDF]). Keep in mind that if you weight one metric heavily, you must lessen the weight of another, often to the point where the additional metric contributes little.
- You can have 3-metric or 4-metric versions of SUM. Errors are usually the most time-consuming and difficult metric to collect (especially in unmoderated testing), so completion rates, task times and task-level satisfaction provide the minimum description of effectiveness, efficiency and satisfaction for a single usability metric.
- A single usability metric doesn't replace the individual metrics; it simply summarizes them in a more condensed way like an abstract to a long paper or like the mean summarizes a large set of numbers. With any summarization comes data loss, but the gain in interpretability usually far outweighs the loss—especially considering you don't "lose" anything as you can always dive into the individual metrics (like you can read the details of a paper).
- There are a number of reasonable ways to combine usability metrics. One of the best ways we've found is to convert everything into a percentage. For discrete metrics (completion rates and errors) this is done by generating a proportion. For continuous metrics (time and satisfaction) we generate a normalized "z-score" and convert it to a percentage. We then average the percentages together.
- To convert discrete data so they are amenable to combining:
  - For completion rates: they are already in percentage form. An 80% completion rate stays as 80%.
  - For errors: convert the raw number of errors into a proportion by identifying the opportunities for errors, then subtract this proportion from 1 (so higher proportions are better). If 10 users commit 20 errors and there are 5 opportunities for an error per task, the error rate is 20/50 = .40. Subtracting this value from 1 reverses the error rate so higher percentages are better: 1 - .40 = .60, or 60%.
- To convert continuous data so they are amenable to combining:
  - For task-level satisfaction: if you are using a standardized task-level metric like the SEQ, you can use the percentile rank. If you have a 5-point or 7-point scale, common specification limits are 4 (for a 5-point scale) and 5 (for a 7-point scale). Subtract the specification limit from the mean score and divide by the standard deviation. For example, an average score of 5.6 on a 7-point scale with a standard deviation of 2 becomes (5.6-5)/2 = .3. The .3 is a z-score and gets converted into a percentage (the area under the normal curve up to .3 standard deviations), which is 61.8%.
  - For task times: identify how long a task should take (a specification limit), subtract it from the mean time of the successful task attempts and divide by the standard deviation. There's an art to determining [PDF] how long a task should take. For example, if the average time is 50 seconds with a standard deviation of 40 seconds and the spec time is 80 seconds, we get a z-score of (50-80)/40 = -.75. We convert the z-score to a percentage (the area under the normal curve up to -.75 standard deviations) and get .2266. We subtract this value from 1 (because we want times to be less than the spec limit), which gives 1 - .2266 = .7734, or 77.3%.

- A single usability metric is ideal for dashboards and for comparing competing products [PDF] and tasks when you need a single dependent variable to describe the complex construct of usability. Given the four example metrics shown above, we get a SUM of (80% + 60% + 61.8% + 77.3%)/4 = 69.8%.
- You can convert raw usability metrics into a SUM score by using the free downloadable Excel spreadsheet or the usability scorecard application.
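
The conversions described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the worked example, not the logic of the spreadsheet or scorecard application. The function names are hypothetical, and the sketch assumes a standard normal curve for turning z-scores into percentages. Because it keeps unrounded intermediate values, its results can differ from hand-rounded figures in the last digit.

```python
from statistics import NormalDist, mean

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def error_score(total_errors, users, opportunities_per_task):
    """Share of error opportunities that did not result in an error."""
    error_rate = total_errors / (users * opportunities_per_task)
    return 1 - error_rate


def satisfaction_score(mean_rating, sd, spec_limit):
    """Area under the normal curve up to the z-score of the mean rating."""
    z = (mean_rating - spec_limit) / sd
    return STANDARD_NORMAL.cdf(z)


def time_score(mean_time, sd, spec_time):
    """Area above the z-score, since times below the spec limit are better."""
    z = (mean_time - spec_time) / sd
    return 1 - STANDARD_NORMAL.cdf(z)


def sum_score(*components):
    """Equal-weight average of the standardized components."""
    return mean(components)


# Values taken from the worked example in the list above.
completion = 0.80                              # already a proportion
errors = error_score(20, 10, 5)                # 20 errors over 50 opportunities
satisfaction = satisfaction_score(5.6, 2, 5)   # 7-point scale, spec limit of 5
task_time = time_score(50, 40, 80)             # seconds, spec limit of 80

sum_value = sum_score(completion, errors, satisfaction, task_time)

for label, value in [("Completion", completion), ("Errors", errors),
                     ("Satisfaction", satisfaction), ("Time", task_time),
                     ("SUM", sum_value)]:
    print(f"{label}: {value:.1%}")
```

For the 3-metric version, call `sum_score` with only the completion, satisfaction and time components.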
