Usability, Customer Experience & Statistics

Using Tree-Testing To Test Information Architecture

Jeff Sauro • March 26, 2013

Tree-testing is a lesser-known UX method, but it can substantially help improve website navigation.

There are several software packages that allow you to conduct tree testing quickly and remotely, including solutions from UserZoom and OptimalWorkshop.

Like card sorting, the other popular method for testing IA, tree-testing will be covered at the Denver UX boot camp. Here are several questions to get you thinking about using the method, which I covered during a recent webinar.

When would you use a tree test?

Tree testing is sometimes referred to as reverse card sorting, since participants find items within an existing navigation structure (often called a taxonomy) rather than placing items into one. A tree test is like a usability test on the skeleton of your navigation with the design "skin" removed. It lets you isolate findability problems in your taxonomy, groups, or labels that are not attributable to design distractions or helpers.

Tree tests also remove search from the equation as a substantial portion of users will use search while looking for information on a website. While a great search engine and search results page are essential for helping findability, so is navigation. You'll want to isolate the causes of navigation problems and improve them so that when users browse, they find what they're looking for.
Tree tests are ideally run to:
  1. Set a baseline "findability" measure before changing the navigation. This will reveal what items, groups or labels could use improvement (and possibly a new card sort).

  2. Validate a change: Once you've made a change, or if you are considering a change in your IA, run the tree test again with the same (or largely the same) items you used in the baseline study. This helps tell you quantitatively if you've improved findability, kept it the same, or just introduced new problems.
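To decide quantitatively whether a change improved findability, you can compare the baseline and post-change found/didn't-find counts for an item with a two-proportion z-test. Below is a minimal sketch using the standard pooled test and a normal approximation; the counts are hypothetical, not from the article, and other variants (such as the N-1 two-proportion test) may be preferable at small sample sizes.

```python
from math import erf, sqrt

def two_prop_z(found_a, n_a, found_b, n_b):
    """Pooled two-proportion z-test for baseline vs. redesign findability.

    Returns (z, two-sided p-value) using the normal approximation."""
    p_a, p_b = found_a / n_a, found_b / n_b
    pooled = (found_a + found_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF via erf.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical counts: 45 of 93 users found the item at baseline,
# 65 of 93 found it after the taxonomy change.
z, p_value = two_prop_z(45, 93, 65, 93)
```

A statistically significant positive z suggests the redesign genuinely improved findability rather than the difference being sampling noise.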

Finally, we have found that tree testing, while similar to card sorting, does generate different findings. For example, we found that difficulty sorting an item only explained 16% of difficulty finding the item--an overlap but not redundancy.
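The 16% overlap figure is the kind of result you get by correlating item-level sorting difficulty with finding difficulty and squaring the correlation (r-squared is the proportion of variance explained). A minimal sketch, with hypothetical per-item ratings that are not the article's data:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-item mean difficulty ratings (e.g. 1-7 scale):
sort_difficulty = [2.1, 3.4, 4.0, 2.8, 5.1, 3.9, 4.6, 2.5]
find_difficulty = [2.4, 2.9, 4.8, 3.5, 4.2, 5.0, 4.1, 2.2]

r = pearson_r(sort_difficulty, find_difficulty)
r_squared = r * r  # share of finding difficulty "explained" by sorting difficulty
```

An r-squared well below 1 (such as the article's 16%) means the two methods surface substantially different problems, which is the argument for running both.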

How do you select which items to test in a tree test?

If you have a manageable navigation with a few dozen to a hundred items, you can include all of them in the study. For large websites with thousands of items, this can get unwieldy fast. You may find that paring it down to a few hundred items is sufficient if you eliminate some less used paths for the testing.

When it comes to selecting items for testing in the structure, we like to work with items that either cross departments, come from a top-task study, or are items that had problems in an open card sort.

How many participants do you suggest for a tree test?

The sample size question initially comes down to the outcome metric. Because a tree test is basically a mini-usability test, we can use the same metrics as in a usability test, along with the same procedure to identify sample sizes. In general, the key metric will be whether the user successfully located an item, which is a binary measure like task completion ("found/didn't find," coded as 1 and 0 respectively). 

The table below shows the sample size you will need to achieve 95% confidence around the findability rates. For example, at a sample size of 93, if 50% of the users locate an item, you'll be 95% confident that between 40% and 60% of all users would find the item given the same tree test. You would need to roughly quadruple your sample size (to 381) to cut your margin of error in half (to 5%).

Sample Size | Margin of Error (+/-)
       10   | 27%
       21   | 20%
       30   | 17%
       39   | 15%
       53   | 13%
       93   | 10%
      115   | 9%
      147   | 8%
      193   | 7%
      263   | 6%
      381   | 5%
      597   | 4%
     1064   | 3%
Table 1: Sample size for proportions used to assess findability in tree testing (95% confidence, assuming a percentage of 50%).
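For sample sizes not listed in Table 1, the margin of error around a proportion can be computed directly. The sketch below uses the adjusted-Wald approximation (adding z-squared "pseudo-observations"), which approximately reproduces the table's values at the 50% assumption; an exact binomial interval may differ slightly at very small n.

```python
from math import sqrt

Z = 1.96  # critical value for 95% confidence

def margin_of_error(n, p=0.5, z=Z):
    """Adjusted-Wald margin of error for a findability proportion.

    With p = 0.5 this approximately matches Table 1 (e.g. n=93 -> ~10%)."""
    n_adj = n + z * z                      # add z^2 pseudo-observations
    p_adj = (p * n + z * z / 2) / n_adj    # pull the estimate toward 0.5
    return z * sqrt(p_adj * (1 - p_adj) / n_adj)

# e.g. margin_of_error(93) is about 0.10, margin_of_error(381) about 0.05
```

Inverting this relationship (solving for n at a target margin) is how sample-size tables like Table 1 are typically generated.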

How many tree test tasks should each participant be asked to perform?

Again, this depends on the complexity of the navigation and the difficulty of the items. We typically see around 1-2 minutes per item. We ran one tree test with 14 items that took a median of 17 minutes, and another with 30 items that took 53 minutes. These times include users answering two post-item questions (confidence and difficulty).

Do you have any strategies for incorporating follow up survey questions with tree tests? How do these help to supplement the tree test results?

We ask participants the Single Ease Question (SEQ), a standardized measure of task difficulty. Because so much of task usability is simply finding the item, we find its percentile ranks offer a good guide to usability. The average score fluctuates between 4.8 and 5.1 across hundreds of tasks.

We also ask how confident users were, and then cross confidence with completion to identify item "disasters." The graph below shows the four-block for confidence and completion (correctness). You want as many items in the upper right as possible.

Figure 1: Crossing confidence with success rate (correct) provides an additional perspective on items users might think they are finding correctly but are not (called disasters).
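The four-block is just a quadrant classification of items by mean confidence and success rate. Here is a hypothetical sketch; the cutoffs and the labels for the three non-disaster quadrants are illustrative assumptions, not the article's terminology (only "disaster" comes from the article).

```python
def classify(confidence, success, conf_cut=5.0, success_cut=0.7):
    """Place an item in the confidence x success four-block.

    confidence: mean post-item confidence (e.g. 1-7 scale)
    success:    proportion of users who found the item (0-1)
    Cutoffs are illustrative; pick ones that suit your data."""
    high_conf = confidence >= conf_cut
    high_success = success >= success_cut
    if high_conf and high_success:
        return "target"         # upper right: confident and correct
    if high_conf and not high_success:
        return "disaster"       # confident but wrong
    if not high_conf and high_success:
        return "found, unsure"  # correct but lacking confidence
    return "known problem"      # unsure and wrong

# Hypothetical per-item results: (mean confidence, success rate)
items = {"Returns": (6.2, 0.85), "Gift cards": (5.8, 0.40), "Warranty": (3.1, 0.30)}
quadrants = {name: classify(c, s) for name, (c, s) in items.items()}
```

Disasters deserve the most attention: users leave the test believing they found the right item, so the navigation is actively misleading them rather than merely slowing them down.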

How do task success rates on a tree test compare with success rates on a live site?

This is a great open-ended research question that we are currently exploring. We've examined task-completion rates across dozens of usability studies and found the average completion rate is around 78%. We expect the tree test average to be lower than this for at least two reasons.
  1. There is no search to help users find the items.
  2. There are no design elements to help guide users or emphasize more popular choices and increase the "information scent."
When examining a much smaller sample of just 77 tree-test tasks from 200 users across three studies, we found the average completion rate was 66%. Consider this result tentative as we continue to collect more data.

However, we (along with Nate Colker from UserZoom) have started running some experiments in which we randomly assign users to find an item in either a tree or the live website. Preliminary results from two websites and tree tests (Target and Ikea) show the opposite of the pattern we expected: of the 20 tasks, 17 had higher task completion rates on the tree test! This would suggest that the design elements and possibly poor search results may actually hurt findability more than help it. 

Or, there could be a methodological difference in how we assess a successfully found item. Users are finding the right item, but we might not be giving them credit for the correct URL due to possible variations we haven't accounted for. More data is needed to confirm these findings. In the interim, it's always good practice to use the same method (tree test or live site) when you run the test again after making changes.
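One way to reduce false failures from URL variations is to normalize URLs before comparing a user's destination against the correct answer. The sketch below is a hypothetical normalizer (not how UserZoom or any particular tool actually scores success): it collapses scheme, a leading "www.", trailing slashes, query strings, and fragments so equivalent destinations match.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Collapse common URL variations so equivalent destinations compare equal.

    Drops scheme, query string, fragment, a leading 'www.', and any
    trailing slash. Illustrative only; real scoring rules will differ."""
    parts = urlsplit(url.lower())
    netloc = parts.netloc
    if netloc.startswith("www."):
        netloc = netloc[4:]
    return netloc + parts.path.rstrip("/")

# Two variants of the same (hypothetical) product page now score as a match:
a = normalize("https://www.example.com/products/item-1/")
b = normalize("http://example.com/products/item-1?ref=nav")
```

Whatever matching rule you adopt, apply it consistently across baseline and follow-up studies so success rates stay comparable.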

About Jeff Sauro

Jeff Sauro is the founding principal of MeasuringU, a company providing statistics and usability consulting to Fortune 1000 companies.
He is the author of over 20 journal articles and 5 books on statistics and the user experience.
More about Jeff...





Related Topics

Tree Testing, Card Sorting, Information Architecture

Posted Comments

There is 1 comment

February 27, 2014 | Sol wrote:

I have been running a dozen tree tests and I always have the same question: what is a good task success rate? For example, is an overall success rate of 70% a good number or not? At what percentile should the FAQ of an e-commerce site sit? 

