Jeff Sauro • October 1, 2005
Use this calculator to compute a confidence interval and best point estimate for an observed completion rate. It provides the adjusted Wald, Exact, Score and Wald intervals.
Adjusted Wald Method
The adjusted Wald interval (also called the modified Wald interval) provides the best coverage for the specified confidence level when samples are smaller than about 150. In other words, if you want a 95% confidence interval, this formula will produce an interval that contains the population proportion, on average, about 95 percent of the time. It uses the Wald formula but is "adjusted" in that it adds half of the squared Z-critical value to the numerator and the entire squared critical value to the denominator before computing the interval, i.e., (x+z^{2}/2)/(n+z^{2}). For example, a 95% confidence level uses the Z-critical value of 1.96, or approximately 2. If you observe 9 out of 10 users completing a task, this formula computes the proportion as (9 + 1.96^{2}/2) / (10 + 1.96^{2}), or approximately 11/14, and builds the interval using the Wald formula. Note: Prior to March 1st 2006, this calculator computed this interval by adding one z-value to the numerator and a squared z-value to the denominator.
Exact Method
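As a rough illustration of the adjusted Wald calculation described above, the 9-out-of-10 example can be worked through in Python. This is a minimal sketch, not the calculator's actual code: the function name adjusted_wald_interval is hypothetical, and clipping the endpoints to the range 0 to 1 is an assumption made here for simplicity.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_wald_interval(x, n, confidence=0.95):
    """Adjusted Wald interval for x successes out of n attempts.

    Hypothetical helper: returns (adjusted proportion, lower, upper).
    """
    # Two-sided Z-critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Add z^2/2 to the numerator and z^2 to the denominator.
    n_adj = n + z**2
    p_adj = (x + z**2 / 2) / n_adj

    # Ordinary Wald formula applied to the adjusted values.
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)

    # Assumption: clip endpoints that fall outside [0, 1].
    lower = max(0.0, p_adj - half_width)
    upper = min(1.0, p_adj + half_width)
    return p_adj, lower, upper


p_adj, lower, upper = adjusted_wald_interval(9, 10)
print(f"adjusted proportion: {p_adj:.3f}")
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
```

For 9 out of 10, the adjusted proportion is close to 11/14, and the unclipped upper endpoint falls slightly above 1, which is why some substitution is needed at the boundary.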
The Exact method was designed to guarantee at least 95% coverage, whereas the approximate methods (adjusted Wald and Score) provide an average coverage of 95% only in the long run. Use the Exact method when you need to be sure you are calculating a 95% or greater interval, erring on the conservative side. For example, at a population completion rate of 97.8%, both the Score and adjusted Wald methods had actual coverage that fell to 89%. When the risk of this level of actual coverage is unacceptable for an application, the Exact method provides the necessary assurance.
Score Method
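The text does not say how the Exact interval discussed above is constructed. A common construction for an exact binomial interval is the Clopper-Pearson method, and the sketch below assumes that is what is meant. The helper names (binom_cdf, solve_increasing, exact_interval) are hypothetical, and the endpoints are found by simple bisection rather than by any routine the calculator may use.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, iterations=100):
    """Bisection on [0, 1] for a function that rises from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(x, n, confidence=0.95):
    """Clopper-Pearson interval (assumed to be the 'Exact' method)."""
    tail = (1 - confidence) / 2

    # Lower endpoint: the p at which P(X >= x) equals the tail probability.
    if x == 0:
        lower = 0.0
    else:
        lower = solve_increasing(lambda p: (1 - binom_cdf(x - 1, n, p)) - tail)

    # Upper endpoint: the p at which P(X <= x) equals the tail probability.
    if x == n:
        upper = 1.0
    else:
        upper = solve_increasing(lambda p: tail - binom_cdf(x, n, p))

    return lower, upper


lower, upper = exact_interval(9, 10)
print(f"95% exact interval for 9/10: [{lower:.4f}, {upper:.4f}]")
```

Because each tail is held to at most half of the error rate for every possible population value, the resulting interval tends to be wider than the approximate ones, which is the conservative behavior the section describes.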
The Score method provides coverage better than the Exact and Wald methods but falls short of the adjusted Wald method. Its drawbacks are its computational difficulty and its poor coverage for some values when the population completion rate is around 98% or 2%, regardless of sample size (Agresti and Coull, 1998). The only advantage in using the Score method is that it provides more precise endpoints when the ends of the interval are close to 0 or 1. For some values (e.g., 9/10) the adjusted Wald's crude interval goes beyond 0 or 1 and a substitution of >.999 is used. For the Score method, the upper limit is .9975.
Wald Method
The Wald method should be avoided when calculating confidence intervals for completion rates with sample sizes less than 100. Its coverage is too far from the nominal level to provide a reliable estimate of the population completion rate. As the sample size increases above 100, all four methods converge to similar intervals. Use the Wald method as a point of reference or for larger sample sizes.
When All Users Pass or Fail
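For comparison, the plain Wald interval discussed above can be sketched in a few lines of Python. The function name wald_interval and the clipping of endpoints to the range 0 to 1 are assumptions made for illustration, not the calculator's actual code. The sketch also shows one reason the method struggles with small samples: when every user passes, the interval collapses to a single point.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, confidence=0.95):
    """Plain Wald interval for x successes out of n attempts (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = x / n  # unadjusted sample proportion
    half_width = z * sqrt(p * (1 - p) / n)
    # Assumption: clip endpoints that fall outside [0, 1].
    return max(0.0, p - half_width), min(1.0, p + half_width)


for x, n in [(9, 10), (10, 10), (90, 100)]:
    lower, upper = wald_interval(x, n)
    print(f"{x}/{n}: [{lower:.3f}, {upper:.3f}]")
```

With 10 out of 10, the standard error is zero, so the interval has no width at all even though the sample is tiny.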
With small sample sizes, it is common for all users in the sample to complete a task (100% completion rate) or for all to fail it (0% completion rate). In these scenarios, it is often unpalatable to report 100% or 0%. After all, how likely is it that the true population parameter is as extreme as 100% or 0%? The Best Estimate box provides the best point estimate under these conditions and uses the Laplace method for calculation. While this value may seem too far from the observed 100%, its attraction is that it is a function of the sample size: the greater the sample size, the closer this value will be to 100%.
Likely Population Completion Rate
The two options in this drop-down:
Point Estimates
Whereas a confidence interval describes a likely range of values, a point estimate describes a single value: a point that serves as an estimate of an unknown parameter in the population. It is extremely unlikely that the sample point estimate is exactly the same as the unknown population completion rate. For that reason, you should always compute a confidence interval when reporting a completion rate. It is much more informative than a point estimate, since it provides a reasonably likely boundary for the population completion rate.
MLE (Maximum Likelihood Estimate) (x / n)
The MLE is the sample proportion: the number of users succeeding divided by the total attempting. It is the most common point estimate reported.
Laplace (x+1)/(n+2)
A famous large-sample problem comes from the seminal work of Laplace in the early 1800s. He posed the question of how certain you can be that the sun will rise tomorrow, given that you know it has risen every day for the past 5000 years (1,825,000 days). You can be pretty sure that it will rise, but you can't be absolutely sure. The sun might explode, or a large asteroid might smash the Earth into pieces. In response to this question, he proposed the Laplace Law of Succession, which is to add one to the numerator and two to the denominator ((x+1)/(n+2)). Applying this procedure, you'd be 99.999945% sure that the sun will rise tomorrow: close to 100%, but slightly backed away from that extreme. The magnitude of the adjustment is greater when sample sizes are small. For example, if you observe two out of two successes and apply the Laplace procedure, then your estimate of p is 75% (x+1=3, n+2=4, p=3/4) rather than 100%. If you had observed two failures, then your estimate of p is 25% (x+1=1, n+2=4, p=1/4) rather than 0%. Laplace is in essence saying that the next result is a toss-up, so give each alternative an equally likely chance of occurring.
Wilson (x+z^{2}/2)/(n+z^{2})
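The Laplace arithmetic described above is easy to check directly. The following is a small sketch using exact fractions; the function name laplace_estimate is hypothetical, and the 365-day year simply mirrors the 1,825,000-day figure quoted in the text.

```python
from fractions import Fraction


def laplace_estimate(x, n):
    """Laplace Law of Succession: (x + 1) / (n + 2), kept as an exact fraction."""
    return Fraction(x + 1, n + 2)


# Small-sample cases from the text.
print(laplace_estimate(2, 2))  # two successes out of two
print(laplace_estimate(0, 2))  # two failures out of two

# Sunrise example: 5000 years of 365 days, all successes.
days = 5000 * 365
sunrise = laplace_estimate(days, days)
print(f"{float(sunrise) * 100:.6f}%")
```

The adjustment shrinks as n grows: with two observations the estimate moves a full 25 percentage points away from the observed rate, while with 1,825,000 observations it moves by less than a ten-thousandth of a percentage point.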
Wilson's point estimate is the midpoint of the adjusted Wald interval. It is derived by adding half of the squared critical value to the numerator and the squared critical value to the denominator. Wilson's is the more conservative approach.
Jeffreys (x+.5)/(n+1)
Jeffreys (1961) provided a compromise between the Laplace and MLE methods. See the reference for technical details.
Best Estimate
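To see how the four point estimates described above relate to one another, they can be computed side by side. This is only a sketch under stated assumptions: the function name point_estimates and the dictionary keys are hypothetical, and the formulas are taken directly from the section headings.

```python
from statistics import NormalDist


def point_estimates(x, n, confidence=0.95):
    """Return the four point estimates for x successes out of n attempts."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return {
        "MLE": x / n,                                # x / n
        "Laplace": (x + 1) / (n + 2),                # (x+1)/(n+2)
        "Jeffreys": (x + 0.5) / (n + 1),             # (x+.5)/(n+1)
        "Wilson": (x + z**2 / 2) / (n + z**2),       # (x+z^2/2)/(n+z^2)
    }


# Example: all 10 of 10 users complete the task.
estimates = point_estimates(10, 10)
for name, value in estimates.items():
    print(f"{name:8s} {value:.4f}")
```

For a perfect 10 out of 10, the Jeffreys value sits between the Laplace value and the MLE, and the Wilson value is pulled furthest from 100%, which matches the descriptions of Jeffreys as a compromise and Wilson as the more conservative choice.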
The best point estimate is calculated using the following logic:
If "Unknown" is selected from the Likely Population Completion Rate drop-down, the Laplace method is used. The smaller your sample size and the farther your initial estimate of p is from .5, the greater the benefit over the MLE.
If "Between .5 and 1" is selected from the Likely Population Completion Rate drop-down and the observed completion rate is:
References