Suppose you’re building a widget that performs some simple action, which ends in either success or failure. You decide it needs to succeed 75% of the time before you’re willing to release it. You run ten tests, and see that it succeeds exactly eight times. So you ask yourself, is that really good enough? Do you believe the test results? If you ran the test one more time, and it failed, you would have only a 72.7% success rate, after all.
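The arithmetic behind that worry is easy to check: one extra failed run drops the observed rate from 8/10 to 8/11.

```python
successes, tests = 8, 10
print(successes / tests)            # observed success rate: 0.8

# One more run, and it fails:
print(successes / (tests + 1))      # drops to roughly 0.727
```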

So when do you have enough data, and how do you make the decision that your success rate is ‘good enough?’ In this post, we’ll look at how the Beta distribution helps us answer this question. First, we’ll get some intuition for the Beta distribution, and then discuss why it’s the right distribution for the problem.

## The Basic Problem

Consider the widget’s tests as independent boolean variables, governed by some hidden parameter **μ**, so that the test succeeds with probability **μ**. Our job, then, is to estimate this parameter: We want a model for P(**μ**=x | **s**, **f** ), the probability distribution of **μ**, conditioned on our observations of success and failure. This is a continuous probability distribution, with **μ** a number between zero and one. (This general setup, by the by, is called ‘*parameter estimation*’ in the stats literature, as we’re trying to estimate the parameters of a well-known distribution.)
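As a preview of where this is headed: under the standard assumption of a flat prior on **μ**, the posterior P(**μ**=x | **s**, **f**) works out to a Beta(s+1, f+1) distribution. A minimal sketch of its density, using only the math module (the variable names here are illustrative):

```python
from math import gamma

def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * x**(a - 1) * (1 - x)**(b - 1)

# With s successes and f failures, and a flat prior on mu,
# the posterior over mu is Beta(s + 1, f + 1).
s, f = 8, 2
a, b = s + 1, f + 1

posterior_mean = a / (a + b)   # (s + 1) / (s + f + 2)
print(posterior_mean)          # 0.75 for 8 successes and 2 failures
```

Notice that with 8 successes and 2 failures the posterior mean lands exactly on the 75% threshold, which is exactly why the decision feels so borderline.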