Yesterday I was trying to brush up my skills in probability and came upon this formula on the Wikipedia page about variance:

Var\left(\sum_{i=1}^{N} X_i\right) = \sum_{i=1}^{N} Var(X_i)
The article calls this the Bienaymé formula and gives neither proof nor a link to one. Googling this formula proved equally fruitless in terms of proofs.
So, I set out to find why this works. It took me a few hours of digging through books and dusting off the probability I learned at university 8 years ago, but I finally made it. Here's how.
Note: the Wikipedia article states the Bienaymé formula for uncorrelated variables. Here I'll prove the case of independent variables, which is a more useful and frequently used application of the formula. I'm also proving it for discrete random variables - the continuous case is analogous.
Expected value and variance
We'll start with a few definitions. Formally, the expected value of a (discrete) random variable X is defined by:
E[X] = \sum_x x p_X(x)

where p_X is the PMF of X, defined by p_X(x) = P(X = x). For a function g(X):

E[g(X)] = \sum_x g(x) p_X(x)

The variance of X is defined in terms of the expected value as:

Var(X) = E\left[(X - E[X])^2\right]
From this we can also obtain:

Var(X) = E[X^2] - (E[X])^2
This form is more convenient to use in some calculations.
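Before moving on, here's a quick numeric sanity check of these definitions in Python. The example is my own (a fair six-sided die, p_X(x) = 1/6 for x = 1..6) and isn't part of the proof:

```python
# Sanity check of E[X], E[g(X)] and the two variance formulas on a fair die.
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # p_X(x) = 1/6 for x = 1..6

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * p_X(x)."""
    return sum(g(x) * p for x, p in pmf.items())

E_X = expectation(pmf)                                   # E[X] = 7/2
var_def = expectation(pmf, lambda x: (x - E_X) ** 2)     # E[(X - E[X])^2]
var_alt = expectation(pmf, lambda x: x ** 2) - E_X ** 2  # E[X^2] - (E[X])^2

print(E_X, var_def, var_alt)  # 7/2 35/12 35/12 -- both variance forms agree
```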
Linear function of a random variable

From the definitions given above it can easily be shown that, given a linear function of a random variable Y = aX + b, the expected value and variance of Y are:

E[Y] = a E[X] + b

Var(Y) = a^2 Var(X)
For the expected value, we can make a stronger claim for any g(X):

E[a g(X) + b] = a E[g(X)] + b
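The same kind of numeric check works for the linear-transformation rules. Again this is my own toy example; a = 3 and b = 2 are arbitrary:

```python
# Check E[aX + b] = aE[X] + b and Var(aX + b) = a^2 Var(X) on the die PMF.
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def E(g):
    """E[g(X)] over the die PMF."""
    return sum(g(x) * p for x, p in pmf.items())

a, b = 3, 2
E_X = E(lambda x: x)
var_X = E(lambda x: (x - E_X) ** 2)

E_Y = E(lambda x: a * x + b)                 # E[aX + b]
var_Y = E(lambda x: (a * x + b - E_Y) ** 2)  # Var(aX + b)

print(E_Y == a * E_X + b)       # True: E[aX + b] = aE[X] + b
print(var_Y == a ** 2 * var_X)  # True: Var(aX + b) = a^2 Var(X)
```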
Multiple random variables
When multiple random variables are involved, things start getting a bit more complicated. I'll focus on two random variables here, but this is easily extensible to N variables. Given two random variables that participate in an experiment, their joint PMF is:

p_{X,Y}(x, y) = P(X = x, Y = y)
The joint PMF determines the probability of any event that can be specified in terms of the random variables X and Y. For example, if A is the set of all pairs (x, y) that have a certain property, then:

P((X, Y) \in A) = \sum_{(x,y) \in A} p_{X,Y}(x, y)
Note that from this PMF we can infer the PMF for a single variable (the marginal PMF), like this:

p_X(x) = \sum_y p_{X,Y}(x, y)     (1)

and similarly p_Y(y) = \sum_x p_{X,Y}(x, y).
The expected value for functions of two variables naturally extends and takes the form:

E[g(X, Y)] = \sum_x \sum_y g(x, y) p_{X,Y}(x, y)
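Here's a small Python illustration of a joint PMF, its marginals as in (1), and the two-variable expected value. The joint table is invented just for the example, and X and Y are deliberately not independent:

```python
# A small joint-PMF illustration: p_{X,Y}(x, y) given as a table.
from fractions import Fraction

F = Fraction
joint = {(0, 0): F(1, 4), (0, 1): F(1, 2),
         (1, 0): F(1, 8), (1, 1): F(1, 8)}   # p_{X,Y}(x, y)
assert sum(joint.values()) == 1

# Marginal PMFs, as in equation (1): sum the joint PMF over the other variable.
p_X, p_Y = {}, {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, 0) + p
    p_Y[y] = p_Y.get(y, 0) + p

def E(g):
    """E[g(X, Y)] = sum over x, y of g(x, y) * p_{X,Y}(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

print(p_X, p_Y)                # marginals recovered from the joint PMF
print(E(lambda x, y: x + y))   # E[g(X, Y)] with g(x, y) = x + y
```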
Sum of random variables
Let's see how the sum of random variables behaves. From the previous formula:

E[X + Y] = \sum_x \sum_y (x + y) p_{X,Y}(x, y) = \sum_x x \sum_y p_{X,Y}(x, y) + \sum_y y \sum_x p_{X,Y}(x, y)
But recall equation (1): the two inner sums are exactly the marginal PMFs of X and Y, so the above simply equals:

E[X + Y] = \sum_x x p_X(x) + \sum_y y p_Y(y) = E[X] + E[Y]     (2)
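Note that no independence was needed to get (2). A quick check on the dependent joint PMF from the previous sketch confirms this:

```python
# E[X + Y] = E[X] + E[Y] on the *dependent* joint PMF used above --
# the identity holds with no independence assumption.
from fractions import Fraction

F = Fraction
joint = {(0, 0): F(1, 4), (0, 1): F(1, 2),
         (1, 0): F(1, 8), (1, 1): F(1, 8)}

E_sum = sum((x + y) * p for (x, y), p in joint.items())  # E[X + Y]
E_X = sum(x * p for (x, y), p in joint.items())          # E[X], via (1)
E_Y = sum(y * p for (x, y), p in joint.items())          # E[Y], via (1)

print(E_sum == E_X + E_Y)  # True, even though X and Y here are dependent
```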
We'll also want to prove that E[XY] = E[X]E[Y]. This is only true for independent X and Y, so we'll have to make this assumption (X and Y being independent means that p_{X,Y}(x, y) = p_X(x) p_Y(y)).
By independence:

E[XY] = \sum_x \sum_y xy \, p_{X,Y}(x, y) = \sum_x \sum_y xy \, p_X(x) p_Y(y) = \sum_x x p_X(x) \sum_y y p_Y(y) = E[X] E[Y]
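A quick numeric illustration with toy PMFs of my own: the product rule holds for an independent pair and generally fails for a dependent one:

```python
# E[XY] vs. E[X]E[Y]: equal under independence, not in general.
from fractions import Fraction

F = Fraction

def stats(joint):
    """Return E[XY], E[X], E[Y] computed from a joint PMF table."""
    E_XY = sum(x * y * p for (x, y), p in joint.items())
    E_X = sum(x * p for (x, y), p in joint.items())
    E_Y = sum(y * p for (x, y), p in joint.items())
    return E_XY, E_X, E_Y

# Independent case: p_{X,Y}(x, y) = p_X(x) * p_Y(y).
indep = {(x, y): F(1, 4) for x in (1, 2) for y in (1, 2)}
E_XY, E_X, E_Y = stats(indep)
print(E_XY == E_X * E_Y)   # True

# Dependent case (the table used earlier): the equality breaks down.
dep = {(0, 0): F(1, 4), (0, 1): F(1, 2), (1, 0): F(1, 8), (1, 1): F(1, 8)}
E_XY, E_X, E_Y = stats(dep)
print(E_XY == E_X * E_Y)   # False
```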
A very similar proof shows that for independent X and Y:

E[g(X) h(Y)] = E[g(X)] E[h(Y)]

for any functions g and h (because if X and Y are independent, so are g(X) and h(Y)). Now, at last, we're ready to tackle the variance of X + Y. We start by expanding the definition of variance:

Var(X + Y) = E\left[(X + Y - E[X + Y])^2\right]
By (2), E[X + Y] = E[X] + E[Y], so:

Var(X + Y) = E\left[(X - E[X] + Y - E[Y])^2\right]
           = E\left[(X - E[X])^2\right] + E\left[(Y - E[Y])^2\right] + 2 E\left[(X - E[X])(Y - E[Y])\right]
           = Var(X) + Var(Y) + 2 E\left[(X - E[X])(Y - E[Y])\right]

Now, note that the random variables X - E[X] and Y - E[Y] are independent, so by the result above (with g(X) = X - E[X] and h(Y) = Y - E[Y]):

E\left[(X - E[X])(Y - E[Y])\right] = E[X - E[X]] \, E[Y - E[Y]]
But using (2) again, E[X - E[X]] is obviously just E[X] - E[X] = 0 (and likewise for Y), therefore the product above reduces to 0. So, coming back to the long expression for the variance of sums, the last term is 0, and we have:

Var(X + Y) = Var(X) + Var(Y)
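As a sanity check, here's the two-variable case computed exactly for two independent fair dice, using the product-form joint PMF that independence gives us (again a toy example of my own):

```python
# Exact check of Var(X + Y) = Var(X) + Var(Y) for two independent fair dice.
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}
joint = {(x, y): pmf[x] * pmf[y] for x in pmf for y in pmf}  # p_X(x) * p_Y(y)

def var(value_prob_pairs):
    """Variance of a discrete distribution given as (value, probability) pairs."""
    pairs = list(value_prob_pairs)
    mean = sum(v * p for v, p in pairs)
    return sum((v - mean) ** 2 * p for v, p in pairs)

var_X = var(pmf.items())                                    # 35/12
var_Y = var(pmf.items())                                    # second die: also 35/12
var_sum = var((x + y, p) for (x, y), p in joint.items())    # variance of the sum

print(var_sum == var_X + var_Y)  # True: 35/6 = 35/12 + 35/12
```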
As I've mentioned before, proving this for the sum of two variables suffices, because the proof for N variables is a simple mathematical extension, and can be intuitively understood by means of a "mental induction". Therefore:
Var\left(\sum_{i=1}^{N} X_i\right) = \sum_{i=1}^{N} Var(X_i)

for N independent variables X_1, X_2, \dots, X_N.
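Finally, a Monte Carlo sketch of the N-variable statement; the choice of N, the die distribution, and the number of trials are arbitrary choices of mine:

```python
# Monte Carlo check: the empirical variance of a sum of N independent fair-die
# rolls should be close to N * 35/12, the sum of the individual variances.
import random

random.seed(0)
N, trials = 5, 200_000

sums = [sum(random.randint(1, 6) for _ in range(N)) for _ in range(trials)]
mean = sum(sums) / trials
var_of_sum = sum((s - mean) ** 2 for s in sums) / trials

print(var_of_sum)   # should come out close to...
print(N * 35 / 12)  # ...14.5833, the sum of the N individual variances
```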