Variance of the sum of independent random variables
January 7th, 2009 at 10:06 pm

Yesterday I was trying to brush up my skills in probability and came upon this formula on the Wikipedia page about variance:

$$var\left(\sum_{i=1}^{N}X_{i}\right)=\sum_{i=1}^{N}var(X_{i})$$
The article calls this the Bienaymé formula and gives neither proof nor a link to one. Googling this formula proved equally fruitless in terms of proofs.
So, I set out to find why this works. It took me a few hours of digging through books and dusting off the probability I learned at university 8 years ago, but I finally made it. Here’s how.
Note: the Wikipedia article states the Bienaymé formula for uncorrelated variables. Here I’ll prove the case of independent variables, which is a more useful and frequently used application of the formula. I’m also proving it for discrete random variables - the continuous case is analogous, with integrals in place of sums.
Expected value and variance
We’ll start with a few definitions.
Formally, the expected value of a (discrete) random variable X is defined by:
![E[X]=\sum_{x}^{}{xp_{X}(x)} E[X]=\sum_{x}^{}{xp_{X}(x)}](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_d0e8d1af0ec471829318c527c6403571.png)
Where $p_{X}(x)$ is the PMF of X, $p_{X}(x)=P(X=x)$. For a function $g(X)$:
![E[g(X)]=\sum_{x}^{}{g(x)p_{X}(x)} E[g(X)]=\sum_{x}^{}{g(x)p_{X}(x)}](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_5fa18cb752ef2e5561f90e2a49ae6e6b.png)
The variance of X is defined in terms of the expected value as:
![var(X)=E[(X-E[X])^{2}] var(X)=E[(X-E[X])^{2}]](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_1d8977de727e00b8a161d5d01922837a.png)
From this we can also obtain:
![var(X)=\sum_{x}^{}{(x-E[X])^{2}p_{X}(x)} var(X)=\sum_{x}^{}{(x-E[X])^{2}p_{X}(x)}](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_e46b4da0fbbdfb5ad47e17f850da9d01.png)
$$=\sum_{x}{(x^{2}-2xE[X]+(E[X])^{2})p_{X}(x)}$$
![=\sum_{x}{x^{2}p_{X}(x)-2E[X]\sum_{x}xp_{X}(x)}+(E[X])^{2}\sum_{x}p_{X}(x) =\sum_{x}{x^{2}p_{X}(x)-2E[X]\sum_{x}xp_{X}(x)}+(E[X])^{2}\sum_{x}p_{X}(x)](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_7e4b27f42663cc66be6eca54b0cd4219.png)
![=E[X^{2}]-2(E[X])^{2}+(E[X])^{2} =E[X^{2}]-2(E[X])^{2}+(E[X])^{2}](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_8006b62c2bb1f4c9fbdf68b77621fcbb.png)
![=E[X^{2}]-(E[X])^{2} =E[X^{2}]-(E[X])^{2}](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_c38140843731776a0f0ff773337bb0e7.png)
Which is more convenient to use in some calculations.
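Both forms of the variance can be checked numerically. The following is a minimal sketch, not part of the original derivation: the PMF is stored as a plain dictionary, the function names are made up for illustration, and the fair die is just a convenient example. Exact fractions are used so the two results can be compared without rounding.

```python
from fractions import Fraction


def expected_value(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * p_X(x); g defaults to the identity."""
    return sum(g(x) * p for x, p in pmf.items())


def variance_by_definition(pmf):
    """var(X) = E[(X - E[X])^2]."""
    mean = expected_value(pmf)
    return expected_value(pmf, lambda x: (x - mean) ** 2)


def variance_by_shortcut(pmf):
    """var(X) = E[X^2] - (E[X])^2."""
    return expected_value(pmf, lambda x: x ** 2) - expected_value(pmf) ** 2


# Illustrative PMF (an assumption for this example): a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(die))          # 7/2
print(variance_by_definition(die))  # 35/12
print(variance_by_shortcut(die))    # 35/12
```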
Linear function of a random variable
From the definitions given above it can be easily shown that given a linear function of a random variable: $Y=aX+b$, the expected value and variance of Y are:
![E[Y]=aE[X]+b E[Y]=aE[X]+b](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_a8e413367a5f25e0263b00aa89268cca.png)
$$var(Y)=a^{2}var(X)$$
For the expected value, we can make a stronger claim for any g(x):
$$E[ag(X)+b]=aE[g(X)]+b$$
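For completeness, here is a short derivation of the variance of a linear function $Y=aX+b$. It uses only the definition of variance and the linearity of the expected value stated above: the constant b cancels and the factor a comes out squared.

```latex
\begin{aligned}
var(aX+b) &= E[(aX+b-E[aX+b])^{2}] \\
          &= E[(aX+b-aE[X]-b)^{2}] \\
          &= E[a^{2}(X-E[X])^{2}] \\
          &= a^{2}E[(X-E[X])^{2}] \\
          &= a^{2}var(X)
\end{aligned}
```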
Multiple random variables
When multiple random variables are involved, things start getting a bit more complicated. I’ll focus on two random variables here, but this is easily extensible to N variables. Given two random variables that participate in an experiment, their joint PMF is:

$$p_{X,Y}(x,y)=P(X=x,Y=y)$$
The joint PMF determines the probability of any event that can be specified in terms of the random variables X and Y. For example if A is the set of all pairs $(x,y)$ that have a certain property, then:

$$P((X,Y)\in A)=\sum_{(x,y)\in A}{p_{X,Y}(x,y)}$$
Note that from this PMF we can infer the PMF for a single variable, like this:

$$p_{X}(x)=\sum_{y}{p_{X,Y}(x,y)}\qquad (1)$$

$$p_{Y}(y)=\sum_{x}{p_{X,Y}(x,y)}$$
The expected value for functions of two variables naturally extends and takes the form:
![E[g(X,Y)]=\sum_{x}\sum_{y}{g(x,y)p_{X,Y}(x,y)} E[g(X,Y)]=\sum_{x}\sum_{y}{g(x,y)p_{X,Y}(x,y)}](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_5e78038de6258816237361263d64a47a.png)
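These definitions translate almost directly into code. The sketch below is only an illustration: the joint PMF is a small made-up table keyed by `(x, y)` pairs, and the helper names `marginal`, `probability` and `expected_value_joint` are hypothetical, chosen for this example.

```python
from collections import defaultdict
from fractions import Fraction

# Made-up joint PMF p_{X,Y}(x, y), keyed by (x, y). Pairs not listed have probability 0.
joint = {
    (0, 0): Fraction(1, 2),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}


def marginal(joint_pmf, axis):
    """Sum the joint PMF over the other variable (axis 0 gives p_X, axis 1 gives p_Y)."""
    result = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for pair, p in joint_pmf.items():
        result[pair[axis]] += p
    return dict(result)


def probability(joint_pmf, in_event):
    """P((X, Y) in A), where A is described by the predicate in_event(x, y)."""
    return sum(p for (x, y), p in joint_pmf.items() if in_event(x, y))


def expected_value_joint(joint_pmf, g):
    """E[g(X, Y)] = sum over x and y of g(x, y) * p_{X,Y}(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())


p_x = marginal(joint, 0)
p_y = marginal(joint, 1)

print(p_x)                                               # marginal PMF of X
print(p_y)                                               # marginal PMF of Y
print(probability(joint, lambda x, y: x == y))           # P(X = Y)
print(expected_value_joint(joint, lambda x, y: x + y))   # E[X + Y]
```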
Sum of random variables
Let’s see how the sum of random variables behaves. From the previous formula:

$$E[X+Y]=\sum_{x}\sum_{y}{(x+y)p_{X,Y}(x,y)}$$

$$=\sum_{x}\sum_{y}{xp_{X,Y}(x,y)}+\sum_{x}\sum_{y}{yp_{X,Y}(x,y)}$$

$$=\sum_{x}{x\sum_{y}{p_{X,Y}(x,y)}}+\sum_{y}{y\sum_{x}{p_{X,Y}(x,y)}}$$

But recall equation (1). The above simply equals:

$$=\sum_{x}{xp_{X}(x)}+\sum_{y}{yp_{Y}(y)}$$
![=E[X]+E[Y]\qquad (2) =E[X]+E[Y]\qquad (2)](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_8c72fd7dbf906788ba3ea8ea29ec70e5.png)
We’ll also want to prove that $E[XY]=E[X]E[Y]$. This is not true in general, but it does hold for independent X and Y, so we’ll have to make this assumption (assuming that they’re independent means that $p_{X,Y}(x,y)=p_{X}(x)p_{Y}(y)$ for all x and y).
![E[XY]=\sum_{x}\sum_{y}{xyp_{X,Y}(x,y)} E[XY]=\sum_{x}\sum_{y}{xyp_{X,Y}(x,y)}](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_d7092d226f125b4d84ffd61b72aab071.png)
By independence:
![E[XY]=\sum_{x}\sum_{y}{xyp_{X}(x)p_{Y}(y)} E[XY]=\sum_{x}\sum_{y}{xyp_{X}(x)p_{Y}(y)}](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_8fb0e3682b1a3068db44dbd8210ef7a4.png)

$$=\sum_{x}{xp_{X}(x)}\sum_{y}{yp_{Y}(y)}$$

![=E[X]E[Y]\qquad (3) =E[X]E[Y]\qquad (3)](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_f315a77543b4c720afab74d240a74f11.png)
A very similar proof can show that for independent X and Y:
![E[g(X)h(Y)]=E[g(X)]E[h(Y)] E[g(X)h(Y)]=E[g(X)]E[h(Y)]](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_57a1d091f8b75fbdce0a8345ac7201e9.png)
For any functions g and h (because if X and Y are independent, so are g(X) and h(Y)).
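The difference between the sum rule and the product rule is easy to see on a small example. This is a sketch with made-up numbers: two joint PMFs share the same marginals, one built as a product of the marginals (independent) and one not. The helper names `expect` and `check` are hypothetical. The sum rule holds in both cases, while the product rule holds only in the independent one.

```python
from fractions import Fraction
from itertools import product


def expect(joint_pmf, g):
    """E[g(X, Y)] for a joint PMF stored as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())


# Made-up marginal PMFs for X and Y.
p_x = {0: Fraction(3, 4), 1: Fraction(1, 4)}
p_y = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Independent case: the joint PMF is the product of the marginals.
independent = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}

# Dependent case with the same marginals: X = 1 forces Y = 1.
dependent = {
    (0, 0): Fraction(1, 2),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}


def check(joint_pmf):
    """Return (sum rule holds, product rule holds) for the given joint PMF."""
    e_x = expect(joint_pmf, lambda x, y: x)
    e_y = expect(joint_pmf, lambda x, y: y)
    e_sum = expect(joint_pmf, lambda x, y: x + y)
    e_prod = expect(joint_pmf, lambda x, y: x * y)
    return e_sum == e_x + e_y, e_prod == e_x * e_y


print(check(independent))  # (True, True)
print(check(dependent))    # (True, False)
```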
Now, at last, we’re ready to tackle the variance of X + Y. We start by expanding the definition of variance:
![var(X+Y)=E[(X+Y-E[X+Y])^{2}] var(X+Y)=E[(X+Y-E[X+Y])^{2}]](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_dd78a207caff724c2bb9bb8ecb48faef.png)
By (2):
![=E[(X+Y-E[X]-E[Y])^{2}] =E[(X+Y-E[X]-E[Y])^{2}]](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_cff95a5b3f4e24bb5d1fe3f360bb9152.png)
![=E[((X-E[X]) + (Y - E[Y]))^{2}] =E[((X-E[X]) + (Y - E[Y]))^{2}]](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_b902e4b094a850965d6dc3e8f2980874.png)
$$=E[(X-E[X])^{2}]+E[(Y-E[Y])^{2}]$$
![+2E[(X-E[X])(Y-E[Y])] +2E[(X-E[X])(Y-E[Y])]](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_2b86c6c44226633e07c67ad109db0b66.png)
Now, note that the random variables $X-E[X]$ and $Y-E[Y]$ are independent, so:
![E[(X-E[X])(Y-E[Y])]=E[(X-E[X])]E[(Y-E[Y])] E[(X-E[X])(Y-E[Y])]=E[(X-E[X])]E[(Y-E[Y])]](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_a99ea0e7262d89ca6f63ed7198cf448b.png)
But using (2) again:
![E[X-E[X]]=E[X]-E[E[X]] E[X-E[X]]=E[X]-E[E[X]]](http://eli.thegreenplace.net/wp-content/plugins/easy-latex/cache/tex_dbbba660d82aa61a31a29562d5df30fe.png)
$E[E[X]]$ is obviously just $E[X]$, since $E[X]$ is a constant, therefore the above reduces to 0.
So, coming back to the long expression for the variance of sums, the last term is 0, and we have:
$$var(X+Y)=E[(X-E[X])^{2}]+E[(Y-E[Y])^{2}]$$

$$=var(X)+var(Y)$$
As I’ve mentioned before, proving this for the sum of two variables suffices, because the proof for N variables is a simple mathematical extension, and can be intuitively understood by means of a “mental induction”.
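The induction step can be spelled out as follows. This is a sketch under the assumption that the variables are mutually independent, so that a partial sum is independent of the next variable. The symbol $S_{k}$ for the partial sum is introduced here only for this argument.

```latex
\begin{aligned}
S_{k} &= X_{1}+\cdots+X_{k} \\
var(S_{k+1}) &= var(S_{k}+X_{k+1}) \\
  &= var(S_{k})+var(X_{k+1}) && \text{($S_{k}$ and $X_{k+1}$ are independent)} \\
  &= \sum_{i=1}^{k}var(X_{i})+var(X_{k+1}) && \text{(induction hypothesis)} \\
  &= \sum_{i=1}^{k+1}var(X_{i})
\end{aligned}
```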
Therefore:

$$var\left(\sum_{i=1}^{N}X_{i}\right)=\sum_{i=1}^{N}var(X_{i})$$

For N independent variables $X_{1},\ldots,X_{N}$.
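As a sanity check, the formula can be verified exactly on a small example. The sketch below assumes three independent variables with made-up PMFs (a die, a coin and an arbitrary three-valued variable) and builds the PMF of their sum by repeated convolution, which is valid only because the variables are independent. The helper names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def mean(pmf):
    """E[X] for a PMF stored as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """var(X) = E[(X - E[X])^2]."""
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())


def pmf_of_sum(pmf_a, pmf_b):
    """PMF of A + B for independent A and B (a discrete convolution)."""
    result = defaultdict(Fraction)
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            result[a + b] += pa * pb  # independence: P(A=a, B=b) = P(A=a) * P(B=b)
    return dict(result)


# Made-up independent variables for the example.
die = {face: Fraction(1, 6) for face in range(1, 7)}
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
skewed = {-1: Fraction(1, 2), 0: Fraction(1, 4), 2: Fraction(1, 4)}
variables = [die, coin, skewed]

# Start from the constant 0 and add the variables one at a time.
total = {0: Fraction(1)}
for pmf in variables:
    total = pmf_of_sum(total, pmf)

var_of_sum = variance(total)
sum_of_vars = sum(variance(pmf) for pmf in variables)

print(var_of_sum)   # 14/3
print(sum_of_vars)  # 14/3
```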