Variance of the sum of independent random variables

January 7th, 2009 at 10:06 pm

Yesterday I was trying to brush up on my probability skills and came upon this formula on the Wikipedia page about variance:

var(\sum_{i=1}^{n}{X_{i}})=\sum_{i=1}^{n}{var(X_{i})}

The article calls this the Bienaymé formula and gives neither a proof nor a link to one. Googling the formula didn’t turn up a proof either.

So, I set out to find why this works. It took me a few hours of digging through books and dusting off the probability I learned at university 8 years ago, but I finally got there. Here’s how.

Note: the Wikipedia article states the Bienaymé formula for uncorrelated variables. Here I’ll prove it for independent variables, which is a stronger assumption (independent variables are always uncorrelated) and the case in which the formula is most frequently used. I’m also proving it only for discrete random variables; the continuous case is analogous, with integrals over densities in place of sums over PMFs.

Expected value and variance

We’ll start with a few definitions.

Formally, the expected value of a (discrete) random variable X is defined by:

E[X]=\sum_{x}^{}{xp_{X}(x)}

Here p_{X}(x) is the probability mass function (PMF) of X, p_{X}(x)=P(X=x). For a function g(X) of the random variable:

E[g(X)]=\sum_{x}^{}{g(x)p_{X}(x)}

The variance of X is defined in terms of the expected value as:

var(X)=E[(X-E[X])^{2}]

From this we can also obtain:

var(X)=\sum_{x}^{}{(x-E[X])^{2}p_{X}(x)}

=\sum_{x}{(x^{2}-2xE[X]+(E[X])^{2})p_{X}(x)}

=\sum_{x}{x^{2}p_{X}(x)}-2E[X]\sum_{x}{xp_{X}(x)}+(E[X])^{2}\sum_{x}{p_{X}(x)}

=E[X^{2}]-2(E[X])^{2}+(E[X])^{2}

=E[X^{2}]-(E[X])^{2}

This form is more convenient to use in some calculations.
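To make the two variance formulas concrete, here is a minimal sketch that computes both for a small PMF. The fair die and the helper names `expectation` and `variance` are my own choices for illustration, not something taken from a library.

```python
from fractions import Fraction

# Hypothetical example: PMF of a fair six-sided die, stored as {value: probability}.
pmf_x = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * p_X(x)."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """var(X) = E[(X - E[X])^2], straight from the definition."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)

mean_x = expectation(pmf_x)
var_by_definition = variance(pmf_x)
# The shortcut form: E[X^2] - (E[X])^2.
var_by_shortcut = expectation(pmf_x, lambda x: x * x) - mean_x ** 2

print(mean_x)             # 7/2
print(var_by_definition)  # 35/12
print(var_by_shortcut)    # 35/12
```

Using `Fraction` keeps the arithmetic exact, so the two results can be compared with `==` without worrying about floating-point rounding.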

Linear function of a random variable

From the definitions given above it can be easily shown that given a linear function of a random variable: Y = aX + b, the expected value and variance of Y are:

E[Y]=aE[X]+b

var(Y)=a^{2}var(X)

For the expected value, we can make a stronger claim that holds for any function g:

E[ag(X)+b]=aE[g(X)]+b
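A quick numeric check of the linear-function rules, as a sketch only. The PMF and the constants a = 3, b = -2 are arbitrary values I picked, and `linear_transform` is a hypothetical helper that assumes a is nonzero.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a PMF stored as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """var(X) = E[(X - E[X])^2]."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)

def linear_transform(pmf, a, b):
    """PMF of Y = aX + b (assumes a != 0, so distinct x map to distinct y)."""
    return {a * x + b: p for x, p in pmf.items()}

# Arbitrary illustrative PMF: E[X] = 1, var(X) = 2.
pmf_x = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}
a, b = 3, -2
pmf_y = linear_transform(pmf_x, a, b)

print(expectation(pmf_y), a * expectation(pmf_x) + b)  # 1 1
print(variance(pmf_y), a ** 2 * variance(pmf_x))       # 18 18
```

Note that the shift b drops out of the variance entirely, while the scale a enters squared.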

Multiple random variables

When multiple random variables are involved, things start getting a bit more complicated. I’ll focus on two random variables here, but this is easily extensible to N variables. Given two random variables that participate in an experiment, their joint PMF is:

p_{X,Y}(x, y)=P(X=x, Y=y)

The joint PMF determines the probability of any event that can be specified in terms of the random variables X and Y. For example if A is the set of all pairs (x, y) that have a certain property, then:

P((X,Y)\in A) = \sum_{(x,y) \in A}^{}{p_{X,Y}(x, y)}

Note that from this PMF we can infer the PMF for a single variable, like this:

p_{X}(x)=P(X=x)

=\sum_{y}{P(X=x, Y=y)}

=\sum_{y}p_{X,Y}(x,y)\qquad (1)

The expected value for functions of two variables naturally extends and takes the form:

E[g(X,Y)]=\sum_{x}\sum_{y}{g(x,y)p_{X,Y}(x,y)}
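The joint PMF, its marginals from equation (1), and the two-variable expectation translate almost directly into code. The sketch below uses a made-up joint PMF on {0, 1} x {0, 1}; the function names are hypothetical helpers of mine.

```python
from fractions import Fraction

# Hypothetical joint PMF of (X, Y), stored as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8),
    (1, 1): Fraction(3, 8),
}

def marginal(joint, index):
    """Sum the joint PMF over the other variable, as in equation (1).

    index=0 gives p_X, index=1 gives p_Y.
    """
    out = {}
    for pair, p in joint.items():
        out[pair[index]] = out.get(pair[index], 0) + p
    return out

def probability(joint, event):
    """P((X, Y) in A), where A is the set of pairs for which event(x, y) is true."""
    return sum(p for (x, y), p in joint.items() if event(x, y))

def expectation2(joint, g):
    """E[g(X, Y)] = sum over x and y of g(x, y) * p_{X,Y}(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

pmf_x = marginal(joint, 0)
pmf_y = marginal(joint, 1)
p_equal = probability(joint, lambda x, y: x == y)
e_product = expectation2(joint, lambda x, y: x * y)

print(p_equal)    # 5/8
print(e_product)  # 3/8
```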

Sum of random variables

Let’s see how the sum of random variables behaves. From the previous formula:

E[X+Y]=\sum_{x}\sum_{y}{(x+y)p_{X,Y}(x,y)}

=\sum_{x}x\sum_{y}{p_{X,Y}(x,y)}+\sum_{y}y\sum_{x}{p_{X,Y}(x,y)}

But recall equation (1): the inner sums are just the marginal PMFs. So the above simply equals:

\sum_{x}xp_{X}(x)+\sum_{y}yp_{Y}(y)

=E[X]+E[Y]\qquad (2)
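It’s worth noticing that the derivation of (2) never used independence. Here is a small sketch that checks this on a joint PMF I made up specifically so that X and Y are dependent; `expectation2` is a hypothetical helper.

```python
from fractions import Fraction

# Hypothetical joint PMF in which X and Y are deliberately NOT independent:
# p(1, 1) = 3/8, while p_X(1) * p_Y(1) = 1/2 * 5/8 = 5/16.
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8),
    (1, 1): Fraction(3, 8),
}

def expectation2(joint, g):
    """E[g(X, Y)] = sum over x and y of g(x, y) * p_{X,Y}(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_sum = expectation2(joint, lambda x, y: x + y)
e_x = expectation2(joint, lambda x, y: x)
e_y = expectation2(joint, lambda x, y: y)

print(e_sum, e_x + e_y)  # 9/8 9/8
```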

We’ll also want to prove that E[XY]=E[X]E[Y]. This does not hold in general, but it does hold for independent X and Y, so we’ll have to make this assumption (assuming that they’re independent means that p_{X,Y}(x,y)=p_{X}(x)p_{Y}(y) for all x and y).

E[XY]=\sum_{x}\sum_{y}{xyp_{X,Y}(x,y)}

By independence:

E[XY]=\sum_{x}\sum_{y}{xyp_{X}(x)p_{Y}(y)}

=\sum_{x}{xp_{X}(x)}\sum_{y}{yp_{Y}(y)}

=E[X]E[Y]\qquad (3)

A very similar proof can show that for independent X and Y:

E[g(X)h(Y)]=E[g(X)]E[h(Y)]

This holds for any functions g and h (because if X and Y are independent, so are g(X) and h(Y)).
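To see the product rule in action, the sketch below builds a joint PMF from two marginals under the independence assumption and checks both (3) and its g, h generalization. It also includes a dependent counterexample (Y = X) where the rule fails. All the PMFs and the functions g and h are arbitrary choices of mine.

```python
from fractions import Fraction
from itertools import product

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a PMF stored as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def expectation2(joint, g):
    """E[g(X, Y)] for a joint PMF stored as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def independent_joint(pmf_x, pmf_y):
    """Joint PMF under independence: p_{X,Y}(x, y) = p_X(x) * p_Y(y)."""
    return {(x, y): pmf_x[x] * pmf_y[y] for x, y in product(pmf_x, pmf_y)}

pmf_x = {1: Fraction(1, 2), 2: Fraction(1, 2)}
pmf_y = {0: Fraction(1, 3), 3: Fraction(2, 3)}
joint = independent_joint(pmf_x, pmf_y)

e_xy = expectation2(joint, lambda x, y: x * y)
print(e_xy, expectation(pmf_x) * expectation(pmf_y))  # 3 3

g = lambda x: x * x
h = lambda y: y + 1
e_gh = expectation2(joint, lambda x, y: g(x) * h(y))
print(e_gh, expectation(pmf_x, g) * expectation(pmf_y, h))  # 15/2 15/2

# Dependent counterexample: Y = X, with X uniform on {-1, 1}.
# Here E[X] = E[Y] = 0, but E[XY] = E[X^2] = 1.
dependent = {(-1, -1): Fraction(1, 2), (1, 1): Fraction(1, 2)}
e_xy_dep = expectation2(dependent, lambda x, y: x * y)
print(e_xy_dep)  # 1
```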

Now, at last, we’re ready to tackle the variance of X + Y. We start by expanding the definition of variance:

var(X+Y)=E[(X+Y-E[X+Y])^{2}]

By (2):

=E[(X+Y-E[X]-E[Y])^{2}]

=E[((X-E[X])+(Y-E[Y]))^{2}]

=E[(X-E[X])^{2}]+E[(Y-E[Y])^{2}]

+2E[(X-E[X])(Y-E[Y])]

Now, note that the random variables X-E[X] and Y-E[Y] are independent (each is a function of just one of the independent variables X and Y), so by (3):

E[(X-E[X])(Y-E[Y])]=E[(X-E[X])]E[(Y-E[Y])]

But using (2) again:

E[X-E[X]]=E[X]-E[E[X]]

E[X] is a constant, so E[E[X]] is just E[X], and therefore the above reduces to 0. The same goes for E[Y-E[Y]], so the product of the two is 0 as well.

So, coming back to the long expression for the variance of sums, the last term is 0, and we have:

var(X+Y)=E[(X-E[X])^{2}]+E[(Y-E[Y])^{2}]

=var(X)+var(Y)
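The cross term 2E[(X-E[X])(Y-E[Y])] is the whole story here, so here is a sketch that computes it explicitly for two cases: two independent fair dice, and the fully dependent case Y = X. The `moments` helper and both joint PMFs are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def moments(joint):
    """Return (var(X), var(Y), cross, var(X + Y)) for a joint PMF.

    cross is E[(X - E[X])(Y - E[Y])], the term that independence makes vanish.
    """
    def e(g):
        return sum(g(x, y) * p for (x, y), p in joint.items())

    mx, my = e(lambda x, y: x), e(lambda x, y: y)
    var_x = e(lambda x, y: (x - mx) ** 2)
    var_y = e(lambda x, y: (y - my) ** 2)
    cross = e(lambda x, y: (x - mx) * (y - my))
    var_sum = e(lambda x, y: (x + y - mx - my) ** 2)
    return var_x, var_y, cross, var_sum

die = {k: Fraction(1, 6) for k in range(1, 7)}

# Two independent dice: the joint PMF is the product of the marginals.
independent = {(x, y): die[x] * die[y] for x, y in product(die, die)}
# One die counted twice (Y = X): as dependent as it gets.
dependent = {(x, x): die[x] for x in die}

vx, vy, cross, vsum = moments(independent)
print(cross, vsum, vx + vy)      # 0 35/6 35/6

vx_d, vy_d, cross_d, vsum_d = moments(dependent)
print(cross_d, vsum_d, vx_d + vy_d)  # 35/12 35/3 35/6
```

In both cases var(X+Y) equals var(X)+var(Y) plus twice the cross term; only in the independent case is that cross term zero.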

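As I’ve mentioned before, proving this for the sum of two variables suffices, because the case of n variables follows by induction: if X_{1},\dots,X_{n} are (mutually) independent, then X_{1}+\dots+X_{n-1} is independent of X_{n}, so the two-variable result gives var(X_{1}+\dots+X_{n})=var(X_{1}+\dots+X_{n-1})+var(X_{n}), and the same step can be repeated on the shorter sum.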

Therefore:

var(\sum_{i=1}^{n}{X_{i}})=\sum_{i=1}^{n}{var(X_{i})}

This holds for n independent variables X_{i}. Q.E.D.
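As a sanity check on the n-variable statement, the sketch below builds the exact PMF of a sum of independent variables by repeated convolution and compares its variance with the sum of the individual variances. The three PMFs are arbitrary examples of mine, and `pmf_of_sum` assumes the variables are independent.

```python
from fractions import Fraction
from functools import reduce

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a PMF stored as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """var(X) = E[(X - E[X])^2]."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)

def pmf_of_sum(pmf_a, pmf_b):
    """PMF of A + B for independent A and B (discrete convolution)."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0) + pa * pb
    return out

# Three different, independent variables (illustrative choices).
pmfs = [
    {k: Fraction(1, 6) for k in range(1, 7)},                    # fair die
    {0: Fraction(1, 2), 1: Fraction(1, 2)},                      # fair coin
    {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)},   # arbitrary
]

total = reduce(pmf_of_sum, pmfs)  # PMF of X_1 + X_2 + X_3
var_of_sum = variance(total)
sum_of_vars = sum(variance(p) for p in pmfs)

print(var_of_sum, sum_of_vars)  # 31/6 31/6
```

The `reduce` call mirrors the induction: each step adds one more independent variable to the running sum.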

5 Responses to “Variance of the sum of independent random variables”

  1. Ramon Says:

    Thank you very much for the proof, I was looking for it.

    Just a little typo, there is a missing ( in the equation after ‘From this we can also obtain’

  2. John Bechhoefer Says:

    Thanks, too. This would be nice to link to the wikipedia variance article, as a reference for the Bienaymé formula.

  3. Mario Rodriguez Says:

    Hello, great job on doing this. I really think a link or part of this should be added to the wikipedia page.

    Just a correction: Under the subtitle “Linear function of a random variable” you state that Var[Y] = a^2 * E[X] which is wrong. I worked it out and it should be Var[Y] = a^2 * Var[X], and confirmed the same result in the wikipedia entrance of Variance.

    I will definitely link this page from the “useful links” of my probabilistic learning class.

  4. eliben Says:

    @Mario, thanks for the correction. Typo fixed.

  5. R. Alizarin Says:

    I googled for “variance sum random variable proof” and your page came up as the top result. Thanks for this (: