We’ll be working with the set P_n(\mathbb{R}), real polynomials of degree \leq n. Such polynomials can be expressed using n+1 scalar coefficients a_i as follows:

\[p(x)=a_0+a_1 x + a_2 x^2 + \cdots + a_n x^n\]
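Throughout these notes, it helps to keep a computational picture in mind: a polynomial in P_n(\mathbb{R}) is fully determined by its coefficient vector (a_0, a_1, \dots, a_n). Here's a minimal sketch in Python (the language choice and the helper name `evaluate` are mine, not part of the math):

```python
import numpy as np

# p(x) = 1 + 2x + 3x^2, stored lowest degree first: (a_0, a_1, a_2)
p = np.array([1.0, 2.0, 3.0])

def evaluate(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n at x using Horner's method."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

print(evaluate(p, 2.0))  # 1 + 2*2 + 3*4 = 17.0
```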

Vector space

The set P_n(\mathbb{R}), together with addition of polynomials and scalar multiplication, forms a vector space. To prove this, let’s review how the vector space axioms are satisfied. We’ll use p(x), q(x) and r(x) as arbitrary polynomials from the set P_n(\mathbb{R}) for the demonstration. Similarly, a and b are arbitrary scalars in \mathbb{R}.

Associativity of vector addition:

\[p(x)+[q(x)+r(x)]=p(x)+q(x)+r(x)=[p(x)+q(x)]+r(x)\]

This is trivial because addition of polynomials is associative [1]. Commutativity is similarly trivial, for the same reason:

Commutativity of vector addition:

\[p(x)+q(x)=q(x)+p(x)\]

Identity element of vector addition:

The zero polynomial 0 serves as an identity element. \forall p(x)\in P_n(\mathbb{R}), we have 0 + p(x) = p(x).

Inverse element of vector addition:

For each p(x), we can use q(x)=-p(x) as the additive inverse, because p(x)+q(x)=0.

Identity element of scalar multiplication:

The scalar 1 serves as an identity element for scalar multiplication. For each p(x), it’s true that 1\cdot p(x)=p(x).

Associativity of scalar multiplication:

For any two scalars a and b:

\[a[b\cdot p(x)]=ab\cdot p(x)=[ab]\cdot p(x)\]

Distributivity of scalar multiplication over vector addition:

For any p(x), q(x) and scalar a:

\[a\cdot[p(x)+q(x)]=a\cdot p(x)+a\cdot q(x)\]

Distributivity of scalar multiplication over scalar addition:

For any scalars a and b and polynomial p(x):

\[[a+b]\cdot p(x)=a\cdot p(x) + b\cdot p(x)\]
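Since addition and scalar multiplication act coefficient-wise, all of these axioms reduce to familiar properties of real numbers. A quick numerical spot check (not a proof; Python with numpy and the coefficient-vector representation are just a convenient vehicle):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = rng.normal(size=4), rng.normal(size=4)  # two polynomials in P_3
a, b = 2.5, -1.5

# a[p + q] = a.p + a.q  (distributivity over vector addition)
assert np.allclose(a * (p + q), a * p + a * q)
# [a + b].p = a.p + b.p  (distributivity over scalar addition)
assert np.allclose((a + b) * p, a * p + b * p)
# a[b.p] = [ab].p       (associativity of scalar multiplication)
assert np.allclose(a * (b * p), (a * b) * p)
```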

Linear independence, span and basis

Since we’ve shown that polynomials in P_n(\mathbb{R}) form a vector space, we can now build additional linear algebraic definitions on top of that.

A set of k polynomials p_1(x),\dots,p_k(x)\in P_n(\mathbb{R}) is said to be linearly independent if

\[\sum_{i=1}^{k}a_i p_i(x)=0\]

implies a_i=0 \quad \forall i. In words, the only linear combination resulting in the zero vector is when all coefficients are 0.

As an example, let’s discuss the fundamental building blocks of polynomials in P_n(\mathbb{R}): the set \{1, x, x^2, \dots x^n\}. These are linearly independent because:

\[a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n=0\]

is true only for the zero polynomial, in which all the coefficients a_i=0. This comes from the very definition of polynomials. Moreover, this set spans the entire P_n(\mathbb{R}), because every polynomial can be (by definition) expressed as a linear combination of \{1, x, x^2, \dots, x^n\}.

Since we’ve shown these basic polynomials are linearly independent and span the entire vector space, they are a basis for the space. In fact, this set has a special name: the monomial basis (because a monomial is a polynomial with a single term).

Checking if an arbitrary set of polynomials is a basis

Suppose we have some set of polynomials, and we want to know if these form a basis for P_n(\mathbb{R}). How do we go about it?

The idea is to use linear algebra the same way we do for any other vector space. Let’s use a concrete example with n=2 to demonstrate:

\[Q=\{1-x, x, 2x+x^2\}\]

Is the set Q a basis for P_2(\mathbb{R})? We’ll start by checking whether the members of Q are linearly independent. Write:

\[a_0(1-x)+a_1 x + a_2(2x+x^2)=0\]

By regrouping, we can turn this into:

\[a_0 + (a_1-a_0+2a_2)x+a_2 x^2=0\]

For this to be true, the coefficient of each monomial has to be zero; mathematically:

\[\begin{aligned} a_0&=0\\ a_1-a_0+2a_2&=0\\ a_2&=0\\ \end{aligned}\]

In matrix form:

\[\begin{bmatrix} 1 & 0 & 0\\ -1 & 1 & 2\\ 0 & 0 & 1\\ \end{bmatrix} \begin{bmatrix}a_0\\ a_1\\ a_2\end{bmatrix} =\begin{bmatrix}0\\ 0\\ 0\end{bmatrix}\]

We know how to solve this, by reducing the matrix into row-echelon form. It’s easy to see that the reduced row-echelon form of this specific matrix is I, the identity matrix. Therefore, this set of equations has a single solution: a_i=0 \quad \forall i [2].
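This check can also be done mechanically. A sketch in Python with numpy (the column-per-polynomial layout is my choice): full rank, or equivalently a non-zero determinant, means the homogeneous system has only the zero solution.

```python
import numpy as np

# Each column holds a member of Q = {1-x, x, 2x+x^2} in the
# monomial basis (coefficients of 1, x, x^2 down the rows).
A = np.array([
    [1.0, 0.0, 0.0],   # constant terms
    [-1.0, 1.0, 2.0],  # coefficients of x
    [0.0, 0.0, 1.0],   # coefficients of x^2
])

# Full rank means the only solution of A a = 0 is a = 0,
# i.e. the set Q is linearly independent.
print(np.linalg.matrix_rank(A))  # 3
print(np.linalg.det(A))          # non-zero (≈ 1.0)
```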

We’ve shown that the set Q is linearly independent. Now let’s show that it spans the space P_2(\mathbb{R}). We want to analyze:

\[a_0(1-x)+a_1 x + a_2(2x+x^2)=\alpha +\beta x + \gamma x^2\]

And find the coefficients a_i that satisfy this for any arbitrary \alpha, \beta and \gamma. We proceed just as before, by regrouping on the left side:

\[a_0 + (a_1-a_0+2a_2)x+a_2 x^2=\alpha +\beta x + \gamma x^2\]

and equating the coefficient of each power of x separately:

\[\begin{aligned} a_0&=\alpha\\ a_1-a_0+2a_2&=\beta\\ a_2&=\gamma\\ \end{aligned}\]

If we turn this into matrix form, the matrix of coefficients is exactly the same as before. So we know there’s a unique solution; row-reducing the augmented matrix until the coefficient side becomes I leaves that solution on the right-hand side. It doesn’t matter for the moment what the actual solution is, as long as it exists and is unique. We’ve shown that Q spans the space!
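To make the spanning argument concrete, here’s a sketch that solves for the coordinates of one particular polynomial in terms of Q; the target 3 + x + 2x^2 is just an arbitrary example of mine:

```python
import numpy as np

# Same coefficient matrix as in the independence check:
# columns are 1-x, x, 2x+x^2 in the monomial basis.
A = np.array([
    [1.0, 0.0, 0.0],
    [-1.0, 1.0, 2.0],
    [0.0, 0.0, 1.0],
])

# Express 3 + x + 2x^2 (alpha=3, beta=1, gamma=2) in the basis Q.
target = np.array([3.0, 1.0, 2.0])
a = np.linalg.solve(A, target)
print(a)  # ≈ [3, 0, 2]: 3(1-x) + 0·x + 2(2x+x^2) = 3 + x + 2x^2
```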

Since the set Q is linearly independent and spans P_2(\mathbb{R}), it is a basis for the space.

Inner product

I’ve discussed inner products for functions in the post about Hilbert space. Well, polynomials are functions, so we can define an inner product using integrals as follows [3]:

\[\langle p, q \rangle = \int_{a}^{b} p(x) q(x) w(x) \, dx\]

Where the bounds a and b are arbitrary, and could be infinite. Whenever we deal with integrals we worry about convergence; in my post on Hilbert spaces, we only talked about L^2 - the square integrable functions. Most polynomials are not square integrable over an infinite interval, however. Therefore, we can ensure convergence in one of two ways:

  1. Use a special weight function w(x) that makes the inner product integral converge.
  2. Set finite bounds on the integral, in which case we can just set w(x)=1.

Let’s use the latter, restricting the bounds to the range [-1,1] and setting w(x)=1. We have the following inner product:

\[\langle p, q \rangle = \int_{-1}^{1} p(x) q(x) \, dx\]
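This inner product can be computed exactly, since the integrand p(x)q(x) is itself a polynomial. A sketch in Python using numpy’s polynomial helpers (the function name `inner` is mine):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def inner(p, q):
    """<p, q> = integral_{-1}^{1} p(x) q(x) dx, with p and q given
    as coefficient arrays (lowest degree first)."""
    antideriv = P.polyint(P.polymul(p, q))
    return P.polyval(1.0, antideriv) - P.polyval(-1.0, antideriv)

# <x, x> = integral of x^2 over [-1, 1] = 2/3
print(inner([0.0, 1.0], [0.0, 1.0]))  # 0.666...
```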

Let’s check that this satisfies the inner product space conditions.

Conjugate symmetry:

Since real multiplication is commutative, we can write:

\[\langle p, q \rangle = \int_{-1}^{1} p(x) q(x) \, dx =\int_{-1}^{1} q(x) p(x) \, dx=\langle q, p \rangle\]

We deal in the reals here, so we can safely ignore complex conjugation.

Linearity in the first argument:

Let p_1,p_2,q\in P_n(\mathbb{R}) and a,b\in \mathbb{R}. We want to show that

\[\langle ap_1+bp_2,q \rangle = a\langle p_1,q\rangle +b\langle p_2,q\rangle\]

Expand the left-hand side using our definition of inner product:

\[\begin{aligned} \langle ap_1+bp_2,q \rangle&=\int_{-1}^{1} (a p_1(x)+b p_2(x)) q(x) \, dx\\ &=a\int_{-1}^{1} p_1(x) q(x) \, dx+b\int_{-1}^{1} p_2(x) q(x) \, dx \end{aligned}\]

The result is equivalent to a\langle p_1,q\rangle +b\langle p_2,q\rangle.

Positive-definiteness:

We want to show that for nonzero p\in P_n(\mathbb{R}), we have \langle p, p\rangle > 0. First of all, since p(x)^2\geq0 for all x, it’s true that:

\[\langle p, p\rangle=\int_{-1}^{1}p(x)^2\, dx\geq 0\]

What about the result 0 though? Well, let’s say that

\[\int_{-1}^{1}p(x)^2\, dx=0\]

Since p(x)^2 is a non-negative function, this means that the integral of a non-negative function ends up being 0. But p(x) is a polynomial, so it’s continuous, and so is p(x)^2. If the integral of a continuous non-negative function is 0, the function itself must be 0: had it been positive at any point, continuity would make it positive on a small interval around that point, forcing the integral to be positive as well.

We’ve proven that \langle p, p\rangle=0 only when p is the zero polynomial. The positive-definiteness condition is satisfied.
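As a numerical sanity check (again, not a proof), we can verify \langle p, p\rangle > 0 for a batch of randomly drawn polynomials, which are nonzero with probability one:

```python
import numpy as np
from numpy.polynomial import polynomial as P

def inner(p, q):
    """<p, q> = integral_{-1}^{1} p(x) q(x) dx for coefficient arrays."""
    antideriv = P.polyint(P.polymul(p, q))
    return P.polyval(1.0, antideriv) - P.polyval(-1.0, antideriv)

rng = np.random.default_rng(42)
for _ in range(100):
    p = rng.normal(size=4)    # a random (almost surely nonzero) cubic
    assert inner(p, p) > 0.0  # <p, p> is strictly positive
print("positive-definiteness spot check passed")
```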

In conclusion, P_n(\mathbb{R}) along with the inner product we’ve defined forms an inner product space.

Orthogonality

Now that we have an inner product, we can define orthogonality on polynomials: two polynomials p,q are orthogonal (w.r.t. our inner product) iff

\[\langle p,q\rangle=\int_{-1}^{1}p(x)q(x)\, dx=0\]

Contrary to expectation [4], the monomial basis polynomials are not mutually orthogonal under our definition of the inner product.

For example, calculating the inner product for 1 and x^2:

\[\langle 1,x^2\rangle=\int_{-1}^{1}x^2\, dx=\frac{x^3}{3}\biggr|_{-1}^{1}=\frac{2}{3}\]
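We can confirm this computation, and also see that some monomial pairs do happen to be orthogonal (e.g. 1 and x), with the same kind of sketch as before:

```python
import numpy as np
from numpy.polynomial import polynomial as P

def inner(p, q):
    """<p, q> = integral_{-1}^{1} p(x) q(x) dx for coefficient arrays."""
    antideriv = P.polyint(P.polymul(p, q))
    return P.polyval(1.0, antideriv) - P.polyval(-1.0, antideriv)

one = [1.0]
x = [0.0, 1.0]
x2 = [0.0, 0.0, 1.0]

print(inner(one, x))   # 0.0: this particular pair is orthogonal
print(inner(one, x2))  # 0.666... (= 2/3): this pair is not
```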

There are other sets of polynomials that are orthogonal using our inner product. For example, the Legendre polynomials; but this is a topic for another post.


[1]There’s a level of basic algebra below which we won’t descend in these notes. We could break this statement further down by saying that something like a_i x^i + a_j x^j can be added to b_i x^i + b_j x^j by adding each power of x separately for any i and j, but let’s just take it for granted.
[2]Obviously, this specific set of equations is quite trivial to solve without matrices; I just want to demonstrate the more general approach. Once we have a system of linear equations, the whole toolbox of linear algebra is at our disposal. For example, we could also have checked the determinant and seen it’s non-zero, which means the matrix is invertible, and the homogeneous system has only the all-zeroes solution.
[3]And actually with this (or any valid) inner product, P_n(\mathbb{R}) indeed forms a Hilbert space, because it’s finite-dimensional, and every finite-dimensional inner product space is complete.
[4]Because of how naturally this set spans P_n(\mathbb{R}). And indeed, we can define alternative inner products using which monomials are orthogonal.