<h1>Diffie-Hellman Key Exchange</h1> <p>Eli Bendersky's website - Math; published 2019-10-21 by Eli Bendersky</p> <p>This post presents the Diffie-Hellman Key Exchange (DHKE) - an important part of today's practical cryptography. Whenever you're accessing an HTTPS website, it's very likely that your browser and the server negotiated a shared secret key using the DHKE under the hood.</p> <div class="section" id="mathematical-prerequisites"> <h2>Mathematical prerequisites</h2> <p>To understand the math behind DHKE, you should be familiar with basic <em>group theory</em>. 
A group is a set with a binary operation such that combining any two elements of the set produces another element of the set (closure); the operation is associative; the set has an identity element w.r.t. the operation; and each element has an inverse.</p> <p>The group we're most interested in for the sake of understanding Diffie-Hellman is <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/a81e9fa41d32a290c7b1cbd52a99fc08c82f2f7d.svg" style="height: 20px;" type="image/svg+xml">\mathbb{Z}_{p}^{*}</object> - the positive integers smaller than <em>p</em> that are relatively prime to <em>p</em>, with the &quot;multiplication modulo <em>p</em>&quot; operation (another common notation for this group is <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/b920f08f0a72a94bef16c469929c229b6c28e0dc.svg" style="height: 19px;" type="image/svg+xml">(\mathbb{Z}/p\mathbb{Z})^*</object>). This is a finite group. By definition, its cardinality is <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3cc2f030099018851b9b711164207baa2252eda4.svg" style="height: 18px;" type="image/svg+xml">\phi(p)</object>, where <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/411e715f9ab9075b0a30b4117d209921f0bc2389.svg" style="height: 16px;" type="image/svg+xml">\phi</object> is <a class="reference external" href="https://en.wikipedia.org/wiki/Euler%27s_totient_function">Euler's totient function</a>.</p> <p>As an example, <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/3a731309e0a13f0d6fbef6f970c497bdd912fce0.svg" style="height: 18px;" type="image/svg+xml">\mathbb{Z}_{9}^{*}=\{1,2,4,5,7,8\}</object>. The cardinality of this group is <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/881fe91696b6af703fd166b2d6d78340b7a8bd1b.svg" style="height: 18px;" type="image/svg+xml">\phi(9)=6</object>. 
We can multiply members of the group modulo 9 to get other elements of the group: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/813ae271ff0b69949e40bc3ce03e564c1be0ba99.svg" style="height: 18px;" type="image/svg+xml">2*5\equiv 1\pmod 9</object>, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/876cfeae5153e11446596a97418dc9a18a4d942c.svg" style="height: 18px;" type="image/svg+xml">8*4\equiv 5\pmod 9</object> etc.</p> <p>For a prime <em>p</em>, the group contains all the integers from 1 to <em>p-1</em> and its cardinality is <em>p-1</em>.</p> <div class="section" id="cyclic-groups"> <h3>Cyclic groups</h3> <p>Given a group <em>G</em> with the operator <object class="valign-m2" data="https://eli.thegreenplace.net/images/math/c8e2d1a0bf50a27d43ade30cfb048d99feb31ad1.svg" style="height: 13px;" type="image/svg+xml">\odot</object>, we define the <strong>order</strong> of an element <em>g</em> in the group - <em>ord(g)</em> - as the smallest positive integer <em>k</em> such that:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/341f8d88e088dd35dda6183228c753ad3fc18e76.svg" style="height: 44px;" type="image/svg+xml"> $g^k=\underbrace{g\odot g\odot\cdots\odot g}_{k \ times}=1$</object> <p>Where 1 is the identity element of <em>G</em>. Note that we use the exponent notation <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/babd5875bef8d7f6c6ca6b17c626406191ddffdc.svg" style="height: 19px;" type="image/svg+xml">g^k</object> for convenience, even though <object class="valign-m2" data="https://eli.thegreenplace.net/images/math/c8e2d1a0bf50a27d43ade30cfb048d99feb31ad1.svg" style="height: 13px;" type="image/svg+xml">\odot</object> is not necessarily a multiplication - this would work for any group. 
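</p> <p>In <em>Z<sub>n</sub><sup>*</sup></em> the order of an element can be found by direct iteration - apply the operation until we hit the identity. A small Go sketch (the <tt>ord</tt> helper is mine):</p>

```go
package main

import "fmt"

// ord computes the order of g in the multiplicative group modulo n, by
// repeatedly multiplying by g (mod n) until reaching the identity element 1.
// It assumes g is a member of the group (relatively prime to n).
func ord(g, n int) int {
	k, x := 1, g%n
	for x != 1 {
		x = (x * g) % n
		k++
	}
	return k
}

func main() {
	fmt.Println(ord(8, 9), ord(2, 9)) // 2 6
}
```

<p>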
For example, in the group <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/010274226559b48ad047330d3647fbb26e0775ff.svg" style="height: 18px;" type="image/svg+xml">\mathbb{Z}_{9}^{*}</object> shown above, <em>ord(8)</em> is 2, and <em>ord(2)</em> is 6.</p> <p>A group <em>G</em> which contains an element <em>a</em> with the maximal order <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/12fcff68777f4c028a8f53980840e3cc807823f8.svg" style="height: 18px;" type="image/svg+xml">ord(a)=\left|G\right|</object> is called a <strong>cyclic group</strong>. Elements in a cyclic group that have maximal orders are called <em>generators</em> or <em>primitive elements</em>.</p> <p>These elements can generate all the other elements of the group by repeated application of the group operation. In other words, given a generator <em>g</em>, every <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/4cbe37e25ff6e34b50a2ef01190bc26af1cc355e.svg" style="height: 13px;" type="image/svg+xml">a\in G</object> can be expressed as <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/babd5875bef8d7f6c6ca6b17c626406191ddffdc.svg" style="height: 19px;" type="image/svg+xml">g^k</object> for some <em>k</em>.</p> <p>For example, <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/010274226559b48ad047330d3647fbb26e0775ff.svg" style="height: 18px;" type="image/svg+xml">\mathbb{Z}_{9}^{*}</object> is cyclic and its primitive elements are 2, 5 and 8.</p> <p>It can be shown that for a prime <em>p</em>, the group <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/a81e9fa41d32a290c7b1cbd52a99fc08c82f2f7d.svg" style="height: 20px;" type="image/svg+xml">\mathbb{Z}_{p}^{*}</object> is always cyclic and has <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f126122f73357701354be89d15c15b25b1b7138b.svg" style="height: 18px;" 
type="image/svg+xml">\phi(p-1)</object> primitive elements, though there's no easy way to find them - we just have to test them one by one. The proof of this theorem is quite technical, so I'll leave it for another time.</p> </div> </div> <div class="section" id="the-discrete-logarithm-problem"> <h2>The Discrete Logarithm Problem</h2> <p>The mathematical problem at the heart of the DHKE is the Discrete Logarithm Problem (DLP). In this discussion I'm going to focus on the DLP in the multiplicative group of integers modulo a prime - <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/995f98c769b056e41fda04bafc1efc23710e5494.svg" style="height: 20px;" type="image/svg+xml">\mathbb{Z}^{*}_{p}</object>, and will mention the general DLP later on.</p> <p>Given a finite cyclic group <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/995f98c769b056e41fda04bafc1efc23710e5494.svg" style="height: 20px;" type="image/svg+xml">\mathbb{Z}^{*}_{p}</object> with a prime <em>p</em>, a primitive element <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/0b593ff95f677d3978c68181cc89fa85ea8a335f.svg" style="height: 20px;" type="image/svg+xml">g \in \mathbb{Z}^{*}_{p}</object> and another element <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/f84aeb9463cdde34dc0820f13d960c491a9580b2.svg" style="height: 20px;" type="image/svg+xml">b \in \mathbb{Z}^{*}_{p}</object>, the DLP is to find an integer <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8e8de33869b2c4646c4854a54f69ad6252ff2ce5.svg" style="height: 16px;" type="image/svg+xml">1\le x\le p-1</object> such that:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4f0b4b76b5a484549776dcba485bc0eb22fd7df6.svg" style="height: 18px;" type="image/svg+xml"> $g^x\equiv b\pmod{p}$</object> <p>We've seen earlier that such an integer must exist because <em>g</em> is a primitive element of the 
group.</p> <p>The DLP is hard - no one knows how to solve it efficiently. This doesn't mean that an efficient solution doesn't exist - no one has proven that one can't exist. In this, DLP is similar to factoring, which is essential for the security of <a class="reference external" href="http://eli.thegreenplace.net/2019/rsa-theory-and-implementation/">RSA</a>.</p> </div> <div class="section" id="diffie-hellman-key-exchange-dhke"> <h2>Diffie-Hellman Key Exchange (DHKE)</h2> <p>The protocol starts with a <em>setup stage</em>, where the two parties agree on the parameters <em>p</em> and <em>g</em> to be used in the rest of the protocol. These parameters can be entirely public, and are specified in RFCs, such as <a class="reference external" href="https://tools.ietf.org/html/rfc7919">RFC 7919</a>.</p> <p>For the main key exchange protocol, let's assume that Alice and Bob want to compute a shared secret they could later use to send encrypted messages to one another. They know <em>p</em> and <em>g</em> already.</p> <p><strong>Stage 1</strong></p> <p>Alice does:</p> <ul class="simple"> <li>Choose a random <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/880a56359b587e5fddfd05454524bbf400890014.svg" style="height: 18px;" type="image/svg+xml">b_{alice}\in\{{2,\dots,p-2}\}</object></li> <li>Compute <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/65855958b897b94d96926cccf6952ccb10fba4d5.svg" style="height: 19px;" type="image/svg+xml">B_{alice}\equiv g^{b_{alice}} \mod p</object></li> </ul> <p>Bob does:</p> <ul class="simple"> <li>Choose a random <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/36b27e8cd01cfb0b07d3c99ce54b97243b4efe64.svg" style="height: 18px;" type="image/svg+xml">b_{bob}\in\{{2,\dots,p-2}\}</object></li> <li>Compute <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f4abf9a31bcb3552e85963ca5e1a337b84021e4d.svg" style="height: 19px;" type="image/svg+xml">B_{bob}\equiv 
g^{b_{bob}} \mod p</object></li> </ul> <p>These <em>B</em>s are Alice's and Bob's public keys, while <em>b</em>s are their private keys. Note that due to the DLP, it's hard to compute <em>b</em> from <em>B</em>.</p> <p><strong>Stage 2</strong></p> <p>Alice sends <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/c9c2ee1b6300cddd985e5f9035d40a2f631e436d.svg" style="height: 15px;" type="image/svg+xml">B_{alice}</object> to Bob, while Bob sends <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/246a2e8c5c2d22c9dcbe80604458c2d1b5bcce67.svg" style="height: 15px;" type="image/svg+xml">B_{bob}</object> to Alice.</p> <p><strong>Stage 3</strong></p> <p>Now, Bob can compute <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/e3c06d2a162b0932932d387c54d467c973e20176.svg" style="height: 23px;" type="image/svg+xml">B_{alice}^{b_{bob}}\equiv (g^{b_{alice}})^{b_{bob}}\equiv g^{b_{alice}b_{bob}}\mod p</object>.</p> <p>Alice can compute <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/f87a735ee6b10b0ebbe5d09f09d388d3c0b854bf.svg" style="height: 23px;" type="image/svg+xml">B_{bob}^{b_{alice}}\equiv (g^{b_{bob}})^{b_{alice}}\equiv g^{b_{bob}b_{alice}}\mod p</object>.</p> <p>These are equal, and serve as a shared key between Alice and Bob. They can now use it to encrypt a strong symmetric cipher key (say, AES-256) and use that to communicate in complete privacy.</p> </div> <div class="section" id="authenticated-dhke"> <h2>Authenticated DHKE</h2> <p>The basic DHKE protocol, as described above, is easily vulnerable to a man-in-the-middle (MITM) attack. When Alice and Bob exchange their public keys in stage 2, nothing guarantees to Alice that the key she received comes from Bob. Eve could place herself between Alice and Bob and set up an exchange with each one of them separately, while making them believe they are talking to each other. 
Then she could read all the traffic, while Alice and Bob suspect nothing.</p> <p>The solution to this problem is to use <em>authenticated DHKE</em> instead. The core protocol remains the same, but when Alice and Bob exchange messages, these are signed with a strong signature algorithm. For example, Alice and Bob can use their RSA private keys to sign these messages. Then the MITM attack is impossible because Eve can't send a message to Bob pretending she's Alice without access to Alice's private RSA key.</p> </div> <div class="section" id="forward-secrecy"> <h2>Forward secrecy</h2> <p>In the <a class="reference external" href="http://eli.thegreenplace.net/2019/rsa-theory-and-implementation/">RSA post</a> we've seen how the RSA algorithm can be used to create a shared secret between two parties and thus for secret communication. RSA has a serious flaw when used like that, though. There's a lot of traffic using a single key, which may help an attacker break it. Once broken, this key can be used to read <em>all past</em> communications that used the same key.</p> <p>DHKE, on the other hand, has <em>forward secrecy</em>. A new DHKE shared secret is generated for every session. Breaking this key will expose the secrets of this session, but won't enable the attacker to read all past correspondence. Such keys are called <em>ephemeral</em>.</p> <p>You may ask - can't RSA be made ephemeral? Can't we use a &quot;master&quot; RSA key to authenticate the key exchange, and generate a fresh public/private key pair for each communication? Yes, that's absolutely possible, but DHKE is still preferred because it's more efficient. 
While generating an RSA key pair requires finding two large primes with certain characteristics, generating a new DHKE public key is simply choosing a random integer and computing a single modular exponentiation - this is much faster.</p> </div> <div class="section" id="choosing-safe-primes"> <h2>Choosing &quot;safe&quot; primes</h2> <p>We've seen before that the <em>p</em> and <em>g</em> parameters for DHKE are public. How are these chosen? Can we choose any <em>p</em> and <em>g</em> and have strong security?</p> <p>It turns out that the answer is no, due to some interesting math. Algorithms like <a class="reference external" href="https://en.wikipedia.org/wiki/Index_calculus_algorithm">Index Calculus</a> can be used to crack the DLP in sub-exponential time. They're so powerful that we need primes of 1024 bits to have 80-bit security (meaning the equivalent of brute-forcing an 80-bit symmetric key).</p> <p>When coupled with the <a class="reference external" href="https://en.wikipedia.org/wiki/Pohlig–Hellman_algorithm">Pohlig-Hellman</a> attack, we may get in trouble. This attack uses the <a class="reference external" href="http://eli.thegreenplace.net/2019/the-chinese-remainder-theorem/">CRT</a> to break the DLP in time proportional to the <em>factors</em> of <em>|G|</em> <a class="footnote-reference" href="#id3" id="id1"></a>. Note that when <em>p</em> is a prime, <em>p-1</em> is composite, so it will end up having some factors. Which factors? Hard to say, but we want to maximize them. The best way to maximize them is to pick primes of the form <em>2q+1</em>, where <em>q</em> is a prime. Then <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/4ba6046eb4a34a7a7821084f4fb4a78d0fefe875.svg" style="height: 18px;" type="image/svg+xml">|G|=p-1=2q</object>, and its factors are 2 and <em>q</em>. 
<em>g</em> is chosen such that it generates a sub-group of size <em>q</em>, which ensures we have a large prime <em>|G|</em>.</p> <p>Primes of the form <em>2q+1</em> are called <em>safe primes</em>.</p> <p>For example, <a class="reference external" href="https://tools.ietf.org/html/rfc7919">RFC 7919</a> recommends several parameters, presenting them thus:</p> <div class="highlight"><pre><span></span>The hexadecimal representation of p is: FFFFFFFF FFFFFFFF ADF85458 A2BB4A9A AFDC5620 273D3CF1 D8B9C583 CE2D3695 A9E13641 146433FB CC939DCE 249B3EF9 7D2FE363 630C75D8 F681B202 AEC4617A D3DF1ED5 D5FD6561 2433F51F 5F066ED0 85636555 3DED1AF3 B557135E 7F57C935 984F0C70 E0E68B77 E2A689DA F3EFE872 1DF158A1 36ADE735 30ACCA4F 483A797A BC0AB182 B324FB61 D108A94B B2C8E3FB B96ADAB7 60D7F468 1D4F42A3 DE394DF4 AE56EDE7 6372BB19 0B07A7C8 EE0A6D70 9E02FCE1 CDF7E2EC C03404CD 28342F61 9172FE9C E98583FF 8E4F1232 EEF28183 C3FE3B1B 4C6FAD73 3BB5FCBC 2EC22005 C58EF183 7D1683B2 C6F34A26 C1B2EFFA 886B4238 611FCFDC DE355B3B 6519035B BC34F4DE F99C0238 61B46FC9 D6E6C907 7AD91D26 91F7F7EE 598CB0FA C186D91C AEFE1309 85139270 B4130C93 BC437944 F4FD4452 E2D74DD3 64F2E21E 71F54BFF 5CAE82AB 9C9DF69E E86D2BC5 22363A0D ABC52197 9B0DEADA 1DBF9A42 D5C4484E 0ABCD06B FA53DDEF 3C1B20EE 3FD59D7C 25E41D2B 66C62E37 FFFFFFFF FFFFFFFF The generator is: g = 2 The group size is: q = (p-1)/2 </pre></div> <p>The parameters in this RFC are the only ones approved for the newest TLS standard - version 1.3, which also removes the support for custom groups.</p> <p>The safety of the primes used for DHKE is not a purely theoretical concern! Real attacks have been (and are probably still being) mounted against unsafe choices. 
See <a class="reference external" href="https://nvd.nist.gov/vuln/detail/CVE-2016-0701">CVE-2016-0701</a> for example, and the paper <a class="reference external" href="https://jhalderm.com/pub/papers/subgroup-ndss16.pdf">Measuring small subgroup attacks against Diffie-Hellman</a> for more technical details.</p> </div> <div class="section" id="a-word-on-elliptic-curves"> <h2>A word on elliptic curves</h2> <p>Elliptic curves have been all the rage in cryptography for the <a class="reference external" href="https://tools.ietf.org/html/rfc4492">past couple of decades</a>, and for a good reason. They provide similar security to the &quot;classical&quot; multiplicative modular groups with much smaller keys. If you're using TLS 1.3, the key exchange protocol will most likely be ECDHE - Elliptic Curve Diffie-Hellman Ephemeral.</p> <p>Explaining elliptic curves is a huge topic of its own, so I'll just briefly mention them w.r.t. the material presented in this post.</p> <p>The beauty of abstract algebra is that you can develop mathematics that will apply in the same way to very different groups. We've seen the DLP defined for multiplicative modular groups, but it can also be defined for different groups.</p> <p>Elliptic curves are sets of points (x, y) which fulfill certain polynomial equations <a class="footnote-reference" href="#id4" id="id2"></a>, and when set up properly these points can form cyclic groups under certain operations. A DLP can be defined for these groups, and it's as hard to solve as the classical DLP. Much of the math remains the same - generators, subgroups, and so on. DHKE looks the same as well - Alice and Bob both pick a random group member, and compute an &quot;exponent&quot; (repeated application of the group operation), sending it on the wire. 
They combine their exponents to get a shared secret key, while Eve cannot reconstruct their private exponents from the transmitted information because of the infeasibility of the DLP.</p> <p>Elliptic curve groups are great because - compared to classical multiplicative modular groups - they are less susceptible to sub-exponential attacks. Therefore, to gain ~128 bits of security (i.e. make attacks equivalent to brute-forcing 128-bit values) we can use a key of size 256 bits (as opposed to 3072 bits for classical DH). This makes cryptographic protocols much faster.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id3" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>Specifically, it's proportional to the size of the <em>subgroup</em> that the generator generates. The sizes of subgroups are related to the factors of <em>|G|</em>, per <a class="reference external" href="https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory)">Lagrange's Theorem</a>.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/47b3fd1d9acbd8a3feb58d52f6c73c7dba87ffec.svg" style="height: 19px;" type="image/svg+xml">y^2=x^3+ax+b</object>, which should look familiar from analytic geometry in middle school.</td></tr> </tbody> </table> </div> <h1>RSA - theory and implementation</h1> <p>Eli Bendersky's website - Math; published 2019-09-03 by Eli Bendersky</p> <p>RSA has been a staple of public key cryptography for over 40 years, and is still being used today for some tasks in the newest TLS 1.3 standard. This post describes the theory behind RSA - the math that makes it work, as well as some practical considerations; it also presents a complete implementation of RSA key generation, encryption and decryption in Go.</p> <div class="section" id="the-rsa-algorithm"> <h2>The RSA algorithm</h2> <p>The beauty of the RSA algorithm is its simplicity. You don't need much more than some familiarity with elementary number theory to understand it, and the prerequisites can be grokked in a few hours.</p> <p>In this presentation <em>M</em> is the message we want to encrypt, resulting in the ciphertext <em>C</em>. Both <em>M</em> and <em>C</em> are large integers. Refer to the Practical Considerations section for representing arbitrary data with such integers.</p> <p>The RSA algorithm consists of three main phases: key generation, encryption and decryption.</p> <div class="section" id="key-generation"> <h3>Key generation</h3> <p>The first phase in using RSA is generating the public/private keys. This is accomplished in several steps.</p> <p><strong>Step 1</strong>: find two random, very large prime numbers <em>p</em> and <em>q</em> and calculate <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2df650fff78b85cfb0330f2a2e65e4ac0e1e1ca1.svg" style="height: 12px;" type="image/svg+xml">n=pq</object>. How large should these primes be? The current recommendation is for <em>n</em> to be at least 2048 bits, or over 600 decimal digits. 
We'll assume that the message <em>M</em> - represented as a number - is smaller than <em>n</em> (see Practical Considerations for details on what to do if it's not).</p> <p><strong>Step 2</strong>: select a small odd integer <em>e</em> that is relatively prime to <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1c7f9bc7f04407dd7fee51ec2ec4df99f20355ee.svg" style="height: 18px;" type="image/svg+xml">\phi(n)</object>, which is <a class="reference external" href="https://en.wikipedia.org/wiki/Euler%27s_totient_function">Euler's totient function</a>. <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1c7f9bc7f04407dd7fee51ec2ec4df99f20355ee.svg" style="height: 18px;" type="image/svg+xml">\phi(n)</object> is calculated directly from Euler's formula (its proof is on Wikipedia):</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/660f0ef1ba862cad10df79d9274e30ed265331c0.svg" style="height: 51px;" type="image/svg+xml"> $\phi(n) =n \prod_{p\mid n} \left(1-\frac{1}{p}\right)$</object> <p>For <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2df650fff78b85cfb0330f2a2e65e4ac0e1e1ca1.svg" style="height: 12px;" type="image/svg+xml">n=pq</object> where <em>p</em> and <em>q</em> are primes, we get</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/c4e921f87628b962ed3f77e50dfd51d92a924041.svg" style="height: 40px;" type="image/svg+xml"> $\phi(n)=n\frac{p-1}{p}\frac{q-1}{q}=(p-1)(q-1)$</object> <p>In practice, it's recommended to pick <em>e</em> as one of a set of known prime values, most notably <a class="reference external" href="https://tools.ietf.org/html/rfc2313">65537</a>. 
Picking this known number does not diminish the security of RSA, and has some advantages such as efficiency <a class="footnote-reference" href="#id7" id="id2"></a>.</p> <p><strong>Step 3</strong>: compute <em>d</em> as the multiplicative inverse of <em>e</em> modulo <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1c7f9bc7f04407dd7fee51ec2ec4df99f20355ee.svg" style="height: 18px;" type="image/svg+xml">\phi(n)</object>. Lemma 3 in <a class="reference external" href="http://eli.thegreenplace.net/2019/the-chinese-remainder-theorem/">this post</a> guarantees that <em>d</em> exists and is unique (and also explains what a modular multiplicative inverse is).</p> <p>At this point we have all we need for the public/private keys. The public key is the pair <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/e97a2ea99cfffbb197c3a2ea0c0e8d6962422e84.svg" style="height: 18px;" type="image/svg+xml">[e,n]</object> and the private key is the pair <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/30c8e363b6a1070055dd59a89f457dd42dbad6a5.svg" style="height: 18px;" type="image/svg+xml">[d,n]</object>. 
In practice, when doing decryption we have access to <em>n</em> already (from the public key), so <em>d</em> is really the only unknown.</p> </div> <div class="section" id="encryption-and-decryption"> <h3>Encryption and decryption</h3> <p>Encryption and decryption are both accomplished with the same <a class="reference external" href="http://eli.thegreenplace.net/2009/03/28/efficient-modular-exponentiation-algorithms">modular exponentiation</a> formula, substituting different values for <em>x</em> and <em>y</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7ba9e4575b2f901ac6ab1301c9260a0ebb8c4ddb.svg" style="height: 18px;" type="image/svg+xml"> $f(x)=x^y\pmod{n}$</object> <p>For encryption, the input is <em>M</em> and the exponent is <em>e</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/16100b92251780f65ec193d9e8f0fd7b3df7f55e.svg" style="height: 18px;" type="image/svg+xml"> $Enc(M)=M^e\pmod{n}$</object> <p>For decryption, the input is the ciphertext <em>C</em> and the exponent is <em>d</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/e501f753509274a8d9c1792563b70c7afb04b7cb.svg" style="height: 21px;" type="image/svg+xml"> $Dec(C)=C^d\pmod{n}$</object> </div> </div> <div class="section" id="why-does-it-work"> <h2>Why does it work?</h2> <p>Given <em>M</em>, we encrypt it by raising to the power of <em>e</em> modulo <em>n</em>. Apparently, this process is reversible by raising the result to the power of <em>d</em> modulo <em>n</em>, getting <em>M</em> back. 
Why does this work?</p> <p><strong>Proof</strong>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/b1865666d04fc18858aee6e6bb0e79b861822cc8.svg" style="height: 21px;" type="image/svg+xml"> $Dec(Enc(M))=M^{ed}\pmod{n}$</object> <p>Recall that <em>e</em> and <em>d</em> are multiplicative inverses modulo <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1c7f9bc7f04407dd7fee51ec2ec4df99f20355ee.svg" style="height: 18px;" type="image/svg+xml">\phi(n)</object>. That is, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ff52bee7e9ab7e6ba6c4eaec88d621a058253f8b.svg" style="height: 18px;" type="image/svg+xml">ed\equiv 1\pmod{\phi(n)}</object>. This means that for some integer <em>k</em> we have <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d9312685b11b6605d92b1cb3f528e78bfdae9ce0.svg" style="height: 18px;" type="image/svg+xml">ed=1+k\phi(n)</object> or <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/068d54df52635d19705da7af64959f05f5415dc0.svg" style="height: 18px;" type="image/svg+xml">ed=1+k(p-1)(q-1)</object>.</p> <p>Let's see what <object class="valign-0" data="https://eli.thegreenplace.net/images/math/0851d104a1204f3680dc479111e1c56b15d50924.svg" style="height: 15px;" type="image/svg+xml">M^{ed}</object> is modulo <em>p</em>. 
Substituting in the formula for <em>ed</em> we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2a143c253f8d87a633a3d784919995f4849e2820.svg" style="height: 23px;" type="image/svg+xml"> $M^{ed}\equiv M(M^{p-1})^{k(q-1)}\pmod{p}$</object> <p>Now we can use <a class="reference external" href="https://en.wikipedia.org/wiki/Fermat%27s_little_theorem">Fermat's little theorem</a>, which states that if <em>M</em> is not divisible by <em>p</em>, we have <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/e59079c61e78d1fa10e39d2394416b925e961e50.svg" style="height: 19px;" type="image/svg+xml">M^{p-1}\equiv 1\pmod{p}</object>. This theorem is a special case of Euler's theorem, the proof of which <a class="reference external" href="http://eli.thegreenplace.net/2009/08/01/a-group-theoretic-proof-of-eulers-theorem">I wrote about here</a>.</p> <p>So we can substitute 1 for <object class="valign-0" data="https://eli.thegreenplace.net/images/math/ce56881e232caecbb33c9e0c42f73da4568bc43e.svg" style="height: 15px;" type="image/svg+xml">M^{p-1}</object> in the latest equation, and raising 1 to any power is still 1:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/992600d7cffd6118684c9fb7bd2884eddbd28c1b.svg" style="height: 21px;" type="image/svg+xml"> $M^{ed}\equiv M\pmod{p}$</object> <p>Note that Fermat's little theorem requires that <em>M</em> is not divisible by <em>p</em>. 
We can safely assume that, because if <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/edc02af17cdf5fefdcee2d00c213bfc9deed163b.svg" style="height: 18px;" type="image/svg+xml">M\equiv 0\pmod{p}</object>, then trivially <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/e3faa78a6c339536793bb1521021b33a8d0e7a01.svg" style="height: 19px;" type="image/svg+xml">M^{ed}\equiv 0\pmod{p}</object> and again <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/4bdc1b0fbf669d81ffa7e4f726380a7090fda112.svg" style="height: 19px;" type="image/svg+xml">M^{ed}\equiv M\pmod{p}</object>.</p> <p>We can similarly show that:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/c924efc575ead500eb25859e78fcd2aa4b166166.svg" style="height: 21px;" type="image/svg+xml"> $M^{ed}\equiv M\pmod{q}$</object> <p>So we have <object class="valign-0" data="https://eli.thegreenplace.net/images/math/01e9660f252f2e39af7563cd3464c24f770bc7db.svg" style="height: 15px;" type="image/svg+xml">M^{ed}\equiv M</object> for the prime factors of <em>n</em>. 
Using a <a class="reference external" href="http://eli.thegreenplace.net/2019/the-chinese-remainder-theorem/">corollary to the Chinese Remainder Theorem</a>, they are then equivalent modulo <em>n</em> itself:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/59989cc9b589764c6339babf180807ad78c02721.svg" style="height: 21px;" type="image/svg+xml"> $M^{ed}\equiv M\pmod{n}$</object> <p>Since we've defined <em>M</em> to be smaller than <em>n</em>, we've shown that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/933aa016970d0adaae6a5832eafe9f4f73750317.svg" style="height: 18px;" type="image/svg+xml">Dec(Enc(M))=M</object> ∎</p> </div> <div class="section" id="why-is-it-secure"> <h2>Why is it secure?</h2> <p>Without the private key in hand, attackers only have the result of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/00149eb8468e1ff6a5afe1ac4edc10a3426e6a18.svg" style="height: 18px;" type="image/svg+xml">M^e\pmod {n}</object>, as well as <em>n</em> and <em>e</em> (as they're part of the public key). Could they infer <em>M</em> from these numbers?</p> <p>There is no <em>known</em> general way of doing this without factoring <em>n</em> (see the <a class="reference external" href="http://people.csail.mit.edu/rivest/Rsapaper.pdf">original RSA paper</a>, section IX), and factoring is known to be a difficult problem. 
Specifically, here we assume that <em>M</em> and <em>e</em> are sufficiently large that <object class="valign-0" data="https://eli.thegreenplace.net/images/math/08c3c067bdffe6aa41c60dada94a96fa79a030b9.svg" style="height: 12px;" type="image/svg+xml">M^e&gt;n</object> (otherwise decrypting would be trivial).</p> <p>If factoring were easy, we could factor <em>n</em> into <em>p</em> and <em>q</em>, then compute <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1c7f9bc7f04407dd7fee51ec2ec4df99f20355ee.svg" style="height: 18px;" type="image/svg+xml">\phi(n)</object> and finally find <em>d</em> from <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ff52bee7e9ab7e6ba6c4eaec88d621a058253f8b.svg" style="height: 18px;" type="image/svg+xml">ed\equiv 1\pmod{\phi(n)}</object> using the extended Euclidean algorithm.</p> </div> <div class="section" id="practical-considerations"> <h2>Practical considerations</h2> <p>The algorithm described so far is sometimes called <em>textbook RSA</em> (or <em>schoolbook RSA</em>). That's because it deals entirely in numbers, ignoring all kinds of practical matters. In fact, textbook RSA is susceptible to <a class="reference external" href="https://crypto.stackexchange.com/questions/20085/which-attacks-are-possible-against-raw-textbook-rsa">several clever attacks</a> and has to be enhanced with random padding schemes for practical use.</p> <p>A simple padding scheme called PKCS #1 v1.5 has been used for many years and is defined in <a class="reference external" href="https://tools.ietf.org/html/rfc2313">RFC 2313</a>. These days more advanced schemes like <a class="reference external" href="https://tools.ietf.org/html/rfc2437">OAEP</a> are recommended instead, but PKCS #1 v1.5 is very easy to explain and therefore I'll use it for didactic purposes.</p> <p>Suppose we have some binary data <em>D</em> to encrypt.
The approach works for data of any size, but we will focus on just encrypting small pieces of data. In practice this is sufficient because RSA is commonly used only to encrypt a symmetric encryption key, which is much smaller than the RSA key size <a class="footnote-reference" href="#id8" id="id3"></a>. The scheme can work well enough for arbitrary-sized messages though - we'll just split the message into multiple blocks with some pre-determined block size.</p> <p>From <em>D</em> we create a block for encryption - the block has the same length as our RSA key:</p> <img alt="PKCS #1 v1.5 encryption padding scheme" class="align-center" src="https://eli.thegreenplace.net/images/2019/pkcs-15-rsa.png" /> <p>Here <em>PS</em> is the padding, which should occupy all the bytes not taken by the header and <em>D</em> in the block, and should be at least 8 bytes long (if it's shorter, the data may be broken into two separate blocks). It's a sequence of random non-zero bytes generated separately for each encryption. Once we have this full block of data, we convert it to a number, treating the bytes as a big-endian encoding <a class="footnote-reference" href="#id9" id="id4"></a>. We end up with a large number <em>x</em>, on which we then perform the RSA encryption step with <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/76a0d009913eb990fab8299e3574b743b0bed303.svg" style="height: 18px;" type="image/svg+xml">Enc(x)=x^e\pmod{n}</object>. The result is then encoded in binary and sent over the wire.</p> <p>Decryption is done in reverse.
We turn the received byte stream into a number, perform <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/67a657cde6a7d17787a986e10bf64e55a83c65ae.svg" style="height: 19px;" type="image/svg+xml">Dec(C)=C^d\pmod{n}</object>, then strip off the padding (note that the padding has no 0 bytes and is terminated with a 0, so this is easy) and get our original message back.</p> <p>The random padding here makes attacks on textbook RSA impractical, but the scheme as a whole may still be vulnerable to <a class="reference external" href="https://crypto.stackexchange.com/questions/12688/can-you-explain-bleichenbachers-cca-attack-on-pkcs1-v1-5">more sophisticated attacks</a> in some cases. Therefore, more modern schemes like OAEP should be used in practice.</p> </div> <div class="section" id="implementing-rsa-in-go"> <h2>Implementing RSA in Go</h2> <p>I've implemented a simple variant of RSA encryption and decryption as described in this post, in Go. Go makes it particularly easy to implement cryptographic algorithms because of its great support for arbitrary-precision integers with the stdlib <tt class="docutils literal">big</tt> package. Not only does this package support the basics of manipulating numbers, it also supports several primitives specifically for cryptography - for example the <tt class="docutils literal">Exp</tt> method supports efficient modular exponentiation, and the <tt class="docutils literal">ModInverse</tt> method supports finding modular multiplicative inverses. In addition, the <tt class="docutils literal">crypto/rand</tt> package contains randomness primitives specifically designed for cryptographic uses.</p> <p>Go has a production-grade crypto implementation in the standard library. RSA is in <tt class="docutils literal">crypto/rsa</tt>, so for anything real <em>please</em> use that <a class="footnote-reference" href="#id10" id="id5"></a>.
The code shown and linked here is just for educational purposes.</p> <p>The full code, with some tests, is <a class="reference external" href="https://github.com/eliben/code-for-blog/tree/master/2019/rsa">available on GitHub</a>. We'll start by defining the types to hold public and private keys:</p> <div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">PublicKey</span> <span class="kd">struct</span> <span class="p">{</span> <span class="nx">N</span> <span class="o">*</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span> <span class="nx">E</span> <span class="o">*</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span> <span class="p">}</span> <span class="kd">type</span> <span class="nx">PrivateKey</span> <span class="kd">struct</span> <span class="p">{</span> <span class="nx">N</span> <span class="o">*</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span> <span class="nx">D</span> <span class="o">*</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span> <span class="p">}</span> </pre></div> <p>The code also contains a <tt class="docutils literal">GenerateKeys</tt> function that will randomly generate these keys with an appropriate bit length. 
Given a public key, textbook encryption is simply:</p> <div class="highlight"><pre><span></span><span class="kd">func</span> <span class="nx">encrypt</span><span class="p">(</span><span class="nx">pub</span> <span class="o">*</span><span class="nx">PublicKey</span><span class="p">,</span> <span class="nx">m</span> <span class="o">*</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span><span class="p">)</span> <span class="o">*</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span> <span class="p">{</span> <span class="nx">c</span> <span class="o">:=</span> <span class="nb">new</span><span class="p">(</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span><span class="p">)</span> <span class="nx">c</span><span class="p">.</span><span class="nx">Exp</span><span class="p">(</span><span class="nx">m</span><span class="p">,</span> <span class="nx">pub</span><span class="p">.</span><span class="nx">E</span><span class="p">,</span> <span class="nx">pub</span><span class="p">.</span><span class="nx">N</span><span class="p">)</span> <span class="k">return</span> <span class="nx">c</span> <span class="p">}</span> </pre></div> <p>And decryption is:</p> <div class="highlight"><pre><span></span><span class="kd">func</span> <span class="nx">decrypt</span><span class="p">(</span><span class="nx">priv</span> <span class="o">*</span><span class="nx">PrivateKey</span><span class="p">,</span> <span class="nx">c</span> <span class="o">*</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span><span class="p">)</span> <span class="o">*</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span> <span class="p">{</span> <span class="nx">m</span> <span class="o">:=</span> <span class="nb">new</span><span class="p">(</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span><span class="p">)</span> <span 
class="nx">m</span><span class="p">.</span><span class="nx">Exp</span><span class="p">(</span><span class="nx">c</span><span class="p">,</span> <span class="nx">priv</span><span class="p">.</span><span class="nx">D</span><span class="p">,</span> <span class="nx">priv</span><span class="p">.</span><span class="nx">N</span><span class="p">)</span> <span class="k">return</span> <span class="nx">m</span> <span class="p">}</span> </pre></div> <p>You'll notice that the bodies of these two functions are pretty much the same, except for which exponent they use. Indeed, they are just typed wrappers around the <tt class="docutils literal">Exp</tt> method.</p> <p>Finally, here's the full PKCS #1 v1.5 encryption procedure, as described above:</p> <div class="highlight"><pre><span></span><span class="c1">// EncryptRSA encrypts the message m using public key pub and returns the</span> <span class="c1">// encrypted bytes. The length of m must be &lt;= size_in_bytes(pub.N) - 11,</span> <span class="c1">// otherwise an error is returned. 
The encryption block format is based on</span> <span class="c1">// PKCS #1 v1.5 (RFC 2313).</span> <span class="kd">func</span> <span class="nx">EncryptRSA</span><span class="p">(</span><span class="nx">pub</span> <span class="o">*</span><span class="nx">PublicKey</span><span class="p">,</span> <span class="nx">m</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">)</span> <span class="p">([]</span><span class="kt">byte</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span> <span class="c1">// Compute length of key in bytes, rounding up.</span> <span class="nx">keyLen</span> <span class="o">:=</span> <span class="p">(</span><span class="nx">pub</span><span class="p">.</span><span class="nx">N</span><span class="p">.</span><span class="nx">BitLen</span><span class="p">()</span> <span class="o">+</span> <span class="mi">7</span><span class="p">)</span> <span class="o">/</span> <span class="mi">8</span> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="nx">m</span><span class="p">)</span> <span class="p">&gt;</span> <span class="nx">keyLen</span><span class="o">-</span><span class="mi">11</span> <span class="p">{</span> <span class="k">return</span> <span class="kc">nil</span><span class="p">,</span> <span class="nx">fmt</span><span class="p">.</span><span class="nx">Errorf</span><span class="p">(</span><span class="s">&quot;len(m)=%v, too long&quot;</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="nx">m</span><span class="p">))</span> <span class="p">}</span> <span class="c1">// Following RFC 2313, using block type 02 as recommended for encryption:</span> <span class="c1">// EB = 00 || 02 || PS || 00 || D</span> <span class="nx">psLen</span> <span class="o">:=</span> <span class="nx">keyLen</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="nx">m</span><span 
class="p">)</span> <span class="o">-</span> <span class="mi">3</span> <span class="nx">eb</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">([]</span><span class="kt">byte</span><span class="p">,</span> <span class="nx">keyLen</span><span class="p">)</span> <span class="nx">eb</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="p">=</span> <span class="mh">0x00</span> <span class="nx">eb</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="p">=</span> <span class="mh">0x02</span> <span class="c1">// Fill PS with random non-zero bytes.</span> <span class="k">for</span> <span class="nx">i</span> <span class="o">:=</span> <span class="mi">2</span><span class="p">;</span> <span class="nx">i</span> <span class="p">&lt;</span> <span class="mi">2</span><span class="o">+</span><span class="nx">psLen</span><span class="p">;</span> <span class="p">{</span> <span class="nx">_</span><span class="p">,</span> <span class="nx">err</span> <span class="o">:=</span> <span class="nx">rand</span><span class="p">.</span><span class="nx">Read</span><span class="p">(</span><span class="nx">eb</span><span class="p">[</span><span class="nx">i</span> <span class="p">:</span> <span class="nx">i</span><span class="o">+</span><span class="mi">1</span><span class="p">])</span> <span class="k">if</span> <span class="nx">err</span> <span class="o">!=</span> <span class="kc">nil</span> <span class="p">{</span> <span class="k">return</span> <span class="kc">nil</span><span class="p">,</span> <span class="nx">err</span> <span class="p">}</span> <span class="k">if</span> <span class="nx">eb</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span> <span class="o">!=</span> <span class="mh">0x00</span> <span class="p">{</span> <span class="nx">i</span><span class="o">++</span> <span class="p">}</span> <span class="p">}</span> <span class="nx">eb</span><span 
class="p">[</span><span class="mi">2</span><span class="o">+</span><span class="nx">psLen</span><span class="p">]</span> <span class="p">=</span> <span class="mh">0x00</span> <span class="c1">// Copy the message m into the rest of the encryption block.</span> <span class="nb">copy</span><span class="p">(</span><span class="nx">eb</span><span class="p">[</span><span class="mi">3</span><span class="o">+</span><span class="nx">psLen</span><span class="p">:],</span> <span class="nx">m</span><span class="p">)</span> <span class="c1">// Now the encryption block is complete; we take it as a keyLen-byte big.Int and</span> <span class="c1">// RSA-encrypt it with the public key.</span> <span class="nx">mnum</span> <span class="o">:=</span> <span class="nb">new</span><span class="p">(</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span><span class="p">).</span><span class="nx">SetBytes</span><span class="p">(</span><span class="nx">eb</span><span class="p">)</span> <span class="nx">c</span> <span class="o">:=</span> <span class="nx">encrypt</span><span class="p">(</span><span class="nx">pub</span><span class="p">,</span> <span class="nx">mnum</span><span class="p">)</span> <span class="c1">// The result is a big.Int, which we want to convert to a byte slice of</span> <span class="c1">// length keyLen.
It&#39;s highly likely that the size of c in bytes is keyLen,</span> <span class="c1">// but in rare cases we may need to pad it on the left with zeros (this only</span> <span class="c1">// happens if the most significant byte of c is zero, meaning that it&#39;s more</span> <span class="c1">// than 256 times smaller than the modulus).</span> <span class="nx">padLen</span> <span class="o">:=</span> <span class="nx">keyLen</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="nx">c</span><span class="p">.</span><span class="nx">Bytes</span><span class="p">())</span> <span class="k">for</span> <span class="nx">i</span> <span class="o">:=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="p">&lt;</span> <span class="nx">padLen</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span> <span class="p">{</span> <span class="nx">eb</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span> <span class="p">=</span> <span class="mh">0x00</span> <span class="p">}</span> <span class="nb">copy</span><span class="p">(</span><span class="nx">eb</span><span class="p">[</span><span class="nx">padLen</span><span class="p">:],</span> <span class="nx">c</span><span class="p">.</span><span class="nx">Bytes</span><span class="p">())</span> <span class="k">return</span> <span class="nx">eb</span><span class="p">,</span> <span class="kc">nil</span> <span class="p">}</span> </pre></div> <p>There&#39;s also <tt class="docutils literal">DecryptRSA</tt>, which unwraps this:</p> <div class="highlight"><pre><span></span><span class="c1">// DecryptRSA decrypts the message c using private key priv and returns the</span> <span class="c1">// decrypted bytes, based on block 02 from PKCS #1 v1.5 (RFC 2313).</span> <span class="c1">// It expects len(c) to equal the length in bytes of the private key&#39;s modulus.</span> <span class="c1">// Important: this is a simple implementation not designed
to be resilient to</span> <span class="c1">// timing attacks.</span> <span class="kd">func</span> <span class="nx">DecryptRSA</span><span class="p">(</span><span class="nx">priv</span> <span class="o">*</span><span class="nx">PrivateKey</span><span class="p">,</span> <span class="nx">c</span> <span class="p">[]</span><span class="kt">byte</span><span class="p">)</span> <span class="p">([]</span><span class="kt">byte</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span> <span class="nx">keyLen</span> <span class="o">:=</span> <span class="p">(</span><span class="nx">priv</span><span class="p">.</span><span class="nx">N</span><span class="p">.</span><span class="nx">BitLen</span><span class="p">()</span> <span class="o">+</span> <span class="mi">7</span><span class="p">)</span> <span class="o">/</span> <span class="mi">8</span> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="nx">c</span><span class="p">)</span> <span class="o">!=</span> <span class="nx">keyLen</span> <span class="p">{</span> <span class="k">return</span> <span class="kc">nil</span><span class="p">,</span> <span class="nx">fmt</span><span class="p">.</span><span class="nx">Errorf</span><span class="p">(</span><span class="s">&quot;len(c)=%v, want keyLen=%v&quot;</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="nx">c</span><span class="p">),</span> <span class="nx">keyLen</span><span class="p">)</span> <span class="p">}</span> <span class="c1">// Convert c into a big.Int and decrypt it using the private key.</span> <span class="nx">cnum</span> <span class="o">:=</span> <span class="nb">new</span><span class="p">(</span><span class="nx">big</span><span class="p">.</span><span class="nx">Int</span><span class="p">).</span><span class="nx">SetBytes</span><span class="p">(</span><span class="nx">c</span><span class="p">)</span> <span class="nx">mnum</span> <span
class="o">:=</span> <span class="nx">decrypt</span><span class="p">(</span><span class="nx">priv</span><span class="p">,</span> <span class="nx">cnum</span><span class="p">)</span> <span class="c1">// Write the bytes of mnum into m, left-padding if needed.</span> <span class="nx">m</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">([]</span><span class="kt">byte</span><span class="p">,</span> <span class="nx">keyLen</span><span class="p">)</span> <span class="nb">copy</span><span class="p">(</span><span class="nx">m</span><span class="p">[</span><span class="nx">keyLen</span><span class="o">-</span><span class="nb">len</span><span class="p">(</span><span class="nx">mnum</span><span class="p">.</span><span class="nx">Bytes</span><span class="p">()):],</span> <span class="nx">mnum</span><span class="p">.</span><span class="nx">Bytes</span><span class="p">())</span> <span class="c1">// Expect proper block 02 beginning.</span> <span class="k">if</span> <span class="nx">m</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">!=</span> <span class="mh">0x00</span> <span class="p">{</span> <span class="k">return</span> <span class="kc">nil</span><span class="p">,</span> <span class="nx">fmt</span><span class="p">.</span><span class="nx">Errorf</span><span class="p">(</span><span class="s">&quot;m=%v, want 0x00&quot;</span><span class="p">,</span> <span class="nx">m</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="p">}</span> <span class="k">if</span> <span class="nx">m</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">!=</span> <span class="mh">0x02</span> <span class="p">{</span> <span class="k">return</span> <span class="kc">nil</span><span class="p">,</span> <span class="nx">fmt</span><span class="p">.</span><span class="nx">Errorf</span><span class="p">(</span><span class="s">&quot;m=%v, want 
0x02&quot;</span><span class="p">,</span> <span class="nx">m</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="p">}</span> <span class="c1">// Skip over random padding until a 0x00 byte is reached. +2 adjusts the index</span> <span class="c1">// back to the full slice.</span> <span class="nx">endPad</span> <span class="o">:=</span> <span class="nx">bytes</span><span class="p">.</span><span class="nx">IndexByte</span><span class="p">(</span><span class="nx">m</span><span class="p">[</span><span class="mi">2</span><span class="p">:],</span> <span class="mh">0x00</span><span class="p">)</span> <span class="o">+</span> <span class="mi">2</span> <span class="k">if</span> <span class="nx">endPad</span> <span class="p">&lt;</span> <span class="mi">2</span> <span class="p">{</span> <span class="k">return</span> <span class="kc">nil</span><span class="p">,</span> <span class="nx">fmt</span><span class="p">.</span><span class="nx">Errorf</span><span class="p">(</span><span class="s">&quot;end of padding not found&quot;</span><span class="p">)</span> <span class="p">}</span> <span class="k">return</span> <span class="nx">m</span><span class="p">[</span><span class="nx">endPad</span><span class="o">+</span><span class="mi">1</span><span class="p">:],</span> <span class="kc">nil</span> <span class="p">}</span> </pre></div> </div> <div class="section" id="digital-signatures-with-rsa"> <h2>Digital signatures with RSA</h2> <p>RSA can also be used to perform <em>digital signatures</em>. Here's how it works:</p> <ol class="arabic simple"> <li>Key generation and distribution remain the same. Alice has a public key and a private key.
She publishes her public key online.</li> <li>When Alice wants to send Bob a message and have Bob be sure that only she could have sent it, she will <em>encrypt</em> the message with her <em>private</em> key, that is <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ad1c0c30bf900657c2a36a6361873f2e8801873f.svg" style="height: 19px;" type="image/svg+xml">S=Sign(M)=M^d\pmod{n}</object>. The signature is attached to the message.</li> <li>When Bob receives a message, he can <em>decrypt</em> the signature with Alice's public key: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/15a4fcf51b7c6984cc437be976d1e1d52e5f749c.svg" style="height: 18px;" type="image/svg+xml">Check(S)=S^e\pmod{n}</object> and if he gets the original message back, the signature was correct.</li> </ol> <p>The correctness proof would be exactly the same as for encryption. No one else could have signed the message, because proper signing would require having the private key of Alice, which only she possesses.</p> <p>This is the textbook signature algorithm. One difference between the practical implementation of signing and encryption is in the padding protocol used. While OAEP is recommended for encryption, <a class="reference external" href="https://en.wikipedia.org/wiki/Probabilistic_signature_scheme">PSS</a> is recommended for signing <a class="footnote-reference" href="#id11" id="id6"></a>. 
I'm not going to implement signing for this post, but the Go standard library has great code for this - for example <tt class="docutils literal">rsa.SignPKCS1v15</tt> and <tt class="docutils literal">rsa.SignPSS</tt>.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id7" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>For two reasons: one is that we don't have to randomly find another large number - this operation takes time; another is that 65537 has only two bits &quot;on&quot; in its binary representation, which makes <a class="reference external" href="http://eli.thegreenplace.net/2009/03/28/efficient-modular-exponentiation-algorithms">modular exponentiation algorithms faster</a>.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>A strong AES key is 256 bits, while RSA is commonly 2048 or more. The reason RSA encrypts a symmetric key is efficiency - RSA encryption is much slower than block ciphers, to the extent that it's often impractical to encrypt large streams of data with it. A hybrid scheme - wherein a strong AES key is first encrypted with RSA, and then AES is used to encrypt large data - is very common. 
This is the general idea behind what TLS and similar secure protocols use.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id9" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>Note that the first 8 bits of the data block are 0, which makes it easy to ensure that the number we encrypt is smaller than <em>n</em>.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id5"></a></td><td>The stdlib implementation is resilient to common kinds of side-channel attacks; for example, it uses algorithms whose run time is independent of certain characteristics of the input, which makes timing attacks less feasible.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id11" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id6"></a></td><td>The reason for a different protocol is that the attacks on encrypted messages and on signatures tend to be different. For example, while for encrypted messages it's unthinkable to let attackers know any characteristics of the original message (the <em>base</em> in the exponentiation), in signing it's usually plainly available.</td></tr> </tbody> </table> </div> The Chinese Remainder Theorem2019-08-28T06:00:00-07:002019-08-28T06:00:00-07:00Eli Benderskytag:eli.thegreenplace.net,2019-08-28:/2019/the-chinese-remainder-theorem/
<p>The Chinese Remainder Theorem (CRT) is very useful in cryptography and other domains. According <a class="reference external" href="https://en.wikipedia.org/wiki/Chinese_remainder_theorem">to Wikipedia</a>, its origin and name come from this riddle in a 3rd century book by a Chinese mathematician:</p> <blockquote> There are certain things whose number is unknown. If we count them by threes, we have two left over; by fives, we have three left over; and by sevens, two are left over. How many things are there?</blockquote> <p>Mathematically, this is a system of linear congruences. In this post we'll go through a simple proof of the <em>existence</em> of a solution. It also demonstrates how to find such a solution, though check the Wikipedia link for a discussion of different methods and their relative efficiency.</p> <p>We'll start with a few prerequisite lemmas needed to prove the CRT.
You may want to skip them on first reading and refer back when going through the CRT proof.</p> <div class="section" id="prerequisites"> <h2>Prerequisites</h2> <p><strong>Lemma 1</strong>: if <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1a4cf51ff825f4ff0afd531c6a8c9860d6d51896.svg" style="height: 18px;" type="image/svg+xml">d|ab</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/05f83d8097b6d8ae83319bf25a53212cc97d48c2.svg" style="height: 18px;" type="image/svg+xml">(d,a)=1</object>, then <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7e30fae9eb28807eec8a567db43b8396a79d881a.svg" style="height: 18px;" type="image/svg+xml">d|b</object>.</p> <p><strong>Proof</strong>: Since <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/05f83d8097b6d8ae83319bf25a53212cc97d48c2.svg" style="height: 18px;" type="image/svg+xml">(d,a)=1</object> we know from <a class="reference external" href="http://eli.thegreenplace.net/2009/07/10/the-gcd-and-linear-combinations">Bézout's identity</a> that there exist integers <em>x</em> and <em>y</em> such that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1f8e45ed33065085f62b878d9fdf9151d0f757e0.svg" style="height: 17px;" type="image/svg+xml">dx+ay=1</object>. Multiplying both sides by <em>b</em>, we get: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8141e2352af9dea6ef77b76a12fde7161a8e070e.svg" style="height: 17px;" type="image/svg+xml">bdx+bay=b</object>. <em>bdx</em> is divisible by <em>d</em>, and so is <em>bay</em> because <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1a4cf51ff825f4ff0afd531c6a8c9860d6d51896.svg" style="height: 18px;" type="image/svg+xml">d|ab</object>. 
Therefore <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7e30fae9eb28807eec8a567db43b8396a79d881a.svg" style="height: 18px;" type="image/svg+xml">d|b</object> ∎</p> <p><strong>Lemma 2</strong>: if <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/146c2b5df7fcf337ad748394da7a872aac087af3.svg" style="height: 18px;" type="image/svg+xml">ac\equiv bc \pmod{m}</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8f4b6819405398557257b9fb77c9ad496616c65b.svg" style="height: 18px;" type="image/svg+xml">(c,m)=1</object>, then <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2daa6d2924d8b164c04f2f9f0723cf966dfab7f8.svg" style="height: 18px;" type="image/svg+xml">a\equiv b \pmod{m}</object>.</p> <p><strong>Proof</strong>: Since <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/146c2b5df7fcf337ad748394da7a872aac087af3.svg" style="height: 18px;" type="image/svg+xml">ac\equiv bc \pmod{m}</object>, we know that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/68c33cb136d7bb3306ffb221c197c2094a4cec62.svg" style="height: 18px;" type="image/svg+xml">m|(ac-bc)</object>, or <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/95d7155cdbe323b74a4aeeb21c788b04d1039388.svg" style="height: 18px;" type="image/svg+xml">m|c(a-b)</object>. Since <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1fcb1e450165a208f3b6af15614e23408bb752de.svg" style="height: 18px;" type="image/svg+xml">(m,c)=1</object> we can use Lemma 1 to conclude that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b3a7cab922796acb6514c2ed260af9948d515ca1.svg" style="height: 18px;" type="image/svg+xml">m|(a-b)</object>. 
In other words, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2daa6d2924d8b164c04f2f9f0723cf966dfab7f8.svg" style="height: 18px;" type="image/svg+xml">a\equiv b \pmod{m}</object> ∎</p> <div class="section" id="modular-multiplicative-inverse"> <h3>Modular multiplicative inverse</h3> <p>A <em>modular multiplicative inverse</em> of an integer <em>a</em> w.r.t. the modulus <em>m</em> is the solution of the linear congruence:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/c7d32701fe73c767ddf59237fe114f1e078e5340.svg" style="height: 18px;" type="image/svg+xml"> $ax\equiv1 \pmod{m}$</object> <p><strong>Lemma 3</strong>: if <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/99dabb93e5e7794513a1a19f669b0084a89ac5d1.svg" style="height: 18px;" type="image/svg+xml">(a,m)=1</object> then <em>a</em> has a unique modular multiplicative inverse modulo <em>m</em>.</p> <p><strong>Proof</strong>: Once again using Bézout's identity, we know from <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/99dabb93e5e7794513a1a19f669b0084a89ac5d1.svg" style="height: 18px;" type="image/svg+xml">(a,m)=1</object> that there exist integers <em>r</em> and <em>s</em> such that <object class="valign-m2" data="https://eli.thegreenplace.net/images/math/f311ad65574a8e08313b2fc884d8d8e196cb3f7e.svg" style="height: 14px;" type="image/svg+xml">ar+ms=1</object>. Therefore <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/93cb307eb473ec8df78477dd1a6221646ed3015d.svg" style="height: 13px;" type="image/svg+xml">ar-1</object> is a multiple of <em>m</em>, or <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/eb337aabcef266f520fd462aecb901dfb0ebd7fb.svg" style="height: 18px;" type="image/svg+xml">ar\equiv 1\pmod{m}</object>. So <em>r</em> is a multiplicative inverse of <em>a</em>.</p> <p>Now let's see why this inverse is unique. 
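Before moving on, an aside: the existence argument is constructive, since the extended Euclidean algorithm computes the Bézout coefficients, and therefore the inverse itself. Here is a minimal Python sketch (the function names are my own, for illustration):

```python
def extended_gcd(a, b):
    """Returns (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    # From b*x + (a % b)*y == g it follows that
    # a*y + b*(x - (a // b)*y) == g.
    return g, y, x - (a // b) * y

def modular_inverse(a, m):
    """The multiplicative inverse of a modulo m; requires (a, m) = 1."""
    g, r, _ = extended_gcd(a, m)
    assert g == 1, 'a and m must be coprime'
    return r % m

print(modular_inverse(3, 7))  # 5, since 3*5 = 15 ≡ 1 (mod 7)
```

Back to uniqueness.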
Let's assume there are two inverses, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/83b3fdda5b127e3a4f9bcb7b45d2fa7ef3659493.svg" style="height: 12px;" type="image/svg+xml">r_1</object> and <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/1f7e755308eb8efb09a75b7dbdc677c0b60074bd.svg" style="height: 11px;" type="image/svg+xml">r_2</object>, so <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/275749cbe3134f270d053cdddab253d7a64c940a.svg" style="height: 18px;" type="image/svg+xml">ar_1\equiv 1\pmod{m}</object> and also <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c7e4092c117d6ae06f011e7361eb69f02e41c8b8.svg" style="height: 18px;" type="image/svg+xml">ar_2\equiv 1\pmod{m}</object>, which means that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/35145e09c1438e9b639b99a751186ed4cc9f4bbb.svg" style="height: 18px;" type="image/svg+xml">ar_1\equiv ar_2\pmod{m}</object>.</p> <p>Since <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/99dabb93e5e7794513a1a19f669b0084a89ac5d1.svg" style="height: 18px;" type="image/svg+xml">(a,m)=1</object> we can apply Lemma 2 to conclude that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/5d5f4c8f44d461ef1b62559ddd71fb5061c3e8d1.svg" style="height: 18px;" type="image/svg+xml">r_1\equiv r_2\pmod{m}</object> ∎</p> </div> <div class="section" id="factorization-and-multiplying-moduli"> <h3>Factorization and multiplying moduli</h3> <p><strong>Lemma 4</strong>: if <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/5ad940bf1ab4dd63281bb98110d931c7f009f95f.svg" style="height: 18px;" type="image/svg+xml">a|n</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/6ab771b7f5fd62895faddd69134e55b5c58ac511.svg" style="height: 18px;" type="image/svg+xml">b|n</object> and <object class="valign-m4" 
data="https://eli.thegreenplace.net/images/math/3c4a3e293dc3b869630ed4dc0a5c7e5acba8d35e.svg" style="height: 18px;" type="image/svg+xml">(a,b)=1</object> then also <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/6d405fb50602a9bb0c6960c894ca148ab210f222.svg" style="height: 18px;" type="image/svg+xml">ab|n</object>.</p> <p><strong>Proof</strong>: Consider the prime factorization of <em>n</em>. <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/5ad940bf1ab4dd63281bb98110d931c7f009f95f.svg" style="height: 18px;" type="image/svg+xml">a|n</object> so <em>a</em> is a product of some subset of these prime factors. The same can be said about <em>b</em>. But <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3c4a3e293dc3b869630ed4dc0a5c7e5acba8d35e.svg" style="height: 18px;" type="image/svg+xml">(a,b)=1</object>, so <em>a</em> and <em>b</em> don't have any prime factors in common. Therefore the prime factors of <object class="valign-0" data="https://eli.thegreenplace.net/images/math/da23614e02469a0d7c7bd1bdab5c9c474b1904dc.svg" style="height: 13px;" type="image/svg+xml">ab</object> are also a subset of the prime factors of <em>n</em>, and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/6d405fb50602a9bb0c6960c894ca148ab210f222.svg" style="height: 18px;" type="image/svg+xml">ab|n</object> ∎</p> </div> </div> <div class="section" id="id1"> <h2>The Chinese Remainder Theorem</h2> <p>Assume <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3095109bb55e0b34ecac71d33040fa004bfdfc7d.svg" style="height: 12px;" type="image/svg+xml">n_1,\dots,n_k</object> are positive integers, pairwise coprime; that is, for any <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/eebecf421c4d33eeab4a0c4da6c20ed8d49e6c6c.svg" style="height: 17px;" type="image/svg+xml">i\neq j</object>, <object class="valign-m6"
data="https://eli.thegreenplace.net/images/math/b2820249e9aae164709f509d84fa260d142ee148.svg" style="height: 20px;" type="image/svg+xml">(n_i,n_j)=1</object>. Let <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1399d8e50dc6eadcb3e40b623a13734e492a60f7.svg" style="height: 12px;" type="image/svg+xml">a_1,\dots,a_k</object> be arbitrary integers. The system of congruences with an unknown <em>x</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/93757b2beff989fc02e09cd045a9b4ff5959c8c6.svg" style="height: 82px;" type="image/svg+xml"> \begin{align*} x &amp;\equiv a_1 \pmod{n_1} \\ &amp;\vdots \\ x &amp;\equiv a_k \pmod{n_k} \end{align*}</object> <p>has a single solution modulo the product <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9ee0288589a80853a3cb1ede6482968e0d126e93.svg" style="height: 16px;" type="image/svg+xml">N=n_1\times n_2\times \cdots \times n_k</object>.</p> <p><strong>Proof</strong>: Let <object class="valign-m8" data="https://eli.thegreenplace.net/images/math/962577ced41b97a773cca5462d4d68b22375bd8d.svg" style="height: 24px;" type="image/svg+xml">N_k=\frac{N}{n_k}</object>. 
Then <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8878d4a84555010f6794370163c9f5c2a1865a93.svg" style="height: 18px;" type="image/svg+xml">(N_k,n_k)=1</object>, so each <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/8f94afd90555960e1ac40d2908475e16922594bc.svg" style="height: 15px;" type="image/svg+xml">N_k</object> has a unique multiplicative inverse modulo <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/b1d70855b10553d5c5a4d03b4018211bcf0114c8.svg" style="height: 11px;" type="image/svg+xml">n_k</object> per Lemma 3 above; let's call this inverse <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/bf042bd7a5b6ab321af6ac1dbba45dd3cba86d40.svg" style="height: 19px;" type="image/svg+xml">N&#x27;_k</object>. Now consider:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/97aca0d663e04fe8ebfadcd87053758dad9b08af.svg" style="height: 21px;" type="image/svg+xml"> $x=a_1 N_1 N&#x27;_1+a_2 N_2 N&#x27;_2+\cdots +a_k N_k N&#x27;_k$</object> <p><object class="valign-m3" data="https://eli.thegreenplace.net/images/math/8f94afd90555960e1ac40d2908475e16922594bc.svg" style="height: 15px;" type="image/svg+xml">N_k</object> is a multiple of every <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/5b05dd3722f57cd7ac250228f9a1aaf3af86311d.svg" style="height: 11px;" type="image/svg+xml">n_i</object> except for <object class="valign-0" data="https://eli.thegreenplace.net/images/math/f4b7e42a4b8c52f40eb9458e68e81c74d70c1c61.svg" style="height: 13px;" type="image/svg+xml">i=k</object>. 
In other words, for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b778002ef6ea962f1ebd964f044ff0bb2f7b5503.svg" style="height: 17px;" type="image/svg+xml">i\neq k</object> we have <em>N<sub>i</sub></em> ≡ 0 (mod <em>n<sub>k</sub></em>). On the other hand, for <object class="valign-0" data="https://eli.thegreenplace.net/images/math/f4b7e42a4b8c52f40eb9458e68e81c74d70c1c61.svg" style="height: 13px;" type="image/svg+xml">i=k</object> we have, by construction, <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/71448798bc2f36eadf4c3d0a7c123d77fff9c828.svg" style="height: 19px;" type="image/svg+xml">N_i N&#x27;_i\equiv 1\pmod{n_i}</object>. So for each <em>k</em> we have:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/be7613a4bea5400e35bbe2ea728c447b42f0a8b5.svg" style="height: 20px;" type="image/svg+xml"> $x\equiv a_k N_k N&#x27;_k \equiv a_k \pmod{n_k}$</object> <p>because all the other terms in the sum vanish modulo <em>n<sub>k</sub></em>. Hence <em>x</em> satisfies every congruence in the system.</p> <p>To prove that <em>x</em> is unique modulo <em>N</em>, let's assume there are two solutions: <em>x</em> and <em>y</em>. Both solutions to the CRT should satisfy <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/47cfde2f44ca022ba766fcc25301922bbdadd91b.svg" style="height: 18px;" type="image/svg+xml">x\equiv y\equiv a_k\pmod{n_k}</object>. Therefore <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/aace079d3f57b531e8cd699df5595629fbd6cd72.svg" style="height: 12px;" type="image/svg+xml">x-y</object> is a multiple of <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/b1d70855b10553d5c5a4d03b4018211bcf0114c8.svg" style="height: 11px;" type="image/svg+xml">n_k</object> for each <em>k</em>. 
Since these <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/b1d70855b10553d5c5a4d03b4018211bcf0114c8.svg" style="height: 11px;" type="image/svg+xml">n_k</object> are pairwise coprime, from Lemma 4 we know that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/aace079d3f57b531e8cd699df5595629fbd6cd72.svg" style="height: 12px;" type="image/svg+xml">x-y</object> is also a multiple of N, or <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/beac92d520cda137c1af24245b956a17792abdc0.svg" style="height: 18px;" type="image/svg+xml">x\equiv y\pmod{N}</object> ∎</p> <div class="section" id="corollary"> <h3>Corollary</h3> <p>If <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3095109bb55e0b34ecac71d33040fa004bfdfc7d.svg" style="height: 12px;" type="image/svg+xml">n_1,\dots,n_k</object> are pairwise coprime and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9ee0288589a80853a3cb1ede6482968e0d126e93.svg" style="height: 16px;" type="image/svg+xml">N=n_1\times n_2\times \cdots \times n_k</object>, then for all integers <em>x</em> and <em>a</em>, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/824b662cf71dd9ea955a952ddc6d7ed9131d12b3.svg" style="height: 18px;" type="image/svg+xml">x\equiv a\pmod{n_i}</object> for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b1fc066bab1cdf156a4603ce180645c09bc992f5.svg" style="height: 17px;" type="image/svg+xml">i=1,2,\dots,k</object> if and only if <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/85363e9928d3b9de1d3d4a7a182a3b899ffe60fa.svg" style="height: 18px;" type="image/svg+xml">x\equiv a\pmod{N}</object>.</p> <p><strong>Proof</strong>: we'll start with the <em>if</em> direction. 
If <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/85363e9928d3b9de1d3d4a7a182a3b899ffe60fa.svg" style="height: 18px;" type="image/svg+xml">x\equiv a\pmod{N}</object> this means <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c68b089bbf91d70e490912848f13893de8b23a59.svg" style="height: 18px;" type="image/svg+xml">N|(x-a)</object>. But that immediately means that for each <em>i</em>, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3cdb745a255300e442dcef60a12cf40caa411571.svg" style="height: 18px;" type="image/svg+xml">n_i|(x-a)</object> as well, or <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/824b662cf71dd9ea955a952ddc6d7ed9131d12b3.svg" style="height: 18px;" type="image/svg+xml">x\equiv a\pmod{n_i}</object>.</p> <p>Now the <em>only if</em> direction. Given <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/824b662cf71dd9ea955a952ddc6d7ed9131d12b3.svg" style="height: 18px;" type="image/svg+xml">x\equiv a\pmod{n_i}</object> for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b1fc066bab1cdf156a4603ce180645c09bc992f5.svg" style="height: 17px;" type="image/svg+xml">i=1,2,\dots,k</object>, we can invoke the CRT using <em>a</em> in all congruences. The CRT tells us this system has a single solution modulo <object class="valign-0" data="https://eli.thegreenplace.net/images/math/b51a60734da64be0e618bacbea2865a8a7dcd669.svg" style="height: 12px;" type="image/svg+xml">N</object>. But we know that <em>a</em> is a solution, so it has to be the only one ∎</p> </div> </div> Unification2018-11-12T05:49:00-08:002018-11-12T05:49:00-08:00Eli Benderskytag:eli.thegreenplace.net,2018-11-12:/2018/unification/<p>In logic and computer science, unification is a process of automatically solving equations between symbolic terms. Unification has several interesting applications, notably in logic programming and <a class="reference external" href="https://eli.thegreenplace.net/2018/type-inference/">type inference</a>. In this post I want to present the basic unification algorithm with a complete implementation.</p> <p>Let's start with some terminology. We'll be using <em>terms</em> built from constants, variables and function applications:</p> <ul class="simple"> <li>A lowercase letter represents a constant (could be any kind of constant, like an integer or a string)</li> <li>An uppercase letter represents a variable</li> <li><tt class="docutils literal"><span class="pre">f(...)</span></tt> is an application of function <tt class="docutils literal">f</tt> to some parameters, which are <em>terms</em> themselves</li> </ul> <p>This representation is borrowed from <a class="reference external" href="https://en.wikipedia.org/wiki/First-order_logic">first-order logic</a> and is also used in the Prolog programming language. 
Some examples:</p> <ul class="simple"> <li><tt class="docutils literal">V</tt>: a single variable term</li> <li><tt class="docutils literal">foo(V, k)</tt>: function <tt class="docutils literal">foo</tt> applied to variable V and constant k</li> <li><tt class="docutils literal">foo(bar(k), baz(V))</tt>: a nested function application</li> </ul> <div class="section" id="pattern-matching"> <h2>Pattern matching</h2> <p>Unification can be seen as a generalization of <em>pattern matching</em>, so let's start with that.</p> <p>We're given a constant term and a pattern term. The pattern term has variables. Pattern matching is the problem of finding a variable assignment that will make the two terms match. For example:</p> <ul class="simple"> <li>Constant term: <tt class="docutils literal">f(a, b, bar(t))</tt></li> <li>Pattern term: <tt class="docutils literal">f(a, V, X)</tt></li> </ul> <p>Trivially, the assignment <tt class="docutils literal">V=b</tt> and <tt class="docutils literal">X=bar(t)</tt> works here. Such an assignment is also called a <em>substitution</em>: it maps variables to their assigned values. In a less trivial case, variables can appear multiple times in a pattern:</p> <ul class="simple"> <li>Constant term: <tt class="docutils literal">f(top(a), a, <span class="pre">g(top(a)),</span> t)</tt></li> <li>Pattern term: <tt class="docutils literal">f(V, a, g(V), t)</tt></li> </ul> <p>Here the right substitution is <tt class="docutils literal">V=top(a)</tt>.</p> <p>Sometimes, no valid substitutions exist. 
If we change the constant term in the last example to <tt class="docutils literal">f(top(b), a, <span class="pre">g(top(a)),</span> t)</tt>, then there is no valid substitution because V would have to match <tt class="docutils literal">top(b)</tt> and <tt class="docutils literal">top(a)</tt> simultaneously, which is not possible.</p> </div> <div class="section" id="id1"> <h2>Unification</h2> <p>Unification is just like pattern matching, except that both terms can contain variables. So we can no longer say one is the pattern term and the other the constant term. For example:</p> <ul class="simple"> <li>First term: <tt class="docutils literal">f(a, V, bar(D))</tt></li> <li>Second term: <tt class="docutils literal">f(D, k, bar(a))</tt></li> </ul> <p>Given two such terms, finding a variable substitution that will make them equivalent is called <em>unification</em>. In this case the substitution is <tt class="docutils literal">{D=a, V=k}</tt>.</p> <p>Note that there is an infinite number of possible unifiers for some solvable unification problem. For example, given:</p> <ul class="simple"> <li>First term: <tt class="docutils literal">f(X, Y)</tt></li> <li>Second term: <tt class="docutils literal">f(Z, g(X))</tt></li> </ul> <p>We have the substitution <tt class="docutils literal">{X=Z, Y=g(X)}</tt> but also something like <tt class="docutils literal">{X=K, Z=K, Y=g(K)}</tt> and <tt class="docutils literal">{X=j(K), Z=j(K), <span class="pre">Y=g(j(K))}</span></tt> and so on. The first substitution is the simplest one, and also the most general. It's called the <em>most general unifier</em> or <em>mgu</em>. Intuitively, the <em>mgu</em> can be turned into any other unifier by performing another substitution. For example <tt class="docutils literal">{X=Z, Y=g(X)}</tt> can be turned into <tt class="docutils literal">{X=j(K), Z=j(K), <span class="pre">Y=g(j(K))}</span></tt> by applying the substitution <tt class="docutils literal">{Z=j(K)}</tt> to it. 
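This specialization is easy to check mechanically. The snippet below is my own toy illustration, not part of the post's implementation: it represents terms as plain strings with single uppercase letters as variables, and applies a substitution by repeated string replacement.

```python
def resolve(term, subst):
    # Toy illustration: terms are plain strings and variables are single
    # uppercase letters, so string replacement stands in for substitution.
    # Repeatedly apply all bindings until nothing changes (a fixpoint).
    # Note: a self-referential binding like X=f(X) would loop forever here;
    # ruling that out is the job of the occurs check discussed later.
    prev = None
    while term != prev:
        prev = term
        for var, repl in subst.items():
            term = term.replace(var, repl)
    return term

mgu = {'X': 'Z', 'Y': 'g(X)'}          # the most general unifier
combined = {**mgu, 'Z': 'j(K)'}        # mgu specialized by {Z=j(K)}

print(resolve('f(X, Y)', mgu))         # f(Z, g(Z))
print(resolve('f(X, Y)', combined))    # f(j(K), g(j(K)))
print(resolve('f(Z, g(X))', combined)) # f(j(K), g(j(K)))
```

Under each substitution the two original terms resolve to the same term, which is exactly what makes them unifiers.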
Note that the reverse doesn't work, as we can't turn the second into the first by using a substitution. So we say that <tt class="docutils literal">{X=Z, Y=g(X)}</tt> is the most general unifier for the two given terms, and it's the <em>mgu</em> we want to find.</p> </div> <div class="section" id="an-algorithm-for-unification"> <h2>An algorithm for unification</h2> <p>Solving unification problems may seem simple, but there are a number of subtle corner cases to be aware of. In his 1991 paper <a class="reference external" href="https://www.semanticscholar.org/paper/Correcting-a-Widespread-Error-in-Unification-Norvig/95af3dc93c2e69b2c739a9098c3428a49e54e1b6">Correcting a Widespread Error in Unification Algorithms</a>, Peter Norvig noted a common error that exists in many books presenting the algorithm, including SICP.</p> <p>The correct algorithm is based on J.A. Robinson's 1965 paper &quot;A machine-oriented logic based on the resolution principle&quot;. More efficient algorithms have been developed over time since it was first published, but our focus here will be on correctness and simplicity rather than performance.</p> <p>The following implementation is based on Norvig's, and the full code (with tests) is <a class="reference external" href="https://github.com/eliben/code-for-blog/blob/master/2018/unif/unifier.py">available on Github</a>. This implementation uses Python 3, while Norvig's original is in Common Lisp. There's a slight difference in representations too, as Norvig uses the Lisp-y <tt class="docutils literal">(f X Y)</tt> syntax to denote an application of function <tt class="docutils literal">f</tt>. The two representations are isomorphic, and I'm picking the more classical one which is used in most papers on the subject. 
In any case, if you're interested in the more Lisp-y version, I have some Clojure <a class="reference external" href="https://github.com/eliben/paip-in-clojure/tree/master/src/paip/11_logic">code online</a> that ports Norvig's implementation more directly.</p> <p>We'll start by defining the data structure for terms:</p> <div class="highlight"><pre><span></span><span class="k">class</span> <span class="nc">Term</span><span class="p">:</span> <span class="k">pass</span> <span class="k">class</span> <span class="nc">App</span><span class="p">(</span><span class="n">Term</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">fname</span><span class="p">,</span> <span class="n">args</span><span class="o">=</span><span class="p">()):</span> <span class="bp">self</span><span class="o">.</span><span class="n">fname</span> <span class="o">=</span> <span class="n">fname</span> <span class="bp">self</span><span class="o">.</span><span class="n">args</span> <span class="o">=</span> <span class="n">args</span> <span class="c1"># Not shown here: __str__ and __eq__, see full code for the details...</span> <span class="k">class</span> <span class="nc">Var</span><span class="p">(</span><span class="n">Term</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">name</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">name</span> <span class="k">class</span> <span class="nc">Const</span><span class="p">(</span><span class="n">Term</span><span class="p">):</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">value</span><span 
class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span> </pre></div> <p>An <tt class="docutils literal">App</tt> represents the application of function <tt class="docutils literal">fname</tt> to a sequence of arguments.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">unify</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">subst</span><span class="p">):</span> <span class="sd">&quot;&quot;&quot;Unifies term x and y with initial subst.</span> <span class="sd"> Returns a subst (map of name-&gt;term) that unifies x and y, or None if</span> <span class="sd"> they can&#39;t be unified. Pass subst={} if no subst are initially</span> <span class="sd"> known. Note that {} means valid (but empty) subst.</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="k">if</span> <span class="n">subst</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span> <span class="k">return</span> <span class="bp">None</span> <span class="k">elif</span> <span class="n">x</span> <span class="o">==</span> <span class="n">y</span><span class="p">:</span> <span class="k">return</span> <span class="n">subst</span> <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">Var</span><span class="p">):</span> <span class="k">return</span> <span class="n">unify_variable</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">subst</span><span class="p">)</span> <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">Var</span><span class="p">):</span> <span 
class="k">return</span> <span class="n">unify_variable</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">subst</span><span class="p">)</span> <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">App</span><span class="p">)</span> <span class="ow">and</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">App</span><span class="p">):</span> <span class="k">if</span> <span class="n">x</span><span class="o">.</span><span class="n">fname</span> <span class="o">!=</span> <span class="n">y</span><span class="o">.</span><span class="n">fname</span> <span class="ow">or</span> <span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">args</span><span class="p">)</span> <span class="o">!=</span> <span class="nb">len</span><span class="p">(</span><span class="n">y</span><span class="o">.</span><span class="n">args</span><span class="p">):</span> <span class="k">return</span> <span class="bp">None</span> <span class="k">else</span><span class="p">:</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">args</span><span class="p">)):</span> <span class="n">subst</span> <span class="o">=</span> <span class="n">unify</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">args</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">y</span><span class="o">.</span><span class="n">args</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span 
class="n">subst</span><span class="p">)</span> <span class="k">return</span> <span class="n">subst</span> <span class="k">else</span><span class="p">:</span> <span class="k">return</span> <span class="bp">None</span> </pre></div> <p><tt class="docutils literal">unify</tt> is the main function driving the algorithm. It looks for a <em>substitution</em>, which is a Python dict mapping variable names to terms. When either side is a variable, it calls <tt class="docutils literal">unify_variable</tt> which is shown next. Otherwise, if both sides are function applications, it ensures they apply the same function (otherwise there's no match) and then unifies their arguments one by one, carefully carrying the updated substitution throughout the process.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">unify_variable</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">subst</span><span class="p">):</span> <span class="sd">&quot;&quot;&quot;Unifies variable v with term x, using subst.</span> <span class="sd"> Returns updated subst or None on failure.</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">Var</span><span class="p">)</span> <span class="k">if</span> <span class="n">v</span><span class="o">.</span><span class="n">name</span> <span class="ow">in</span> <span class="n">subst</span><span class="p">:</span> <span class="k">return</span> <span class="n">unify</span><span class="p">(</span><span class="n">subst</span><span class="p">[</span><span class="n">v</span><span class="o">.</span><span class="n">name</span><span class="p">],</span> <span class="n">x</span><span class="p">,</span> <span class="n">subst</span><span class="p">)</span> <span class="k">elif</span> <span 
class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">Var</span><span class="p">)</span> <span class="ow">and</span> <span class="n">x</span><span class="o">.</span><span class="n">name</span> <span class="ow">in</span> <span class="n">subst</span><span class="p">:</span> <span class="k">return</span> <span class="n">unify</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">subst</span><span class="p">[</span><span class="n">x</span><span class="o">.</span><span class="n">name</span><span class="p">],</span> <span class="n">subst</span><span class="p">)</span> <span class="k">elif</span> <span class="n">occurs_check</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">subst</span><span class="p">):</span> <span class="k">return</span> <span class="bp">None</span> <span class="k">else</span><span class="p">:</span> <span class="c1"># v is not yet in subst and can&#39;t simplify x. Extend subst.</span> <span class="k">return</span> <span class="p">{</span><span class="o">**</span><span class="n">subst</span><span class="p">,</span> <span class="n">v</span><span class="o">.</span><span class="n">name</span><span class="p">:</span> <span class="n">x</span><span class="p">}</span> </pre></div> <p>The key idea here is recursive unification. If <tt class="docutils literal">v</tt> is bound in the substitution, we try to unify its definition with <tt class="docutils literal">x</tt> to guarantee consistency throughout the unification process (and vice versa when <tt class="docutils literal">x</tt> is a variable). There's another function being used here - <tt class="docutils literal">occurs_check</tt>; I'm retaining its classical name from early presentations of unification. 
Its goal is to guarantee that we don't have self-referential variable bindings like <tt class="docutils literal">X=f(X)</tt> that would lead to potentially infinite unifiers.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">occurs_check</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">term</span><span class="p">,</span> <span class="n">subst</span><span class="p">):</span> <span class="sd">&quot;&quot;&quot;Does the variable v occur anywhere inside term?</span> <span class="sd"> Variables in term are looked up in subst and the check is applied</span> <span class="sd"> recursively.</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="k">assert</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">Var</span><span class="p">)</span> <span class="k">if</span> <span class="n">v</span> <span class="o">==</span> <span class="n">term</span><span class="p">:</span> <span class="k">return</span> <span class="bp">True</span> <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">term</span><span class="p">,</span> <span class="n">Var</span><span class="p">)</span> <span class="ow">and</span> <span class="n">term</span><span class="o">.</span><span class="n">name</span> <span class="ow">in</span> <span class="n">subst</span><span class="p">:</span> <span class="k">return</span> <span class="n">occurs_check</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">subst</span><span class="p">[</span><span class="n">term</span><span class="o">.</span><span class="n">name</span><span class="p">],</span> <span class="n">subst</span><span class="p">)</span> <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">term</span><span class="p">,</span> <span class="n">App</span><span 
class="p">):</span> <span class="k">return</span> <span class="nb">any</span><span class="p">(</span><span class="n">occurs_check</span><span class="p">(</span><span class="n">v</span><span class="p">,</span> <span class="n">arg</span><span class="p">,</span> <span class="n">subst</span><span class="p">)</span> <span class="k">for</span> <span class="n">arg</span> <span class="ow">in</span> <span class="n">term</span><span class="o">.</span><span class="n">args</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="k">return</span> <span class="bp">False</span> </pre></div> <p>Let's see how this code handles some of the unification examples discussed earlier in the post. Starting with the pattern matching example, where variables are just on one side:</p> <div class="highlight"><pre><span></span>&gt;&gt;&gt; unify(parse_term(&#39;f(a, b, bar(t))&#39;), parse_term(&#39;f(a, V, X)&#39;), {}) {&#39;V&#39;: b, &#39;X&#39;: bar(t)} </pre></div> <p>Now the examples from the <em>Unification</em> section:</p> <div class="highlight"><pre><span></span>&gt;&gt;&gt; unify(parse_term(&#39;f(a, V, bar(D))&#39;), parse_term(&#39;f(D, k, bar(a))&#39;), {}) {&#39;D&#39;: a, &#39;V&#39;: k} &gt;&gt;&gt; unify(parse_term(&#39;f(X, Y)&#39;), parse_term(&#39;f(Z, g(X))&#39;), {}) {&#39;X&#39;: Z, &#39;Y&#39;: g(X)} </pre></div> <p>Next, let's try one where unification fails due to two conflicting definitions of the variable X.</p> <div class="highlight"><pre><span></span>&gt;&gt;&gt; unify(parse_term(&#39;f(X, Y, X)&#39;), parse_term(&#39;f(r, g(X), p)&#39;), {}) None </pre></div> <p>Lastly, it's instructive to trace through the execution of the algorithm for a non-trivial unification to see how it works.
Let's unify the terms <tt class="docutils literal"><span class="pre">f(X,h(X),Y,g(Y))</span></tt> and <tt class="docutils literal"><span class="pre">f(g(Z),W,Z,X)</span></tt>:</p> <ul class="simple"> <li><tt class="docutils literal">unify</tt> is called, sees the root is an <tt class="docutils literal">App</tt> of function <tt class="docutils literal">f</tt> and loops over the arguments.<ul> <li><tt class="docutils literal">unify(X, g(Z))</tt> invokes <tt class="docutils literal">unify_variable</tt> because <tt class="docutils literal">X</tt> is a variable, and the result is augmenting subst with <tt class="docutils literal">X=g(Z)</tt></li> <li><tt class="docutils literal">unify(h(X), W)</tt> invokes <tt class="docutils literal">unify_variable</tt> because <tt class="docutils literal">W</tt> is a variable, so the subst grows to <tt class="docutils literal">{X=g(Z), W=h(X)}</tt></li> <li><tt class="docutils literal">unify(Y, Z)</tt> invokes <tt class="docutils literal">unify_variable</tt>; since neither <tt class="docutils literal">Y</tt> nor <tt class="docutils literal">Z</tt> are in subst yet, the subst grows to <tt class="docutils literal">{X=g(Z), W=h(X), Y=Z}</tt> (note that the binding between two variables is arbitrary; <tt class="docutils literal">Z=Y</tt> would be equivalent)</li> <li><tt class="docutils literal">unify(g(Y), X)</tt> invokes <tt class="docutils literal">unify_variable</tt>; here things get more interesting, because <tt class="docutils literal">X</tt> is already in the subst, so now we call <tt class="docutils literal">unify</tt> on <tt class="docutils literal">g(Y)</tt> and <tt class="docutils literal">g(Z)</tt> (what <tt class="docutils literal">X</tt> is bound to)<ul> <li>The functions match for both terms (<tt class="docutils literal">g</tt>), so there's another loop over arguments, this time only for unifying <tt class="docutils literal">Y</tt> and <tt class="docutils literal">Z</tt></li> <li><tt class="docutils 
literal">unify_variable</tt> for <tt class="docutils literal">Y</tt> and <tt class="docutils literal">Z</tt> leads to lookup of <tt class="docutils literal">Y</tt> in the subst and then <tt class="docutils literal">unify(Z, Z)</tt>, which returns the unmodified subst; the result is that nothing new is added to the subst, but the unification of <tt class="docutils literal">g(Y)</tt> and <tt class="docutils literal">g(Z)</tt> succeeds, because it agrees with the existing bindings in subst</li> </ul> </li> </ul> </li> <li>The final result is <tt class="docutils literal">{X=g(Z), W=h(X), Y=Z}</tt></li> </ul> </div> <div class="section" id="efficiency"> <h2>Efficiency</h2> <p>The algorithm presented here is not particularly efficient, and when dealing with large unification problems it's wise to consider more advanced options. It copies subst around too much, and repeats work needlessly because it makes no attempt to cache terms that have already been unified.</p> <p>For a good overview of the efficiency of unification algorithms, I recommend checking out two papers:</p> <ul class="simple"> <li>&quot;An Efficient Unification Algorithm&quot; by Martelli and Montanari</li> <li>&quot;Unification: A Multidisciplinary Survey&quot; by Kevin Knight</li> </ul> </div> Partial and Total Orders2018-10-01T06:01:00-07:002018-10-01T06:01:00-07:00Eli Benderskytag:eli.thegreenplace.net,2018-10-01:/2018/partial-and-total-orders/<p>Imagine a set of 2D rectangles of different sizes; let's assume for the sake of simplicity that no two rectangles in this set have <em>exactly</em> the same size.
Here is a sample set:</p> <img alt="Five boxes of different sizes" class="align-center" src="https://eli.thegreenplace.net/images/2018/boxes-order.png" /> <p>We'll say that box X <strong>fits</strong> inside box Y if we could physically enclose X inside Y …</p><p>Imagine a set of 2D rectangles of different sizes; let's assume for the sake of simplicity that no two rectangles in this set have <em>exactly</em> the same size. Here is a sample set:</p> <img alt="Five boxes of different sizes" class="align-center" src="https://eli.thegreenplace.net/images/2018/boxes-order.png" /> <p>We'll say that box X <strong>fits</strong> inside box Y if we could physically enclose X inside Y; in other words, if Y's dimensions are larger than X's. In this example:</p> <ul class="simple"> <li>Box A can fit inside box B, but not the other way around</li> <li>E can fit inside all other boxes, but no other box can fit inside it</li> <li>A, B, D, E can fit inside C, which itself cannot fit in any of the other boxes</li> <li>D cannot fit inside A or B; neither can A or B fit inside D</li> </ul> <p>As we're going to see soon, in this case &quot;fits&quot; is a <em>partial order</em> on a set of 2D rectangular boxes, because even though we can order some of the boxes relative to each other, some other pairs of boxes have no relative order among themselves (for example A and D).</p> <p>If all pairs of boxes in this set had relative ordering - for example, consider the set without box D - we could define a <em>total order</em> on the set. 
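The &quot;fits&quot; relation is easy to express as a predicate in code. A minimal Python sketch follows; the box dimensions are invented to mirror the relationships in the drawing, not the actual sizes:

```python
# A box is a (width, height) pair. X "fits" inside Y iff both of X's
# dimensions are strictly smaller than Y's. Sizes below are illustrative
# guesses chosen to reproduce the relationships described above.
def fits(x, y):
    return x[0] < y[0] and x[1] < y[1]

A, B, C, D, E = (3, 5), (4, 6), (9, 8), (6, 2), (1, 1)

print(fits(A, B), fits(B, A))  # True False: A fits in B, not vice versa
print(fits(A, D), fits(D, A))  # False False: A and D are incomparable
print(all(fits(box, C) for box in (A, B, D, E)))  # True: all fit inside C
```

Pairs like A and D, for which the predicate is false in both directions, are exactly what makes this ordering partial rather than total.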
Another example for this is a set of 2D <em>squares</em> (rather than rectangles); as long as all the squares in the set have unique sizes <a class="footnote-reference" href="#id3" id="id1"></a>, we can always define a total order on them because for any pair of squares either the first can fit in the second, or vice versa.</p> <div class="section" id="mathematical-definition-of-relations"> <h2>Mathematical definition of relations</h2> <p>To develop a mathematically sound approach to ordering, we'll have to dip our feet into set theory and <em>relations</em>. We'll only be talking about binary relations here.</p> <p>Given a set A, a <em>relation on A</em> is a set of pairs with elements taken from A. A bit more rigorously, given that <object class="valign-0" data="https://eli.thegreenplace.net/images/math/bc659bc638626217264a2aa7a0cca55c0cc40ddc.svg" style="height: 12px;" type="image/svg+xml">A\times A</object> is the set containing all possible ordered pairs taken from A (a.k.a. the <em>Cartesian</em> product of A), then R is a relation on A if it's a subset of <object class="valign-0" data="https://eli.thegreenplace.net/images/math/bc659bc638626217264a2aa7a0cca55c0cc40ddc.svg" style="height: 12px;" type="image/svg+xml">A\times A</object>, or <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/6e8b00b1044916a1bd2a6a15ca276c60b4687b15.svg" style="height: 15px;" type="image/svg+xml">R\subseteq A\times A</object>.</p> <p>For example, given the set <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/45bcabcc09a2d7523e72b58250c14b1d1038c22a.svg" style="height: 18px;" type="image/svg+xml">A=\{1,2,3\}</object>, then:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/212e13c924f24ce8e2115145e3169ac4ffbd0a4a.svg" style="height: 19px;" type="image/svg+xml"> $A\times 
A=\{\left(1,1\right),\left(1,2\right),\left(1,3\right),\left(2,1\right),\left(2,2\right),\left(2,3\right),\left(3,1\right),\left(3,2\right),\left(3,3\right)\}$</object> <p>Note that we explicitly defined the pairs to be <em>ordered</em>, meaning that (1,2) and (2,1) are two distinct elements in this set.</p> <p>By definition, any subset of <object class="valign-0" data="https://eli.thegreenplace.net/images/math/bc659bc638626217264a2aa7a0cca55c0cc40ddc.svg" style="height: 12px;" type="image/svg+xml">A\times A</object> is a relation on A. For example <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/fe939ac9c1402f39b62f0176b30bbda8cb269aa9.svg" style="height: 19px;" type="image/svg+xml">R=\{\left(1,1\right),\left(2,2\right),\left(3,3\right)\}</object>. In programming, we often use the term <em>predicate</em> to express a similar idea. A predicate is a function with a binary outcome, and the correspondence to relations is trivial - we just say that all pairs belonging to the relation satisfy the predicate, and vice versa. If we defined a predicate <tt class="docutils literal">R(x,y)</tt> to be true if and only if <tt class="docutils literal"><span class="pre">x==y</span></tt>, we'd get the relation above.</p> <p>A shortcut notation that will make definitions cleaner: we say <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2a3189528858697b5fc631eaf27ea7cc5a0a0c00.svg" style="height: 16px;" type="image/svg+xml">xRy</object> when <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3e97fb0a8391d7acb982ec45e84661d02fcb09dd.svg" style="height: 18px;" type="image/svg+xml">\left(x,y\right)\in R</object>. In our example set 1R1, 2R2 and 3R3. This notation is a bit awkward, but it's the accepted standard in math; therefore I'm using it for consistency with other sources.</p> <p>Besides, it becomes nicer when R is an operator. 
If we redefine R as <tt class="docutils literal">==</tt>, it becomes more natural: <tt class="docutils literal"><span class="pre">1==1</span></tt>, <tt class="docutils literal"><span class="pre">2==2</span></tt>, <tt class="docutils literal"><span class="pre">3==3</span></tt>. The equality relation is a perfectly valid relation on a set - its elements are all the pairs where both members are the same value.</p> </div> <div class="section" id="properties-of-relations"> <h2>Properties of relations</h2> <p>There are a number of useful properties relations could have. Here are just a few that we'll need for the rest of the article; for a longer list, see the <a class="reference external" href="https://en.wikipedia.org/wiki/Binary_relation">Wikipedia page</a>.</p> <p><strong>Reflexive</strong>: every element in the set is related to itself, or <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/962581a9498b2d84cd3587621102a61e32e70f77.svg" style="height: 17px;" type="image/svg+xml">\forall x\in A, xRx</object>. The <tt class="docutils literal">==</tt> relation shown above is reflexive.</p> <p><strong>Irreflexive</strong>: no element in the set is related to itself, or <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/796bac36bef12c06c76254d8eec1a27251a1cb47.svg" style="height: 17px;" type="image/svg+xml">\neg\exists x\in A, xRx</object>. For example if we define the <tt class="docutils literal">&lt;</tt> less than relation on numbers, it's irreflexive since no number is less than itself. In our boxes example, the &quot;fits in&quot; relation is irreflexive because no box can fit inside itself.</p> <p><strong>Transitive</strong>: intuitively, &quot;if x fits inside y, and y fits inside z, then x fits inside z&quot;. 
Mathematically <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/50b9110f993774dd55916f5caf85f15c01fa100e.svg" style="height: 18px;" type="image/svg+xml">\forall x,y,z \in A, \left(xRy \wedge yRz \right )\rightarrow xRz</object>. The <tt class="docutils literal">&lt;</tt> relation on numbers is obviously transitive.</p> <p><strong>Symmetric</strong>: if x is related to y, then y is related to x. This might sound obvious with the colloquial meaning of &quot;related&quot;, but not in the mathematical sense. Most relations we deal with aren't symmetric. The definition is <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/210765ef171f3c2c0765b6ac8c0f9321776afc0b.svg" style="height: 17px;" type="image/svg+xml">\forall x,y \in A, xRy \rightarrow yRx</object>. For example, the relation <tt class="docutils literal">==</tt> is symmetric, but <tt class="docutils literal">&lt;</tt> is not symmetric.</p> <p><strong>Antisymmetric</strong>: if x is related to y, then y is <em>not</em> related to x unless x and y are the same element; mathematically <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/128db3f8c70b2d33ff35ced497a688f9d9db8706.svg" style="height: 18px;" type="image/svg+xml">\forall x,y \in A, \left(xRy \wedge yRx \right ) \rightarrow x=y</object>. For example, the relation <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/60fd4c42f3956e697cf94397160a51086fbb6f5b.svg" style="height: 15px;" type="image/svg+xml">\le</object> (less than or equal) is antisymmetric; if <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9671bd6c8271173673b6deb89be8ab5c4fb98511.svg" style="height: 16px;" type="image/svg+xml">x \le y</object> and also <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1f6f56272ca54db1a574f2402e651dae340f18a5.svg" style="height: 16px;" type="image/svg+xml">y \le x</object> then it must be that x and y are the same number. 
The relation <tt class="docutils literal">&lt;</tt> is also antisymmetric, though only in an empty sense: no pair x and y satisfies the left side of the definition, so the implication holds; in logic such a statement is said to be <em>vacuously true</em>.</p> </div> <div class="section" id="partial-order"> <h2>Partial order</h2> <p>There are two kinds of partial orders we can define - <em>weak</em> and <em>strong</em>. The <em>weak</em> partial order is the more common one, so let's start with that. Whenever I'm saying just &quot;partial order&quot;, I'll mean a weak partial order.</p> <p>A <em>weak partial order</em> (a.k.a. <em>non-strict</em>) is a relation on a set A that is reflexive, transitive and antisymmetric. The <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/60fd4c42f3956e697cf94397160a51086fbb6f5b.svg" style="height: 15px;" type="image/svg+xml">\le</object> relation on numbers is a classical example:</p> <ul class="simple"> <li>It is reflexive because for any number x we have <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/3f43bb5a3905f424fae578127d51f13208cd264a.svg" style="height: 15px;" type="image/svg+xml">x\le x</object></li> <li>It is transitive because given <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a4e6f960762caa31e09625ed52234681f6abad1e.svg" style="height: 16px;" type="image/svg+xml">x\le y</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/bdee2aefcb32e8811b5950ad3fb6410888d2e955.svg" style="height: 16px;" type="image/svg+xml">y\le z</object>, we know that <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/8a8f22b672c4ddf78e9a95b4bad6927d0037e980.svg" style="height: 15px;" type="image/svg+xml">x\le z</object></li> <li>It is antisymmetric because given <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a4e6f960762caa31e09625ed52234681f6abad1e.svg" style="height: 16px;" type="image/svg+xml">x\le
y</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1f6f56272ca54db1a574f2402e651dae340f18a5.svg" style="height: 16px;" type="image/svg+xml">y \le x</object>, we know that x and y are the same number</li> </ul> <p>A <em>strong partial order</em> (a.k.a. <em>strict</em>) is a relation on a set A that is irreflexive, transitive and antisymmetric. The difference between weak and strong partial orders is reflexivity. In weak partial orders, every element is related to itself; in strong partial orders, no element is related to itself. The operator &lt; on numbers is an example of a strict partial order, since it satisfies all the properties; while <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/60fd4c42f3956e697cf94397160a51086fbb6f5b.svg" style="height: 15px;" type="image/svg+xml">\le</object> is reflexive, &lt; is irreflexive.</p> <p>Our rectangular boxes with the &quot;fits&quot; relation are a good example for distinguishing between the two. We can only define a <em>strong</em> partial order on them, because a box cannot fit inside itself.</p> <p>Another good example is a morning dressing routine. The set of clothes to wear is {underwear, pants, jacket, shirt, left sock, right sock, left shoe, right shoe}, and the relation is &quot;has to be worn before&quot;. The following drawing encodes the relation:</p> <img alt="Partial ordering of dressing different clothes; what comes before what" class="align-center" src="https://eli.thegreenplace.net/images/2018/dressing-partial-order.png" /> <p>This kind of drawing is called a <a class="reference external" href="https://en.wikipedia.org/wiki/Hasse_diagram">Hasse diagram</a>, which is useful to graphically represent partially ordered sets <a class="footnote-reference" href="#id4" id="id2"></a>; the arrow represents the relation.
For example, the arrow from &quot;pants&quot; to &quot;left shoe&quot; encodes that pants have to be worn before the left shoe.</p> <p>Note that this relation is irreflexive, because it's meaningless to say that &quot;pants have to be worn before wearing pants&quot;. Therefore, the relation defines a <em>strong</em> partial order on the set.</p> <p>Similarly to the rectangular boxes example, the partial order here lets us order only some of the elements in the set w.r.t. each other. Some elements like socks and a shirt don't have an order defined.</p> </div> <div class="section" id="total-order"> <h2>Total order</h2> <p>A total order is a partial order that has one additional property - any two elements in the set should be related. Mathematically:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/33471be0a98a586cd93417bb3c47c3c8e210c01a.svg" style="height: 18px;" type="image/svg+xml"> $\forall x\in A\forall y\in A, \left(xRy \vee yRx \right )$</object> <p>While a partial order lets us order <em>some</em> elements in a set w.r.t. each other, a total order requires us to be able to order <em>all</em> elements in a set. In the boxes example, we can't define a total order for rectangular boxes (there is no &quot;fits in&quot; relation between boxes A and D, no matter which way we try). We <em>can</em> define a total order between square boxes, however, as long as their sizes are unique.</p> <p>Neither can we define a total order for the dressing diagram shown above, because we can't say either &quot;left socks have to be worn before shirts&quot; or &quot;shirts have to be worn before left socks&quot;.</p> </div> <div class="section" id="examples-from-programming"> <h2>Examples from programming</h2> <p>Partial and total orders frequently come up in programming, especially when thinking about sorts. Sorting an array usually implies finding some <em>total order</em> on its elements. Tie breaking is important, but not always possible.
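A quick Python sketch of sorting without tie breaking (the sample records are invented for illustration):

```python
# Sorting records by a single key yields only a weak ordering: records that
# share a key are indistinguishable to the comparison. Python's built-in
# sort is stable, so such ties keep their original relative order.
records = [('banana', 3), ('apple', 2), ('cherry', 3), ('date', 2)]

# Sort by the numeric field only; 'apple'/'date' tie on 2,
# 'banana'/'cherry' tie on 3.
by_count = sorted(records, key=lambda r: r[1])

print(by_count)
# [('apple', 2), ('date', 2), ('banana', 3), ('cherry', 3)]
```

Under this key, a non-stable sort would be free to emit the tied records in either order.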
If there is no way to tell two elements apart, we cannot mathematically come up with a total order, but we can still sort (and we do have a weak partial order). This is where the distinction between regular and <a class="reference external" href="https://en.cppreference.com/w/cpp/algorithm/stable_sort">stable sorts</a> comes in.</p> <p>Sometimes we're sorting non-linear structures, like dependency graphs in the dressing example from above. In these cases a total order is impossible, but we do have a partial order, which can be used to find a &quot;valid&quot; dressing order - a linear sequence of dressing steps that wouldn't violate any constraints. This can be done with <a class="reference external" href="http://eli.thegreenplace.net/2015/directed-graph-traversal-orderings-and-applications-to-data-flow-analysis/">topological sorting</a>, which finds a valid &quot;linearization&quot; of the dependency graph.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id3" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>You may notice that saying &quot;unique&quot; when talking about sets can sound superfluous; after all, sets are defined to have distinct elements. That said, it's not clear what &quot;distinct&quot; means. In our case, distinct can refer to the complete identities of the boxes; for example, two boxes can have the exact same dimensions but different colors - so they are not the same as far as the set is concerned. Moreover, in programming, identity is even more nuanced and can be defined for specific types in specific ways.
For these reasons I'm going to call out uniqueness explicitly to avoid confusion.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>A <em>partially ordered set with R</em> (or <em>poset with R</em>) is a set with a relation R that is a partial order on it.</td></tr> </tbody> </table> </div> Minimal character-based LSTM implementation2018-06-07T05:34:00-07:002018-06-07T05:34:00-07:00Eli Benderskytag:eli.thegreenplace.net,2018-06-07:/2018/minimal-character-based-lstm-implementation/<p>Following up on <a class="reference external" href="https://eli.thegreenplace.net/2018/understanding-how-to-implement-a-character-based-rnn-language-model/">the earlier post</a> deciphering a minimal vanilla RNN implementation, here I'd like to extend the example to a simple LSTM model.</p> <p>Once again, the idea is to combine a well-commented code sample (<a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/min-char-rnn/min-char-lstm.py">available here</a>) with some high-level diagrams and math to enable someone to fully understand the …</p><p>Following up on <a class="reference external" href="https://eli.thegreenplace.net/2018/understanding-how-to-implement-a-character-based-rnn-language-model/">the earlier post</a> deciphering a minimal vanilla RNN implementation, here I'd like to extend the example to a simple LSTM model.</p> <p>Once again, the idea is to combine a well-commented code sample (<a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/min-char-rnn/min-char-lstm.py">available here</a>) with some high-level diagrams and math to enable someone to fully understand the code. 
The LSTM architecture presented herein is the standard one originating from Hochreiter and Schmidhuber's <a class="reference external" href="https://www.google.com/search?q=lstm+hochreiter">1997 paper</a>. It's described pretty much everywhere; <a class="reference external" href="http://colah.github.io/posts/2015-08-Understanding-LSTMs/">Chris Olah's post</a> has particularly nice diagrams and is worth reading.</p> <div class="section" id="lstm-cell-structure"> <h2>LSTM cell structure</h2> <p>From 30,000 feet, LSTMs look just like regular RNNs; there's a &quot;cell&quot; that has a recurrent connection (output tied to input), and when trained this cell is usually unrolled to some fixed length.</p> <p>So we can take the basic RNN structure from the <a class="reference external" href="https://eli.thegreenplace.net/2018/understanding-how-to-implement-a-character-based-rnn-language-model">previous post</a>:</p> <img alt="Basic RNN diagram" class="align-center" src="https://eli.thegreenplace.net/images/2018/rnnbasic.png" /> <p>LSTMs are a bit trickier because there are two recurrent connections; these can be &quot;packed&quot; into a single vector <em>h</em>, so the above diagram still applies. Here's how an LSTM cell looks inside:</p> <img alt="LSTM cell" class="align-center" src="https://eli.thegreenplace.net/images/2018/lstm-cell.png" /> <p><em>x</em> is the input; <em>p</em> holds the probabilities computed from the output <em>y</em> (these symbols are named consistently with my earlier RNN post) and exits the cell at the bottom purely for topological convenience. The two memory vectors are <em>h</em> and <em>c</em> - as mentioned earlier, they could be combined into a single vector, but are shown here separately for clarity.</p> <p>The main idea of LSTMs is to enable training of longer sequences by providing a &quot;fast-path&quot; to back-propagate information farther down in memory. Hence the <em>c</em> vector is not multiplied by any matrices on its path.
The circle-in-circle block means element-wise multiplication of two vectors; plus-in-square is element-wise addition. The funny Greek letter is the Sigmoid non-linearity:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/8b0db8368e8a617143fa6566f42c1e47cd833c9c.svg" style="height: 38px;" type="image/svg+xml"> $\sigma(x) =\frac{1}{1+e^{-x}}$</object> <p>The only other block we haven't seen in the vanilla RNN diagram is the colon-in-square in the bottom-left corner; this is simply the concatenation of <em>h</em> and <em>x</em> into a single column vector. In addition, I've combined the &quot;multiply by matrix <em>W</em>, then add bias <em>b</em>&quot; operation into a single rectangular box to save on precious diagram space.</p> <p>Here are the equations computed by a cell:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/c2cc966ba7ce8075317b87885bc9c432aafe2dba.svg" style="height: 249px;" type="image/svg+xml"> \begin{align*} xh&amp;=x^{[t]}:h^{[t-1]}\\ f&amp;=\sigma(W_f\cdot xh+b_f)\\ i&amp;=\sigma(W_i\cdot xh+b_i)\\ o&amp;=\sigma(W_o\cdot xh+b_o)\\ cc&amp;=tanh(W_{cc}\cdot xh+b_{cc})\\ c^{[t]}&amp;=c^{[t-1]}\odot f +cc\odot i\\ h^{[t]}&amp;=tanh(c^{[t]})\odot o\\ y^{[t]}&amp;=W_{y}\cdot h^{[t]}+b_y\\ p^{[t]}&amp;=softmax(y^{[t]})\\ \end{align*}</object> </div> <div class="section" id="backpropagating-through-an-lstm-cell"> <h2>Backpropagating through an LSTM cell</h2> <p>This works <em>exactly</em> like backprop through a vanilla RNN; we have to carefully compute how the gradient flows through every node and make sure we properly combine gradients at fork points. Most of the elements in the LSTM diagram are familiar from the <a class="reference external" href="https://eli.thegreenplace.net/2018/understanding-how-to-implement-a-character-based-rnn-language-model">previous post</a>.
Let's briefly work through the new ones.</p> <p>First, the Sigmoid function; it's an elementwise function, and computing its derivative is very similar to the <em>tanh</em> function discussed in the previous post. As usual, given <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/e9ef6bd037537d5fe08743736acadccc09e70b06.svg" style="height: 18px;" type="image/svg+xml">f=\sigma(k)</object>, from the chain rule we have the following derivative w.r.t. some weight <em>w</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/57e3f2cab3c9b46a03d763a2f73b83963a1cd500.svg" style="height: 39px;" type="image/svg+xml"> $\frac{\partial f}{\partial w}=\frac{\partial \sigma(k)}{\partial k}\frac{\partial k}{\partial w}$</object> <p>To compute the derivative <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/8aa59f2f536b727cf97239b345ddcc98e41c2c91.svg" style="height: 26px;" type="image/svg+xml">\frac{\partial \sigma(k)}{\partial k}</object>, we'll use the ratio-derivative formula:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/9e006cf5e9f1f8ccac82ba1f2bcdabd710731756.svg" style="height: 42px;" type="image/svg+xml"> $(\frac{f}{g})&#x27;=\frac{f&#x27;g-g&#x27;f}{g^2}$</object> <p>So:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/e3f7af782f52215e8326b389271709a440993984.svg" style="height: 44px;" type="image/svg+xml"> $\sigma &#x27;(k)=\frac{e^{-k}}{(1+e^{-k})^2}$</object> <p>A clever way to express this is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/eb1953be928287ff01ae23dfb4ff1cb2290854c9.svg" style="height: 20px;" type="image/svg+xml"> $\sigma &#x27;(k)=\sigma(k)(1-\sigma(k))$</object> <p>Going back to the chain rule with <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/e9ef6bd037537d5fe08743736acadccc09e70b06.svg" style="height: 18px;"
type="image/svg+xml">f=\sigma(k)</object>, we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/885829ecab969c96daed7f0df6e5864339ad9d8b.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial f}{\partial w}=f(1-f)\frac{\partial k}{\partial w}$</object> <p>The other new operation we'll have to find the derivative of is element-wise multiplication. Let's say we have the column vectors <em>x</em>, <em>y</em> and <em>z</em>, each with <em>m</em> rows, and we have <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/660b1e0dacc15aa3737b8170c3ecfdcbc6e77db4.svg" style="height: 18px;" type="image/svg+xml">z(x)=x\odot y</object>. Since <em>z</em> as a function of <em>x</em> has <em>m</em> inputs and <em>m</em> outputs, its Jacobian has dimensions [m,m].</p> <p><object class="valign-m6" data="https://eli.thegreenplace.net/images/math/0ab96cb4e5d8c6ba3ac8038fda07d518bbe1f388.svg" style="height: 18px;" type="image/svg+xml">D_{j}z_{i}</object> is the derivative of the i-th element of <em>z</em> w.r.t. the j-th element of <em>x</em>. 
For <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/660b1e0dacc15aa3737b8170c3ecfdcbc6e77db4.svg" style="height: 18px;" type="image/svg+xml">z(x)=x\odot y</object> this is non-zero only when <em>i</em> and <em>j</em> are equal, and in that case the derivative is <img alt="y_i" class="valign-m4" src="https://eli.thegreenplace.net/images/math/35c2ac2f82d0ff8f9011b596ed7e54bfcc55f471.png" style="height: 12px;" />.</p> <p>Therefore, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/e6631f3b13f877a8bb7b3b6a0c0d2ca110ecce23.svg" style="height: 18px;" type="image/svg+xml">Dz(x)</object> is a square matrix with the elements of <em>y</em> on the diagonal and zeros elsewhere:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2450b2e2a827054f5d292822ff292eaa63c77d1b.svg" style="height: 97px;" type="image/svg+xml"> $Dz=\begin{bmatrix} y_1 &amp; 0 &amp; \cdots &amp; 0 \\ 0 &amp; y_2 &amp; \cdots &amp; 0 \\ \vdots &amp; \ddots &amp; \ddots &amp; \vdots \\ 0 &amp; 0 &amp; \cdots &amp; y_m \\ \end{bmatrix}$</object> <p>If we want to backprop some loss <em>L</em> through this function, we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/48b17da284ae52bc4b9fdeb7b98b73f398bd4458.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial L}{\partial x}=\frac{\partial L}{\partial z}Dz$</object> <p>As <em>x</em> has <em>m</em> elements, the right-hand side of this equation multiplies a [1,m] vector by a [m,m] matrix which is diagonal, resulting in element-wise multiplication with the matrix's diagonal elements. 
In other words:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/e2a6c0742fb006e35e3001d3b3d33f78316fb1e8.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial L}{\partial x}=\frac{\partial L}{\partial z}\odot y$</object> <p>In code, it looks like this:</p> <div class="highlight"><pre><span></span><span class="c1"># Assuming dz is the gradient of loss w.r.t. z; dz, y and dx are all</span> <span class="c1"># column vectors.</span> <span class="n">dx</span> <span class="o">=</span> <span class="n">dz</span> <span class="o">*</span> <span class="n">y</span> </pre></div> </div> <div class="section" id="model-quality"> <h2>Model quality</h2> <p>In the <a class="reference external" href="https://eli.thegreenplace.net/2018/understanding-how-to-implement-a-character-based-rnn-language-model/">post about min-char-rnn</a>, we've seen that the vanilla RNN generates fairly low-quality text:</p> <blockquote> one, my dred, roriny. qued bamp gond hilves non froange saws, to mold his a work, you shirs larcs anverver strepule thunboler muste, thum and cormed sightourd so was rewa her besee pilman</blockquote> <p>The LSTM's generated text quality is somewhat better when trained with roughly the same hyper-parameters:</p> <blockquote> the she, over is was besiving the fact to seramed for i said over he will round, such when a where, &quot;i went of where stood it at eye heardul rrawed only coside the showed had off with the refaurtoned</blockquote> <p>I'm fairly sure that it can be made to perform even better with larger memory vectors and more training data. That said, an even more advanced architecture can be helpful here.
Moreover, since this is a <em>character</em>-based model, to really capture effects between words a few words apart we'll need a much deeper LSTM (since I'm unrolling to 16 characters, we can only capture 2-3 words), and hence much more training data and time.</p> <p>Once again, the goal here is not to develop a state-of-the-art language model, but to show a simple, comprehensible example of how an LSTM is implemented end-to-end in Python code. <a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/min-char-rnn/min-char-lstm.py">The full code is here</a> - please let me know if you find any issues with it or something still remains unclear.</p> </div> Understanding how to implement a character-based RNN language model2018-05-25T05:20:00-07:002018-05-25T05:20:00-07:00Eli Benderskytag:eli.thegreenplace.net,2018-05-25:/2018/understanding-how-to-implement-a-character-based-rnn-language-model/<p>In <a class="reference external" href="https://gist.github.com/karpathy/d4dee566867f8291f086">a single gist</a>, <a class="reference external" href="https://cs.stanford.edu/people/karpathy/">Andrej Karpathy</a> did something truly impressive. In a little over 100 lines of Python - without relying on any heavy-weight machine learning frameworks - he presents a fairly complete implementation of training a character-based recurrent neural network (RNN) language model; this includes the full backpropagation learning with Adagrad …</p><p>In <a class="reference external" href="https://gist.github.com/karpathy/d4dee566867f8291f086">a single gist</a>, <a class="reference external" href="https://cs.stanford.edu/people/karpathy/">Andrej Karpathy</a> did something truly impressive.
In a little over 100 lines of Python - without relying on any heavy-weight machine learning frameworks - he presents a fairly complete implementation of training a character-based recurrent neural network (RNN) language model; this includes the full backpropagation learning with Adagrad optimization.</p> <p>I love such minimal examples because they allow me to understand some topic in full depth, connecting the math to the code and having a complete picture of how everything works. In this post I want to present a companion explanation to Karpathy's gist, showing the diagrams and math that hide in its Python code.</p> <p>My own fork of the code <a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/min-char-rnn/min-char-rnn.py">is here</a>; it's semantically equivalent to Karpathy's gist, but includes many more comments and some debugging options. I won't reproduce the whole program here; instead, the idea is that you'd go through the code while reading this article. The diagrams, formulae and explanations here are complementary to the code comments.</p> <div class="section" id="what-rnns-do"> <h2>What RNNs do</h2> <p>I expect readers to have a basic idea of what RNNs do and why they work well for some problems. RNNs are well-suited for problem domains where the input (and/or output) is some sort of a sequence - time-series financial data, words or sentences in natural language, speech, etc.</p> <p>There is <em>a lot</em> of material about this online, and the basics are easy to understand for anyone with even a bit of machine learning background.
However, there is not enough coherent material online about how RNNs are implemented and trained - this is the goal of this post.</p> </div> <div class="section" id="character-based-rnn-language-model"> <h2>Character-based RNN language model</h2> <p>The basic structure of <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> is represented by this recurrent diagram, where <em>x</em> is the input vector (at time step <em>t</em>), <em>y</em> is the output vector and <em>h</em> is the <em>state vector</em> kept inside the model.</p> <img alt="Basic RNN diagram" class="align-center" src="https://eli.thegreenplace.net/images/2018/rnnbasic.png" /> <p>The line leaving and returning to the cell represents that the state is retained between invocations of the network. When a new time step arrives, some things are still the same (the weights inherent to the network, as we shall soon see) but some things are different - <em>h</em> may have changed. Therefore, unlike stateless NNs, <em>y</em> is not simply a function of <em>x</em>; in RNNs, identical <em>x</em>s can produce different <em>y</em>s, because <em>y</em> is a function of <em>x</em> and <em>h</em>, and <em>h</em> can change between steps.</p> <p>The <em>character-based</em> part of the model's name means that every input vector represents a single character (as opposed to, say, a word or part of an image). <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> uses one-hot vectors to represent different characters.</p> <p>A <em>language model</em> is a particular kind of machine learning algorithm that learns the statistical structure of language by &quot;reading&quot; a large corpus of text. 
This model can then reproduce authentic language segments - by predicting the next character (or word, for word-based models) based on past characters.</p> </div> <div class="section" id="internal-structure-of-the-rnn-cell"> <h2>Internal structure of the RNN cell</h2> <p>Let's proceed by looking into the internal structure of the RNN cell in <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt>:</p> <img alt="RNN cell for min-char-rnn" class="align-center" src="https://eli.thegreenplace.net/images/2018/min-char-rnn-cell.png" /> <ul class="simple"> <li>Bold-faced symbols in reddish color are the model's parameters, weights for matrix multiplication and biases.</li> <li>The state vector <em>h</em> is shown twice - once for its past value, and once for its currently computed value. Whenever the RNN cell is invoked in sequence, the last computed state <em>h</em> is passed in from the left.</li> <li>In this diagram <em>y</em> is not the final answer of the cell - we compute a softmax function on it to obtain <em>p</em> - the probabilities for output characters <a class="footnote-reference" href="#id7" id="id1"></a>.
I'm using these symbols for consistency with the code of <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt>, though it would probably be more readable to flip the uses of <em>p</em> and <em>y</em> (making <em>y</em> the actual output of the cell).</li> </ul> <p>Mathematically, this cell computes:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/886e94d526c2e538f1ba4414696ae9bf6618f0ff.svg" style="height: 82px;" type="image/svg+xml"> \begin{align*} h^{[t]}&amp;=tanh(W_{hh}\cdot h^{[t-1]}+W_{xh}\cdot x^{[t]}+b_h)\\ y^{[t]}&amp;=W_{hy}\cdot h^{[t]}+b_y\\ p^{[t]}&amp;=softmax(y^{[t]}) \end{align*}</object> </div> <div class="section" id="learning-model-parameters-with-backpropagation"> <h2>Learning model parameters with backpropagation</h2> <p>This section will examine how we can <em>learn</em> the parameters <em>W</em> and <em>b</em> for this model. Mostly it's standard neural-network fare; we'll compute the derivatives of all the steps involved and will then employ backpropagation to find a parameter update based on some computed loss.</p> <p>There's one serious issue we'll have to address first. Backpropagation is usually defined on <em>acyclic</em> graphs, so it's not entirely clear how to apply it to our RNN. Is <em>h</em> an input? An output? Both? In the original high-level diagram of the RNN cell, <em>h</em> is both an input and an output - how can we compute the gradient for it when we don't know its value yet? <a class="footnote-reference" href="#id8" id="id2"></a></p> <p>The way out of this conundrum is to <em>unroll</em> the RNN for a few steps. 
Note that we're already doing this in the detailed diagram by distinguishing between <object class="valign-0" data="https://eli.thegreenplace.net/images/math/057276c060e575533321773afb483e778e6a03f1.svg" style="height: 16px;" type="image/svg+xml">h^{[t]}</object> and <object class="valign-0" data="https://eli.thegreenplace.net/images/math/e4bc0503e20a8e6b82d9c86e10eb2c8e1dfe3471.svg" style="height: 16px;" type="image/svg+xml">h^{[t-1]}</object>. This makes every RNN cell <em>locally acyclic</em>, which makes it possible to use backpropagation on it. This approach has a cool-sounding name - <em>Backpropagation Through Time</em> (BPTT) - although it's really the same as regular backpropagation.</p> <p>Note that the architecture used here is called &quot;synced many-to-many&quot; in Karpathy's <a class="reference external" href="http://karpathy.github.io/2015/05/21/rnn-effectiveness/">Unreasonable Effectiveness of RNNs post</a>, and it's useful for training a simple char-based language model - we immediately observe the output sequence produced by the model while reading the input. 
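As a concrete reminder of what each unrolled step computes, here's the cell's forward pass sketched in numpy (a sketch with variable names of my own choosing, mirroring the math above rather than the exact code of <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt>):

```python
import numpy as np

def rnn_step_forward(x, hprev, Wxh, Whh, Why, bh, by):
    # x is a one-hot column vector; hprev is the state from the previous step.
    h = np.tanh(np.dot(Whh, hprev) + np.dot(Wxh, x) + bh)
    y = np.dot(Why, h) + by
    # Softmax turns the raw scores y into the probability vector p.
    e = np.exp(y - np.max(y))  # shift by max(y) for numerical stability
    p = e / np.sum(e)
    return h, y, p
```

During training, this function would be called once per unrolled step, threading each step's <tt class="docutils literal"><span class="pre">h</span></tt> into the next call.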
Similar unrolling can be applied to other architectures, like encoder-decoder.</p> <p>Here's our RNN again, unrolled for 3 steps:</p> <img alt="Unrolled RNN diagram" class="align-center" src="https://eli.thegreenplace.net/images/2018/rnnunroll.png" /> <p>Now the same diagram, with the gradient flows depicted with orange-ish arrows:</p> <img alt="Unrolled RNN diagram with gradient flow arrows shown" class="align-center" src="https://eli.thegreenplace.net/images/2018/rnnunrollgrad.png" /> <p>With this unrolling, we have everything we need to compute the actual weight updates during learning, because when we want to compute the gradients through step 2, we already have the incoming gradient <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/41bad72882e3d266373df060e8ab3ce36a819679.svg" style="height: 18px;" type="image/svg+xml">\Delta h</object>, and so on.</p> <p>You may now wonder: what is <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/0bdf36986644d54bc1bccf1410d2b9f0f86cf697.svg" style="height: 18px;" type="image/svg+xml">\Delta h[t]</object> for the final step at time <em>t</em>?</p> <p>In some models, sequence lengths are fairly limited. For example, when we translate a single sentence, the sequence length is rarely over a couple dozen words; for such models we can fully unroll the RNN. The <em>h</em> state output of the final step doesn't really &quot;go anywhere&quot;, and we assume its gradient is zero. Similarly, the incoming state <em>h</em> for the first step is zero.</p> <p>Other models work on potentially infinite sequence lengths, or sequences much too long for unrolling. The language model in <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> is a good example, because it can theoretically ingest and emit text of any length. For these models we'll perform <em>truncated</em> BPTT, by just assuming that the influence of the current state extends only <em>N</em> steps into the future.
We'll then unroll the model <em>N</em> times and assume that <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/9851a3637afe3f6d70466ac3a1d1c104935647fd.svg" style="height: 18px;" type="image/svg+xml">\Delta h[N]</object> is zero. Although it really isn't, for a large enough <em>N</em> this is a fairly safe assumption. RNNs are hard to train on very long sequences for other reasons, anyway (we'll touch upon this point again towards the end of the post).</p> <p>Finally, it's important to remember that although we unroll the RNN cells, all parameters (weights, biases) are <em>shared</em>. This plays an important part in ensuring <em>translation invariance</em> for the models - patterns learned in one place apply to another place <a class="footnote-reference" href="#id9" id="id3"></a>. It leaves the question of how to update the weights, since we compute gradients for them separately in each step. The answer is very simple - just add them up. This is similar to other cases where the output of a cell branches off in two directions - when gradients are computed, their values are added up along the branches - this is just the basic chain rule in action.</p> <p>We now have all the necessary background to understand how an RNN learns. 
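Schematically, truncated BPTT with shared parameters can be sketched like this (a simplified sketch of my own; the per-step forward/backward computations are abstracted behind hypothetical callbacks rather than written out as in the real code):

```python
import numpy as np

def truncated_bptt_sketch(xs, h0, params, step_forward, step_backward):
    """Unroll over the sequence xs; per-step parameter gradients are summed."""
    h, caches = h0, []
    for x in xs:                              # forward through the unrolled steps
        h, cache = step_forward(x, h, params)
        caches.append(cache)
    grads = {k: np.zeros_like(v) for k, v in params.items()}
    dh = np.zeros_like(h0)                    # assume the grad of the final h is zero
    for cache in reversed(caches):            # backward pass, last step first
        dh, step_grads = step_backward(cache, dh, params)
        for k, g in step_grads.items():
            grads[k] += g                     # shared parameters: just add them up
    return grads
```

The actual code in <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> follows this overall shape, with the per-step computations spelled out explicitly.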
What remains before looking at the code is figuring out how the gradients propagate <em>inside</em> the cell; in other words, the derivatives of each operation comprising the cell.</p> </div> <div class="section" id="flowing-the-gradient-inside-an-rnn-cell"> <h2>Flowing the gradient inside an RNN cell</h2> <p>As we saw above, the formulae for computing the cell's output from its inputs are:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/886e94d526c2e538f1ba4414696ae9bf6618f0ff.svg" style="height: 82px;" type="image/svg+xml"> \begin{align*} h^{[t]}&amp;=tanh(W_{hh}\cdot h^{[t-1]}+W_{xh}\cdot x^{[t]}+b_h)\\ y^{[t]}&amp;=W_{hy}\cdot h^{[t]}+b_y\\ p^{[t]}&amp;=softmax(y^{[t]}) \end{align*}</object> <p>To be able to learn weights, we have to find the derivatives of the cell's output w.r.t. the weights. The full backpropagation process was explained <a class="reference external" href="http://eli.thegreenplace.net/2016/the-chain-rule-of-calculus/">in this post</a>, so here is only a brief refresher.</p> <p>Recall that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f75a9c33c546d725557a4d452769bfd8fbb6cc22.svg" style="height: 20px;" type="image/svg+xml">p^{[t]}</object> is the predicted output; we compare it with the &quot;real&quot; output (<object class="valign-0" data="https://eli.thegreenplace.net/images/math/e44181afdf5e5f0f8ad4379f7d5f3ff924379c82.svg" style="height: 16px;" type="image/svg+xml">r^{[t]}</object>) during learning, to find the loss (error):</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/788a210d4bab7831a28e0ae7713ff9c1cd5aef12.svg" style="height: 22px;" type="image/svg+xml"> $L=L(p^{[t]}, r^{[t]})$</object> <p>To perform a gradient descent update, we'll need to find <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/c9e2c4ffca9564929c45a5244c7fb064465ab005.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial L}{\partial 
w}</object>, for every weight value <em>w</em>. To do this, we'll have to:</p> <ol class="arabic simple"> <li>Find the &quot;local&quot; gradients for every mathematical operation leading from <em>w</em> to <em>L</em>.</li> <li>Use the chain rule to propagate the error backwards through these local gradients until we find <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/c9e2c4ffca9564929c45a5244c7fb064465ab005.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial L}{\partial w}</object>.</li> </ol> <p>We start by formulating the chain rule to compute <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/c9e2c4ffca9564929c45a5244c7fb064465ab005.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial L}{\partial w}</object>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/45ad1052f6c6b78265143f4d41f2f12f1714ebfb.svg" style="height: 45px;" type="image/svg+xml"> $\frac{\partial L}{\partial w}=\frac{\partial L}{\partial p^{[t]}}\frac{\partial p^{[t]}}{\partial w}$</object> <p>Next comes:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/f7a68f105f4e7483f2781a7bebeaad0ce659bf06.svg" style="height: 45px;" type="image/svg+xml"> $\frac{\partial p^{[t]}}{\partial w}=\frac{\partial softmax}{\partial y^{[t]}}\frac{\partial y^{[t]}}{\partial w}$</object> <p>Let's say the weight <em>w</em> we're interested in is part of <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/5b9174fc1cf8afbecdab52326985d41be6fbc2c8.svg" style="height: 15px;" type="image/svg+xml">W_{hh}</object>, so we have to propagate some more:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/53bcf70971d45064242463ddfad70e3ba6fb0ec9.svg" style="height: 42px;" type="image/svg+xml"> $\frac{\partial y^{[t]}}{\partial w}=\frac{\partial y^{[t]}}{\partial h^{[t]}}\frac{\partial h^{[t]}}{\partial w}$</object> <p>We'll then proceed to propagate 
through the <em>tanh</em> function, bias addition and finally the multiplication by <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/5b9174fc1cf8afbecdab52326985d41be6fbc2c8.svg" style="height: 15px;" type="image/svg+xml">W_{hh}</object>, for which the derivative by <em>w</em> is computed directly without further chaining.</p> <p>Let's now see how to compute all the relevant local gradients.</p> </div> <div class="section" id="cross-entropy-loss-gradient"> <h2>Cross-entropy loss gradient</h2> <p>We'll start with the derivative of the loss function, which is cross-entropy in the <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> model. I went through a detailed derivation of the gradient of softmax followed by cross-entropy in <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative">this post</a>; here is only a brief recap:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/b26f68a12667ba254facf9815252f52ebf2238d9.svg" style="height: 38px;" type="image/svg+xml"> $xent(p,q)=-\sum_{k}p(k)log(q(k))$</object> <p>Re-formulating this for our specific case, the loss is a function of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f75a9c33c546d725557a4d452769bfd8fbb6cc22.svg" style="height: 20px;" type="image/svg+xml">p^{[t]}</object>, assuming the &quot;real&quot; class <em>r</em> is constant for every training example:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/9ff2ef0e3dbe188129b93dddeb12759fdf909bcb.svg" style="height: 39px;" type="image/svg+xml"> $L(p^{[t]})=-\sum_{k}r(k)log(p^{[t]}(k))$</object> <p>Since inputs and outputs to the cell are 1-hot encoded, let's just use <em>r</em> to denote the index where <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/20aea9dc9718c5f2b3e11b3ebec11518202f0af1.svg" style="height: 18px;" 
type="image/svg+xml">r(k)</object> is non-zero. Then the Jacobian of <em>L</em> is only non-zero at index <em>r</em> and its value there is <object class="valign-m11" data="https://eli.thegreenplace.net/images/math/c4efb22a708d798abd641a16679976b8829f500d.svg" style="height: 27px;" type="image/svg+xml">-\frac{1}{p^{[t]}}(r)</object>.</p> </div> <div class="section" id="softmax-gradient"> <h2>Softmax gradient</h2> <p>A detailed computation of the gradient for the softmax function was also presented in <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative">this post</a>. For <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a197fbc6c2f0e9e1d6b4c51c6fca2756927a3055.svg" style="height: 18px;" type="image/svg+xml">S(y)</object> being the softmax of <em>y</em>, the Jacobian is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/87fbe94e6b409d31b512cb7a4581c24907d4dd4a.svg" style="height: 42px;" type="image/svg+xml"> $D_{j}S_{i}=\frac{\partial S_i}{\partial y_j}=S_{i}(\delta_{ij}-S_j)$</object> </div> <div class="section" id="fully-connected-layer-gradient"> <h2>Fully-connected layer gradient</h2> <p>Next on our path backwards is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/ead3bdf11cf41b04164a83008db4a7dd0db5a074.svg" style="height: 24px;" type="image/svg+xml"> $y^{[t]}&amp;=W_{hy}\cdot h^{[t]}+b_y$</object> <p>From my earlier <a class="reference external" href="http://eli.thegreenplace.net/2018/backpropagation-through-a-fully-connected-layer/">post on backpropagating through a fully-connected layer</a>, we know that <object class="valign-m9" data="https://eli.thegreenplace.net/images/math/413d530fbd3e019cc3f49aec6e8f7cb7a8f0c622.svg" style="height: 29px;" type="image/svg+xml">\frac{\partial y^{[t]}}{\partial h^{[t]}}=W_{hy}</object>. 
But that's not all; note that on the forward pass <object class="valign-0" data="https://eli.thegreenplace.net/images/math/057276c060e575533321773afb483e778e6a03f1.svg" style="height: 16px;" type="image/svg+xml">h^{[t]}</object> splits in two - one edge goes into the fully-connected layer, another goes to the next RNN cell as the state. When we backpropagate the loss gradient to <object class="valign-0" data="https://eli.thegreenplace.net/images/math/057276c060e575533321773afb483e778e6a03f1.svg" style="height: 16px;" type="image/svg+xml">h^{[t]}</object>, we have to take both edges into account; more specifically, we have to <em>add</em> the gradients along the two edges. This leads to the following backpropagation equation:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/e7d5afe050e8e2f3f4b867ae4d9eb510fbe2e583.svg" style="height: 45px;" type="image/svg+xml"> $\frac{\partial L}{\partial h^{[t]}} = \frac{\partial y^{[t]}}{\partial h^{[t]}}\frac{\partial L}{\partial y^{[t]}}+\frac{\partial L}{\partial h^{[t+1]}}\frac{\partial h^{[t+1]}}{\partial h^{[t]}} =W_{hy}\cdot \frac{\partial L}{\partial y^{[t]}}+\frac{\partial L}{\partial h^{[t+1]}}\frac{\partial h^{[t+1]}}{\partial h^{[t]}}$</object> <p>In addition, note that this layer already has model parameters that need to be learned - <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/20c1b19b6e71072b92080b2eb00b5b99123cf057.svg" style="height: 18px;" type="image/svg+xml">W_{hy}</object> and <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/9bd872acdafb9ea752b3ba10b2670499cb65469f.svg" style="height: 19px;" type="image/svg+xml">b_y</object> - a &quot;final&quot; destination for backpropagation. 
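In numpy-style code, this backward step could look roughly as follows (a sketch; <tt class="docutils literal"><span class="pre">dy</span></tt> is the loss gradient w.r.t. <em>y</em> at this step and <tt class="docutils literal"><span class="pre">dhnext</span></tt> is the gradient arriving from the next time step - the names are mine, chosen to resemble the code):

```python
import numpy as np

def fc_backward_sketch(dy, h, Why, dhnext):
    # Gradients for the layer's own parameters (dy and h are column vectors).
    dWhy = np.dot(dy, h.T)
    dby = dy
    # h[t] feeds both this layer and the next cell's state input, so the
    # gradients arriving along the two edges are added.
    dh = np.dot(Why.T, dy) + dhnext
    return dWhy, dby, dh
```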
Please refer to my fully-connected layer backpropagation post to see how the gradients for these are computed.</p> </div> <div class="section" id="gradient-of-tanh"> <h2>Gradient of tanh</h2> <p>The vector <object class="valign-0" data="https://eli.thegreenplace.net/images/math/057276c060e575533321773afb483e778e6a03f1.svg" style="height: 16px;" type="image/svg+xml">h^{[t]}</object> is produced by applying a hyperbolic tangent nonlinearity to another fully connected layer.</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4ce55619f9cd4d96083ec3dadf303cc83a426543.svg" style="height: 22px;" type="image/svg+xml"> $h^{[t]}&amp;=tanh(W_{hh}\cdot h^{[t-1]}+W_{xh}\cdot x^{[t]}+b_h)$</object> <p>To get to the model parameters <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/5b9174fc1cf8afbecdab52326985d41be6fbc2c8.svg" style="height: 15px;" type="image/svg+xml">W_{hh}</object>, <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/4ee22236c608ad6f49adc4807465b6e6896092ec.svg" style="height: 15px;" type="image/svg+xml">W_{xh}</object> and <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/14e5d8599f43750d0cf9dda2d90b085c69079049.svg" style="height: 16px;" type="image/svg+xml">b_h</object>, we have to first backpropagate the loss gradient through <em>tanh</em>. 
<em>tanh</em> is a scalar function; when it's applied to a vector we apply it in <em>element-wise</em> fashion to every element in the vector independently, and collect the results in a similarly-shaped result vector.</p> <p>Its mathematical definition is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/326a49518bfe326c6be2de37838971407fa5175d.svg" style="height: 39px;" type="image/svg+xml"> $tanh(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}$</object> <p>To find the derivative of this function, we'll use the quotient rule for derivatives:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/9e006cf5e9f1f8ccac82ba1f2bcdabd710731756.svg" style="height: 42px;" type="image/svg+xml"> $(\frac{f}{g})&#x27;=\frac{f&#x27;g-g&#x27;f}{g^2}$</object> <p>So:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/dc73b540394a92b823ec3eaabe4d02a7735f146f.svg" style="height: 43px;" type="image/svg+xml"> $tanh&#x27;(x)=\frac{(e^x+e^{-x})(e^x+e^{-x})-(e^x-e^{-x})(e^x-e^{-x})}{(e^x+e^{-x})^2}=1-(tanh(x))^2$</object> <p>Just like for softmax, it turns out that there's a convenient way to express the derivative of <em>tanh</em> in terms of <em>tanh</em> itself. When we apply the chain rule to derivatives of <em>tanh</em> - for example, to <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7f748d5017b1817f6d3912d339e85871b81d93b4.svg" style="height: 18px;" type="image/svg+xml">h=tanh(k)</object> where <em>k</em> is a function of <em>w</em> - we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/9137818273fdbac7d5dc0e05df4fcf3f8cb7ea9d.svg" style="height: 39px;" type="image/svg+xml"> $\frac{\partial h}{\partial w}=\frac{\partial tanh(k)}{\partial k}\frac{\partial k}{\partial w}=(1-h^2)\frac{\partial k}{\partial w}$</object> <p>In our case <em>k(w)</em> is a fully-connected layer; to find its derivatives w.r.t.
the weight matrices and bias, please refer to the <a class="reference external" href="http://eli.thegreenplace.net/2018/backpropagation-through-a-fully-connected-layer/">backpropagation through a fully-connected layer post</a>.</p> </div> <div class="section" id="learning-model-parameters-with-adagrad"> <h2>Learning model parameters with Adagrad</h2> <p>We've just gone through all the major parts of the RNN cell and computed local gradients. Armed with these formulae and the chain rule, it should be possible to understand how the <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> code flows the loss gradient backwards. But that's not the end of the story; once we have the loss derivatives w.r.t. some model parameter, how do we update this parameter?</p> <p>The most straightforward way to do this would be using the gradient descent algorithm, with some constant learning rate. <a class="reference external" href="http://eli.thegreenplace.net/2016/understanding-gradient-descent/">I've written about gradient descent</a> in the past - please take a look for a refresher.</p> <p>Most real-world learning is done with more advanced algorithms these days, however. One such algorithm is called Adagrad, <a class="reference external" href="http://jmlr.org/papers/v12/duchi11a.html">proposed in 2011</a> by some experts in mathematical optimization. <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> happens to use Adagrad, so here is a simplified explanation of how it works.</p> <p>The main idea is to adjust the learning rate separately per parameter, because in practice some parameters change much more often than others.
This could be due to rare examples in the training data set that affect a parameter that's not often affected; we'd like to amplify these changes because they are rare, and dampen changes to parameters that change often.</p> <p>Therefore the Adagrad algorithm works as follows:</p> <div class="highlight"><pre><span></span><span class="c1"># Same shape as the parameter array x</span> <span class="n">memory</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">while</span> <span class="bp">True</span><span class="p">:</span> <span class="n">dx</span> <span class="o">=</span> <span class="n">compute_grad</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="c1"># Elementwise: each memory element gets the corresponding dx^2 added to it.</span> <span class="n">memory</span> <span class="o">+=</span> <span class="n">dx</span> <span class="o">*</span> <span class="n">dx</span> <span class="c1"># The actual parameter update for this step. Note how the learning rate is</span> <span class="c1"># modified by the memory. 
epsilon is some very small number to avoid dividing</span> <span class="c1"># by 0.</span> <span class="n">x</span> <span class="o">-=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">dx</span> <span class="o">/</span> <span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">memory</span><span class="p">)</span> <span class="o">+</span> <span class="n">epsilon</span><span class="p">)</span> </pre></div> <p>If a given element in <tt class="docutils literal">dx</tt> was updated significantly in the past, its corresponding <tt class="docutils literal">memory</tt> element will grow and thus the learning rate is effectively decreased.</p> </div> <div class="section" id="gradient-clipping"> <h2>Gradient clipping</h2> <p>If we unroll the RNN cell 10 times, the gradient will be multiplied by <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/5b9174fc1cf8afbecdab52326985d41be6fbc2c8.svg" style="height: 15px;" type="image/svg+xml">W_{hh}</object> ten times on its way from the last cell to the first. For some structures of <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/5b9174fc1cf8afbecdab52326985d41be6fbc2c8.svg" style="height: 15px;" type="image/svg+xml">W_{hh}</object>, this may lead to an &quot;exploding gradient&quot; effect where the value keeps growing <a class="footnote-reference" href="#id10" id="id5"></a>.</p> <p>To mitigate this, <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> uses the <em>gradient clipping</em> trick. Whenever the gradients are updated, they are &quot;clipped&quot; to some reasonable range (like -5 to 5) so they will never get out of this range. 
This method is crude, but it works reasonably well for training RNNs.</p> <p>The flip side problem of <em>vanishing gradient</em> (wherein the gradients keep getting smaller with each step) is much harder to solve, and usually requires more advanced recurrent NN architectures.</p> </div> <div class="section" id="min-char-rnn-model-quality"> <h2>min-char-rnn model quality</h2> <p>While <tt class="docutils literal"><span class="pre">min-char-rnn</span></tt> is a complete RNN implementation that manages to learn, it's not really good enough for learning a reasonable model for the English language. The model is too simple for this, and suffers seriously from the vanishing gradient problem.</p> <p>For example, when training a 16-step unrolled model on a corpus of Sherlock Holmes books, it produces the following text after 60,000 iterations (learning on about a MiB of text):</p> <blockquote> one, my dred, roriny. qued bamp gond hilves non froange saws, to mold his a work, you shirs larcs anverver strepule thunboler muste, thum and cormed sightourd so was rewa her besee pilman</blockquote> <p>It's not complete gibberish, but not really English either. Just for fun, I wrote a simple <a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/min-char-rnn/markov-model.py">Markov chain generator</a> and trained it on the same text with a 4-character state. Here's a sample of its output:</p> <blockquote> though throughted with to taken as when it diabolice, and intered the stairhead, the stood initions of indeed, as burst, his mr. holmes' room, and now i fellows. the stable. he retails arm</blockquote> <p>Which, you'll admit, is quite a bit better than our &quot;fancy&quot; deep learning approach! And it was much faster to train too...</p> <p>To have a better chance of learning a good model, we'll need a more advanced architecture like LSTM. 
LSTMs employ a bunch of tricks to preserve long-term dependencies through the cells and can learn much better language models. For example, Andrej Karpathy's char-rnn model from the <a class="reference external" href="http://karpathy.github.io/2015/05/21/rnn-effectiveness/">Unreasonable Effectiveness of RNNs post</a> is a multi-layer LSTM, and it can learn fairly nice models for a varied set of domains, ranging from Shakespeare sonnets to C code snippets in the Linux kernel.</p> </div> <div class="section" id="conclusion"> <h2>Conclusion</h2> <p>The goal of this post wasn't to develop a very good RNN model; rather, it was to explain in detail the math behind training a simple RNN. More advanced RNN architectures like LSTM are somewhat more complicated, but all the core ideas are very similar and this post should be helpful in nailing the basics.</p> <p><em>Update:</em> <a class="reference external" href="https://eli.thegreenplace.net/2018/minimal-character-based-lstm-implementation/">An extension of this post to LSTMs</a>.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id7" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td><p class="first">Computing a softmax makes sense because <em>x</em> is encoded with one-hot over a vocabulary-sized vector, meaning there's a 1 in the position of the letter it represents with 0s in all other positions. For example, if we only care about the 26 lower-case alphabet letters, <em>x</em> could be a 26-element vector. To represent 'a' it would have 1 in position 0 and zeros elsewhere; to represent 'd' it would have 1 in position 3 and zeros elsewhere.</p> <p class="last">The output <em>p</em> here models what the RNN cell thinks the next generated character should be.
Using softmax, it would have probabilities for each character in the corresponding position, all of them properly summing up to 1.</p> </td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td><p class="first">A slightly more technical explanation: to compute the gradient for the error w.r.t. weights in the typical backpropagation flow, we'll need input gradients for <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/0ede500c5edc819b5f962923f98724936ef9d593.svg" style="height: 18px;" type="image/svg+xml">p[t]</object> and <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/897ad2ab624c79d6dcb687ad28f7a3767a76712c.svg" style="height: 18px;" type="image/svg+xml">h[t]</object>. Then, when learning happens we use the measured error and propagate it backwards. But what is the measured error for <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/897ad2ab624c79d6dcb687ad28f7a3767a76712c.svg" style="height: 18px;" type="image/svg+xml">h[t]</object>? We don't know it before we compute the error of the next iteration, and so on - a bit of a chicken-egg problem.</p> <p class="last">Unrolling/BPTT helps approximate a solution for this issue. An alternative solution is to use <em>forward-mode</em> gradient propagation instead, with an algorithm called RTRL (Real Time Recurrent Learning). This algorithm works well but has a high computational cost compared to BPTT. I'd love to explore this topic in more depth, as it ties into the difference between forward-mode and reverse-mode auto differentiation. 
But that would be a topic for another post.</p> </td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id9" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>This is similar to convolutional networks, where the convolution filter weights are reused many times when processing a much larger input. In such models the invariance is <em>spatial</em>; in sequence models the invariance is <em>temporal</em>. In fact, space vs. time in models is just a matter of convention, and it turns out that 1D convolutional models perform very well on some sequence tasks!</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id5"></a></td><td><p class="first">An easy way to think about it is to imagine some initial value <em>v</em>, multiplied by another value <em>c</em> many times. We get <object class="valign-0" data="https://eli.thegreenplace.net/images/math/f24f59611d0e1f3043785fe772138687cfd6da97.svg" style="height: 15px;" type="image/svg+xml">vc^N</object> for <em>N</em> multiplications. If <em>c</em> is larger than 1, it means the result will keep growing with each multiplication. How quickly will depend on the actual value of <em>c</em>, but this is basically an exponential runoff. 
We actually care about the absolute value of <em>c</em>, of course, since runoff is equally bad in the positive or negative direction.</p> <p class="last">Similarly with the absolute value of <em>c</em> smaller than 1, we'll get a &quot;vanishing&quot; effect since the result will keep getting smaller with each iteration.</p> </td></tr> </tbody> </table> </div> Backpropagation through a fully-connected layer2018-05-22T05:47:00-07:002018-05-22T05:47:00-07:00Eli Benderskytag:eli.thegreenplace.net,2018-05-22:/2018/backpropagation-through-a-fully-connected-layer/<p>The goal of this post is to show the math of backpropagating a derivative for a fully-connected (FC) neural network layer consisting of matrix multiplication and bias addition. I have briefly mentioned this in an <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative">earlier post dedicated to Softmax</a>, but here I want to give some more attention to …</p><p>The goal of this post is to show the math of backpropagating a derivative for a fully-connected (FC) neural network layer consisting of matrix multiplication and bias addition. I have briefly mentioned this in an <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative">earlier post dedicated to Softmax</a>, but here I want to give some more attention to FC layers specifically.</p> <p>Here is a fully-connected layer for input vectors with <em>N</em> elements, producing output vectors with <em>T</em> elements:</p> <img alt="Diagram of a fully connected layer" class="align-center" src="https://eli.thegreenplace.net/images/2018/fclayer.png" /> <p>As a formula, we can write:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/0f980ab4c97ad86b4d0a15ede6e9c05901323702.svg" style="height: 17px;" type="image/svg+xml"> $y=Wx+b$</object> <p>Presumably, this layer is part of a network that ends up computing some loss <em>L</em>. 
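</p>

<p>As a concrete reference for the shapes involved, here is a minimal NumPy sketch of the forward computation (variable names and sizes are illustrative):</p>

```python
import numpy as np

N, T = 4, 3                  # input and output sizes
x = np.random.randn(N, 1)    # input column vector, shape [N, 1]
W = np.random.randn(T, N)    # weight matrix, shape [T, N]
b = np.random.randn(T, 1)    # bias column vector, shape [T, 1]

y = W @ x + b                # output column vector, shape [T, 1]
```

<p>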
We'll assume we already have the derivative of the loss w.r.t. the output of the layer <object class="valign-m9" data="https://eli.thegreenplace.net/images/math/9a5154f5e8d64cc77db745d8d3baa723bc6df829.svg" style="height: 26px;" type="image/svg+xml">\frac{\partial{L}}{\partial{y}}</object>.</p> <p>We'll be interested in two other derivatives: <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/33d2709b664fdd69317758b433b61b13c1cdc62f.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{W}}</object> and <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/5f12a50803653cf2ee02135944343ec70506d31c.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{b}}</object>.</p> <div class="section" id="jacobians-and-the-chain-rule"> <h2>Jacobians and the chain rule</h2> <p>As a reminder from <a class="reference external" href="http://eli.thegreenplace.net/2016/the-chain-rule-of-calculus">The Chain Rule of Calculus</a>, we're dealing with functions that map from <em>n</em> dimensions to <em>m</em> dimensions: <img alt="f:\mathbb{R}^{n} \to \mathbb{R}^{m}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/13f219789047343729036279bb11630db317d98d.png" style="height: 16px;" />. We'll consider the outputs of <em>f</em> to be numbered from 1 to <em>m</em> as <img alt="f_1,f_2 \dots f_m" class="valign-m4" src="https://eli.thegreenplace.net/images/math/93b446c5209263534d09d617bbede21101d6536e.png" style="height: 16px;" />. 
For each such <img alt="f_i" class="valign-m4" src="https://eli.thegreenplace.net/images/math/68bd0dc647944d362ec8df628a22967b91d82c80.png" style="height: 16px;" /> we can compute its partial derivative by any of the <em>n</em> inputs as:</p> <img alt="$D_j f_i(a)=\frac{\partial f_i}{\partial a_j}(a)$" class="align-center" src="https://eli.thegreenplace.net/images/math/30881b5a92e45259714ba01c7a12fbf8f6c56109.png" style="height: 42px;" /> <p>Where <em>j</em> goes from 1 to <em>n</em> and <em>a</em> is a vector with <em>n</em> components. If <em>f</em> is differentiable at <em>a</em> then the derivative of <em>f</em> at <em>a</em> is the <em>Jacobian matrix</em>:</p> <img alt="$Df(a)=\begin{bmatrix} D_1 f_1(a) &amp;amp; \cdots &amp;amp; D_n f_1(a) \\ \vdots &amp;amp; &amp;amp; \vdots \\ D_1 f_m(a) &amp;amp; \cdots &amp;amp; D_n f_m(a) \\ \end{bmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/ab09367d48e9ef4d8bc2314a60313dec700193af.png" style="height: 76px;" /> <p>The multivariate chain rule states: given <img alt="g:\mathbb{R}^n \to \mathbb{R}^m" class="valign-m4" src="https://eli.thegreenplace.net/images/math/b4b7d25491897b053abf7e48688fada4a85368bd.png" style="height: 16px;" /> and <img alt="f:\mathbb{R}^m \to \mathbb{R}^p" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ac8a6cea4e02e885538fc3ef969c5733e84712f9.png" style="height: 16px;" /> and a point <img alt="a \in \mathbb{R}^n" class="valign-m1" src="https://eli.thegreenplace.net/images/math/43a85f2c59f396fe5c4e2c403a0453c463fcfb0d.png" style="height: 13px;" />, if <em>g</em> is differentiable at <em>a</em> and <em>f</em> is differentiable at <img alt="g(a)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e7373233d49e18a0882e0dce41d9d6aa26964d6b.png" style="height: 18px;" /> then the composition <img alt="f \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/1247a6ac0bc07bfdbd790831aa70b0b000bad2e4.png" style="height: 16px;" 
/> is differentiable at <em>a</em> and its derivative is:</p> <img alt="$D(f \circ g)(a)=Df(g(a)) \cdot Dg(a)$" class="align-center" src="https://eli.thegreenplace.net/images/math/00bdefa904bd34df2dfb50cc385e6497c4e5096e.png" style="height: 18px;" /> <p>Which is the matrix multiplication of <img alt="Df(g(a))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e567730c48bb2f95c258b630b4d6e997043e09ab.png" style="height: 18px;" /> and <img alt="Dg(a)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2575fc98e794a733a7aa6237fe67246a41e6c8c5.png" style="height: 18px;" />.</p> </div> <div class="section" id="back-to-the-fully-connected-layer"> <h2>Back to the fully-connected layer</h2> <p>Circling back to our fully-connected layer, we have the loss <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/abf7408ae6d9fb4683480735dc1ebc8555b8fef8.svg" style="height: 18px;" type="image/svg+xml">L(y)</object> - a scalar function <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/ddef8b9ca23fb246b2a984c719d812f37a41a406.svg" style="height: 16px;" type="image/svg+xml">L:\mathbb{R}^{T} \to \mathbb{R}</object>. We also have the function <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f09c295439296549a068b64ffe69a48dd77d1078.svg" style="height: 17px;" type="image/svg+xml">y=Wx+b</object>. If we're interested in the derivative w.r.t the weights, what are the dimensions of this function? Our &quot;variable part&quot; is then <em>W</em>, which has <em>NT</em> elements overall, and the output has <em>T</em> elements, so <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/06178f1d07375b8286afcd48f02bcd34d71537f0.svg" style="height: 19px;" type="image/svg+xml">y:\mathbb{R}^{NT} \to \mathbb{R}^{T}</object> <a class="footnote-reference" href="#id3" id="id1"></a>.</p> <p>The chain rule tells us how to compute the derivative of <em>L</em> w.r.t. 
<em>W</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/ee6bc25a34980031f93f0c7eefccc40663b05c76.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial{L}}{\partial{W}}=D(L \circ y)(W)=DL(y(W)) \cdot Dy(W)$</object> <p>Since we're backpropagating, we already know <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/dcb2eda345045dac22c425a1ee19113e047126cf.svg" style="height: 18px;" type="image/svg+xml">DL(y(W))</object>; because of the dimensionality of the <em>L</em> function, the dimensions of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/dcb2eda345045dac22c425a1ee19113e047126cf.svg" style="height: 18px;" type="image/svg+xml">DL(y(W))</object> are [1,T] (one row, <em>T</em> columns). <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d0064a180ddb231bb6868ce25c68ef3ec1c2a464.svg" style="height: 18px;" type="image/svg+xml">y(W)</object> has <em>NT</em> inputs and <em>T</em> outputs, so the dimensions of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b22fe7345e02ae50c68605696f3a447435cd1f9d.svg" style="height: 18px;" type="image/svg+xml">Dy(W)</object> are [T,NT]. Overall, the dimensions of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2f6b6eded4ba20b3eeb59b2b687f84de1e91c04c.svg" style="height: 18px;" type="image/svg+xml">D(L \circ y)(W)</object> are then [1,NT]. This makes sense if you think about it, because as a function of <em>W</em>, the loss has <em>NT</em> inputs and a single scalar output.</p> <p>What remains is to compute <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b22fe7345e02ae50c68605696f3a447435cd1f9d.svg" style="height: 18px;" type="image/svg+xml">Dy(W)</object>, the Jacobian of <em>y</em> w.r.t. <em>W</em>. 
As mentioned above, it has <em>T</em> rows - one for each output element of <em>y</em>, and <em>NT</em> columns - one for each element in the weight matrix <em>W</em>. Computing such a large Jacobian may seem daunting, but we'll soon see that it's very easy to generalize from a simple example. Let's start with <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/6a53741f2a8810da3cae4efadde63c8e7ee2662f.svg" style="height: 12px;" type="image/svg+xml">y_1</object>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7190e002ac69968b674aecacfd5a8531ad9cd208.svg" style="height: 55px;" type="image/svg+xml"> $y_1=\sum_{j=1}^{N}W_{1,j}x_{j}+b_1$</object> <p>What's the derivative of this result element w.r.t. each element in <em>W</em>? When the element is in row 1, the derivative is <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/73058e43db0f4edc791b10f27f913cbc5d361ab6.svg" style="height: 14px;" type="image/svg+xml">x_j</object> (<em>j</em> being the column of <em>W</em>); when the element is in any other row, the derivative is 0.</p> <p>Similarly for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b9f59182e34baa532fa4e27471acc714f3105d16.svg" style="height: 12px;" type="image/svg+xml">y_2</object>, we'll have non-zero derivatives only for the second row of <em>W</em> (with the same result of <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/73058e43db0f4edc791b10f27f913cbc5d361ab6.svg" style="height: 14px;" type="image/svg+xml">x_j</object> being the derivative for the <em>j</em>-th column), and so on.</p> <p>Generalizing from the example, if we split the index of <em>W</em> to <em>i</em> and <em>j</em>, we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/e28e0d3b44645eb299cceae8dde2319244e86373.svg" style="height: 50px;" type="image/svg+xml"> \begin{align} 
D_{ij}y_t&amp;=\frac{\partial(\sum_{j=1}^{N}W_{t,j}x_{j}+b_t)}{\partial W_{ij}} = \left\{\begin{matrix} x_j &amp; i = t\\ 0 &amp; i \ne t \end{matrix}\right. \end{align}</object> <p>This goes into row <em>t</em>, column <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ef7b2d987af3c0ceb75381d096c35e8c19085642.svg" style="height: 18px;" type="image/svg+xml">(i-1)N+j</object> in the Jacobian matrix. Overall, we get the following Jacobian matrix with shape [T,NT]:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/8a59a6251d12196f12eaadb6537289e3a6368d53.svg" style="height: 76px;" type="image/svg+xml"> $Dy=\begin{bmatrix} x_1 &amp; x_2 &amp; \cdots &amp; x_N &amp; \cdots &amp; 0 &amp; 0 &amp; \cdots &amp; 0 \\ \vdots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \vdots \\ 0 &amp; 0 &amp; \cdots &amp; 0 &amp; \cdots &amp; x_1 &amp; x_2 &amp; \cdots &amp; x_N \end{bmatrix}$</object> <p>Now we're ready to finally multiply the Jacobians together to complete the chain rule:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2e9823350972d4874d201a0f8232d89fea710c6f.svg" style="height: 18px;" type="image/svg+xml"> $D(L \circ y)(W)=DL(y(W)) \cdot Dy(W)$</object> <p>The leftmost factor on the right-hand side is this row vector:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/af6d7af820f16b7493d378e8a40daa87031591f4.svg" style="height: 41px;" type="image/svg+xml"> $DL(y)=(\frac{\partial L}{\partial y_1}, \frac{\partial L}{\partial y_2},\cdots,\frac{\partial L}{\partial y_T})$</object> <p>And we're multiplying it by the matrix <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/baf7d8b700759b28ece347bd62793400ef52a8e0.svg" style="height: 16px;" type="image/svg+xml">Dy</object> shown above.
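</p>

<p>The structure of this Jacobian is easy to verify numerically. The following sketch (toy sizes, illustrative names) builds it per the formula above - note that with 0-based indices the column for <em>W[i,j]</em> is <em>i*N+j</em> - and checks each column against a finite-difference derivative:</p>

```python
import numpy as np

np.random.seed(0)
N, T = 3, 2
W = np.random.randn(T, N)
x = np.random.randn(N, 1)

# Jacobian of y = Wx + b w.r.t. W, with W unrolled row-major:
# row t, column (t*N + j) holds x_j; all other entries are 0.
Dy = np.zeros((T, N * T))
for t in range(T):
    for j in range(N):
        Dy[t, t * N + j] = x[j, 0]

# Check each column against a finite-difference derivative.
# The bias cancels in the difference, so it's omitted here.
eps = 1e-6
for i in range(T):
    for j in range(N):
        Wp = W.copy()
        Wp[i, j] += eps
        dy = ((Wp @ x) - (W @ x)) / eps
        assert np.allclose(Dy[:, i * N + j], dy[:, 0], atol=1e-4)
```

<p>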
Each item in the result vector will be a dot product between <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/992673c682a388ea8231ebbd8ea28c9cecae874d.svg" style="height: 18px;" type="image/svg+xml">DL(y)</object> and the corresponding column in the matrix <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/baf7d8b700759b28ece347bd62793400ef52a8e0.svg" style="height: 16px;" type="image/svg+xml">Dy</object>. Since <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/baf7d8b700759b28ece347bd62793400ef52a8e0.svg" style="height: 16px;" type="image/svg+xml">Dy</object> has a single non-zero element in each column, the result is fairly trivial. The first <em>N</em> entries are:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/d0e87e4d5cd9feb9d93d153733a182426d175d7e.svg" style="height: 41px;" type="image/svg+xml"> $\frac{\partial L}{\partial y_1}x_1, \frac{\partial L}{\partial y_1}x_2, \cdots, \frac{\partial L}{\partial y_1}x_N$</object> <p>The next <em>N</em> entries are:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/5bd4d8b6b4eb817b071dbf4ddda71680f4bf0392.svg" style="height: 41px;" type="image/svg+xml"> $\frac{\partial L}{\partial y_2}x_1, \frac{\partial L}{\partial y_2}x_2, \cdots, \frac{\partial L}{\partial y_2}x_N$</object> <p>And so on, until the last (<em>T</em>-th) set of <em>N</em> entries is all <em>x</em>-es multiplied by <object class="valign-m9" data="https://eli.thegreenplace.net/images/math/b44681f2ca721dae2b24a49d88f01463e3a88e50.svg" style="height: 26px;" type="image/svg+xml">\frac{\partial L}{\partial y_T}</object>.</p> <p>To better see how to apply each derivative to a corresponding element in <em>W</em>, we can &quot;re-roll&quot; this result back into a matrix of shape [T,N]:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7cfccbaaa844f8ae994f8e012f12557919927e31.svg" style="height: 
129px;" type="image/svg+xml"> $\frac{\partial{L}}{\partial{W}}=D(L\circ y)(W)=\begin{bmatrix} \frac{\partial L}{\partial y_1}x_1 &amp; \frac{\partial L}{\partial y_1}x_2 &amp; \cdots &amp; \frac{\partial L}{\partial y_1}x_N \\ \\ \frac{\partial L}{\partial y_2}x_1 &amp; \frac{\partial L}{\partial y_2}x_2 &amp; \cdots &amp; \frac{\partial L}{\partial y_2}x_N \\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots \\ \frac{\partial L}{\partial y_T}x_1 &amp; \frac{\partial L}{\partial y_T}x_2 &amp; \cdots &amp; \frac{\partial L}{\partial y_T}x_N \end{bmatrix}$</object> </div> <div class="section" id="computational-cost-and-shortcut"> <h2>Computational cost and shortcut</h2> <p>While the derivation shown above is complete and mathematically correct, it can also be computationally intensive; in realistic scenarios, the full Jacobian matrix can be <em>really</em> large. For example, let's say our input is a (modestly sized) 128x128 image, so <em>N=16,384</em>. Let's also say that <em>T=100</em>. The weight matrix then has <em>NT=1,638,400</em> elements; respectably big, but nothing out of the ordinary.</p> <p>Now consider the size of the full Jacobian matrix: it's <em>T</em> by <em>NT</em>, or over 160 million elements. At 4 bytes per element that's more than half a GiB!</p> <p>Moreover, to compute every backpropagation we'd be forced to multiply this full Jacobian matrix by a 100-dimensional vector, performing 160 million multiply-and-add operations for the dot products. That's a lot of compute.</p> <p>But the final result <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/89da23923a43fcd95b185bebb6fd362b6d1ac695.svg" style="height: 18px;" type="image/svg+xml">D(L\circ y)(W)</object> is the size of <em>W</em> - 1.6 million elements. Do we really need 160 million computations to get to it? No, because the Jacobian is very <em>sparse</em> - most of it is zeros. 
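</p>

<p>The sparsity, and the resulting cheap product, can be checked directly with a toy-sized sketch (in real code we'd never materialize this Jacobian; the names here are illustrative):</p>

```python
import numpy as np

np.random.seed(1)
N, T = 3, 2
x = np.random.randn(N, 1)
dLdy = np.random.randn(1, T)      # known gradient of the loss w.r.t. y

# Materialize the sparse [T, N*T] Jacobian of y w.r.t. W.
Dy = np.zeros((T, N * T))
for t in range(T):
    Dy[t, t * N : (t + 1) * N] = x[:, 0]

full = dLdy @ Dy                  # [1, N*T] chain-rule product
# One multiplication per element: entry (t*N + j) is dLdy[t] * x[j].
shortcut = (dLdy.T * x.T).reshape(1, N * T)
assert np.allclose(full, shortcut)
```

<p>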
And in fact, when we look at the <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/89da23923a43fcd95b185bebb6fd362b6d1ac695.svg" style="height: 18px;" type="image/svg+xml">D(L\circ y)(W)</object> found above - it's fairly straightforward to compute using a single multiplication per element.</p> <p>Moreover, if we stare at the <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/33d2709b664fdd69317758b433b61b13c1cdc62f.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{W}}</object> matrix a bit, we'll notice it has a familiar pattern: this is just the <a class="reference external" href="https://en.wikipedia.org/wiki/Outer_product">outer product</a> between the vectors <object class="valign-m9" data="https://eli.thegreenplace.net/images/math/9a5154f5e8d64cc77db745d8d3baa723bc6df829.svg" style="height: 26px;" type="image/svg+xml">\frac{\partial{L}}{\partial{y}}</object> and <em>x</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/89bdcdf27feb489a5e3cb1bb8adc7faffcf0207d.svg" style="height: 41px;" type="image/svg+xml"> $\frac{\partial L}{\partial W}=\frac{\partial L}{\partial y}\otimes x$</object> <p>If we have to compute this backpropagation in Python/Numpy, we'll likely write code similar to:</p> <div class="highlight"><pre><span></span><span class="c1"># Assuming dy (gradient of loss w.r.t. 
y) and x are column vectors, by</span> <span class="c1"># performing a dot product between dy (column) and x.T (row) we get the</span> <span class="c1"># outer product.</span> <span class="n">dW</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">dy</span><span class="p">,</span> <span class="n">x</span><span class="o">.</span><span class="n">T</span><span class="p">)</span> </pre></div> </div> <div class="section" id="bias-gradient"> <h2>Bias gradient</h2> <p>We've just seen how to compute weight gradients for a fully-connected layer. Computing the gradients for the bias vector is very similar, and a bit simpler.</p> <p>This is the chain rule equation applied to the bias vector:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/3aee48692aeafe9ffab07037ad374f4c803787a7.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial{L}}{\partial{b}}=D(L \circ y)(b)=DL(y(b)) \cdot Dy(b)$</object> <p>The shapes involved here are: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ef9baa4141fed9b40c4f1b0ebf189e4d8d28badc.svg" style="height: 18px;" type="image/svg+xml">DL(y(b))</object> is still [1,T], because the number of elements in <em>y</em> remains <em>T</em>. <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/966d0c5b07f027b02b0ca9eb418ed9ac12f63386.svg" style="height: 18px;" type="image/svg+xml">Dy(b)</object> has <em>T</em> inputs (bias elements) and <em>T</em> outputs (<em>y</em> elements), so its shape is [T,T]. 
Therefore, the shape of the gradient <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7640378fb78362268ffe48bf5d68a266211673e4.svg" style="height: 18px;" type="image/svg+xml">D(L \circ y)(b)</object> is [1,T].</p> <p>To see how we'd fill the Jacobian matrix <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/966d0c5b07f027b02b0ca9eb418ed9ac12f63386.svg" style="height: 18px;" type="image/svg+xml">Dy(b)</object>, let's go back to the formula for <em>y</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7190e002ac69968b674aecacfd5a8531ad9cd208.svg" style="height: 55px;" type="image/svg+xml"> $y_1=\sum_{j=1}^{N}W_{1,j}x_{j}+b_1$</object> <p>When derived by anything other than <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c7cd24d955e66b8fe5ce45ded69fd98da5c68ba8.svg" style="height: 17px;" type="image/svg+xml">b_1</object>, this would be 0; when derived by <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c7cd24d955e66b8fe5ce45ded69fd98da5c68ba8.svg" style="height: 17px;" type="image/svg+xml">b_1</object> the result is 1. The same applies to every other element of <em>y</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/0511363d44324e04aee704c9cc3094a4e8c8c108.svg" style="height: 44px;" type="image/svg+xml"> $\frac{\partial y_i}{\partial b_j}=\left\{\begin{matrix} 1 &amp; i=j \\ 0 &amp; i\neq j \end{matrix}\right.$</object> <p>In matrix form, this is just an identity matrix with dimensions [T,T].
Therefore:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/a64bd506727621a3e78444f7e158769dae30f93b.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial{L}}{\partial{b}}=D(L \circ y)(b)=DL(y(b)) \cdot I =DL(y(b))$</object> <p>For a given element of <em>b</em>, its gradient is just the corresponding element in <object class="valign-m9" data="https://eli.thegreenplace.net/images/math/f004c6bbe71887354e0aad67dd7cbe6650eb58e9.svg" style="height: 26px;" type="image/svg+xml">\frac{\partial L}{\partial y}</object>.</p> </div> <div class="section" id="fully-connected-layer-for-a-batch-of-inputs"> <h2>Fully-connected layer for a batch of inputs</h2> <p>The derivation shown above applies to a FC layer with a single input vector <em>x</em> and a single output vector <em>y</em>. When we train models, we almost always try to do so in <em>batches</em> (or <em>mini-batches</em>) to better leverage the parallelism of modern hardware. So a more typical layer computation would be:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/c26a36b850dd7b2e4288e475a590f343ec3a18a3.svg" style="height: 15px;" type="image/svg+xml"> $Y=WX+b$</object> <p>Where the shape of <em>X</em> is [N,B]; <em>B</em> is the batch size, typically a not-too-large power of 2, like 32. <em>W</em> and <em>b</em> still have the same shapes, so the shape of <em>Y</em> is [T,B]. 
Each column in <em>X</em> is a new input vector (for a total of <em>B</em> vectors in a batch); a corresponding column in <em>Y</em> is the output.</p> <p>As before, given <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/d5de3c7d9e0e1bcb4f6c00ea06b4ad808d2ea998.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{Y}}</object>, our goal is to find <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/33d2709b664fdd69317758b433b61b13c1cdc62f.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{W}}</object> and <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/5f12a50803653cf2ee02135944343ec70506d31c.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{b}}</object>. While the end results are fairly simple and pretty much what you'd expect, I still want to go through the full Jacobian computation to show how to find the gradients in a rigorous way.</p> <p>Starting with the weights, the chain rule is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/0ffed6ae9645ea6bd0d02932e2f0ca20fb8e7bc6.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial{L}}{\partial{W}}=D(L \circ Y)(W)=DL(Y(W)) \cdot DY(W)$</object> <p>The dimensions are:</p> <ul class="simple"> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/86485acc2c4461f7817626204bf6c9148dad9d87.svg" style="height: 18px;" type="image/svg+xml">DL(Y(W))</object>: [1,TB] because <em>Y</em> has <em>T</em> outputs for each input vector in the batch.</li> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/573b889d69d85759886840570c6970345209b332.svg" style="height: 18px;" type="image/svg+xml">DY(W)</object>: [TB,TN] since <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/fe5551953a6c071c738578f2ebc316864078cc81.svg" style="height: 18px;" type="image/svg+xml">Y(W)</object> has
<em>TB</em> outputs and <em>TN</em> inputs overall.</li> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b214435777e236879c609900ba7a118e9f0da022.svg" style="height: 18px;" type="image/svg+xml">D(L\circ Y)(W)</object>: [1,TN] same as in the batch-1 case, because the same weight matrix is used for all inputs in the batch.</li> </ul> <p>Also, we'll use the notation <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/5f40e2ad50a0eb5c2f5019c48563f9c6605f84b6.svg" style="height: 24px;" type="image/svg+xml">x_{i}^{[b]}</object> to talk about the <em>i</em>-th element in the <em>b</em>-th input vector <em>x</em> (out of a total of <em>B</em> such input vectors).</p> <p>With this in hand, let's see how the Jacobians look; starting with <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/86485acc2c4461f7817626204bf6c9148dad9d87.svg" style="height: 18px;" type="image/svg+xml">DL(Y(W))</object>, it's the same as before except that we have to take the batch into account. Each batch element is independent of the others in loss computations, so we'll have:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/943d68b7dbda5009cbfd597b4e0fcc46748204a5.svg" style="height: 47px;" type="image/svg+xml"> $\frac{\partial L}{\partial y_{i}^{[b]}}$</object> <p>As the Jacobian element; how do we arrange them in a 1-dimensional vector with shape [1,TB]? We'll just have to agree on a linearization here - same as we did with <em>W</em> before. We'll go for row-major again, so in 1-D the array <em>Y</em> would be:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/13122f18953df14ebb5ff74f59441194d3adb445.svg" style="height: 26px;" type="image/svg+xml"> $Y=(y_{1}^{[1]},y_{1}^{[2]},\cdots,y_{1}^{[B]}, y_{2}^{[1]},y_{2}^{[2]},\cdots,y_{2}^{[B]},\cdots)$</object> <p>And so on for <em>T</em> elements.
Therefore, the Jacobian of <em>L</em> w.r.t <em>Y</em> is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4e79e866dbfbe0d6b6de5f6617762bce00d5f61f.svg" style="height: 48px;" type="image/svg+xml"> $\frac{\partial L}{\partial Y}=( \frac{\partial L}{\partial y_{1}^{[1]}}, \frac{\partial L}{\partial y_{1}^{[2]}},\cdots, \frac{\partial L}{\partial y_{1}^{[B]}}, \frac{\partial L}{\partial y_{2}^{[1]}}, \frac{\partial L}{\partial y_{2}^{[2]}},\cdots, \frac{\partial L}{\partial y_{2}^{[B]}},\cdots)$</object> <p>To find <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/573b889d69d85759886840570c6970345209b332.svg" style="height: 18px;" type="image/svg+xml">DY(W)</object>, let's first see how to compute <em>Y</em>. The <em>i</em>-th element of <em>Y</em> for batch <em>b</em> is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2b9c88f44b9cbec2343ce49418ca3e17dd2e0946.svg" style="height: 55px;" type="image/svg+xml"> $y_{i}^{[b]}=\sum_{j=1}^{N}W_{i,j}x_{j}^{[b]}+b_i$</object> <p>Recall that the Jacobian <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/573b889d69d85759886840570c6970345209b332.svg" style="height: 18px;" type="image/svg+xml">DY(W)</object> now has shape [TB,TN]. Previously we had to unroll the [T,N] of the weight matrix into the rows. Now we'll also have to unroll the [T,B] of the output into the columns. As before, first all <em>b</em>-s for <em>t=1</em>, then all <em>b</em>-s for <em>t=2</em>, etc.
If we carefully compute the derivative, we'll see that the Jacobian matrix has similar structure to the single-batch case, just with each line repeated <em>B</em> times for each of the batch elements:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4fe86285c11962226758ecfad2839b2ce6520d2d.svg" style="height: 291px;" type="image/svg+xml"> $DY(W)=\begin{bmatrix} x_{1}^{[1]} &amp; x_{2}^{[1]} &amp; \cdots &amp; x_{N}^{[1]} &amp; \cdots &amp; 0 &amp; 0 &amp; \cdots &amp; 0 \\ x_{1}^{[2]} &amp; x_{2}^{[2]} &amp; \cdots &amp; x_{N}^{[2]} &amp; \cdots &amp; 0 &amp; 0 &amp; \cdots &amp; 0 \\ \vdots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \vdots \\ x_{1}^{[B]} &amp; x_{2}^{[B]} &amp; \cdots &amp; x_{N}^{[B]} &amp; \cdots &amp; 0 &amp; 0 &amp; \cdots &amp; 0 \\ \vdots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \vdots \\ 0 &amp; 0 &amp; \cdots &amp; 0 &amp; \cdots &amp; x_{1}^{[1]} &amp; x_{2}^{[1]} &amp; \cdots &amp; x_{N}^{[1]} \\ 0 &amp; 0 &amp; \cdots &amp; 0 &amp; \cdots &amp; x_{1}^{[2]} &amp; x_{2}^{[2]} &amp; \cdots &amp; x_{N}^{[2]} \\ \vdots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \vdots \\ 0 &amp; 0 &amp; \cdots &amp; 0 &amp; \cdots &amp; x_{1}^{[B]} &amp; x_{2}^{[B]} &amp; \cdots &amp; x_{N}^{[B]} \\ \end{bmatrix}$</object> <p>Multiplying the two Jacobians together we get the full gradient of <em>L</em> w.r.t. each element in the weight matrix.
Where previously (in the non-batch case) we had:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/1c762a65e0003e82f7dc5108f23126989b64112b.svg" style="height: 42px;" type="image/svg+xml"> $\frac{\partial L}{\partial W_{ij}}=\frac{\partial L}{\partial y_i}x_j$</object> <p>Now, instead, we'll have:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/14fb5d857d1e43a9f692ad436c448a43c0ea041f.svg" style="height: 54px;" type="image/svg+xml"> $\frac{\partial L}{\partial W_{ij}}=\sum_{b=1}^{B}\frac{\partial L}{\partial y_{i}^{[b]}}x_{j}^{[b]}$</object> <p>Which makes total sense: it simply takes the loss gradient computed for each batch element separately and adds them up. This aligns with our intuition of how the gradient for a whole batch is computed - compute the gradient for each batch element separately and add up all the gradients <a class="footnote-reference" href="#id4" id="id2"></a>.</p> <p>As before, there's a clever way to express the final gradient using matrix operations. Note the sum across all batch elements when computing <object class="valign-m10" data="https://eli.thegreenplace.net/images/math/2d41c4c820515c93e916d32532b9bdc7012e8121.svg" style="height: 27px;" type="image/svg+xml">\frac{\partial L}{\partial W_{ij}}</object>. We can express this as the matrix multiplication:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7f6c176f28de451b2d67fcf7ebf238122de9a970.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial L}{\partial W}=\frac{\partial L}{\partial Y}\cdot X^T$</object> <p>This is a good place to recall the computation cost again. Previously we've seen that for a single-input case, the Jacobian can be extremely large ([T,NT] having about 160 million elements). In the batch case, the Jacobian would be even larger since its shape is [TB,NT]; with a reasonable batch of 32, it's something like 5-billion elements strong.
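</p> <p>As a quick numerical check (my own sketch, not part of the original derivation - the array names are arbitrary), the matrix shortcut can be verified against the per-element batch sum:</p>

```python
import numpy as np

# Small shapes following the post's notation: N inputs, T outputs, batch of B.
N, T, B = 4, 3, 5
rng = np.random.default_rng(0)
dY = rng.standard_normal((T, B))   # upstream gradient dL/dY, one column per sample
X = rng.standard_normal((N, B))    # input batch, one column per sample

# Shortcut: dL/dW = dL/dY . X^T, with shape [T, N].
dW = np.dot(dY, X.T)

# Element-by-element computation using the summed-over-batch formula.
dW_loop = np.zeros((T, N))
for i in range(T):
    for j in range(N):
        for b in range(B):
            dW_loop[i, j] += dY[i, b] * X[j, b]

print(np.allclose(dW, dW_loop))  # True
```

<p>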
It's good that we don't actually have to hold the full Jacobian in memory, and that we have a shortcut way of computing the gradient.</p> </div> <div class="section" id="bias-gradient-for-a-batch"> <h2>Bias gradient for a batch</h2> <p>For the bias, we have:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/1228909398e9618223e551b8b1d394ac20d697f1.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial{L}}{\partial{b}}=D(L \circ Y)(b)=DL(Y(b)) \cdot DY(b)$</object> <p><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/40699cf4e67bde5205359e04102f7b0011dac800.svg" style="height: 18px;" type="image/svg+xml">DL(Y(b))</object> here has the shape [1,TB]; <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f260ada7edc55af13f145e5786803198a3452f1e.svg" style="height: 18px;" type="image/svg+xml">DY(b)</object> has the shape [TB,T]. Therefore, the shape of <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/5f12a50803653cf2ee02135944343ec70506d31c.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{b}}</object> is [1,T], as before.</p> <p>From the formula for computing <em>Y</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7190e002ac69968b674aecacfd5a8531ad9cd208.svg" style="height: 55px;" type="image/svg+xml"> $y_1=\sum_{j=1}^{N}W_{1,j}x_{j}+b_1$</object> <p>We get, for any batch <em>b</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7ad81febc17ec33d21c8fba2a2e6956a8b43e1ad.svg" style="height: 49px;" type="image/svg+xml"> $\frac{\partial y_{i}^{[b]}}{\partial b_j}=\left\{\begin{matrix} 1 &amp; i=j \\ 0 &amp; i\neq j \end{matrix}\right.$</object> <p>So, whereas <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f260ada7edc55af13f145e5786803198a3452f1e.svg" style="height: 18px;" type="image/svg+xml">DY(b)</object> was an identity matrix in the no-batch case,
here it looks like this:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4b8148fb1343ab283fa0f8b0cdb6f3723201df15.svg" style="height: 267px;" type="image/svg+xml"> $DY(b)=\begin{bmatrix} 1 &amp; 0 &amp; 0 &amp; \cdots &amp; 0 \\ 1 &amp; 0 &amp; 0 &amp; \cdots &amp; 0 \\ \vdots &amp; \vdots &amp; \vdots &amp; \ddots &amp; \vdots \\ 1 &amp; 0 &amp; 0 &amp; \cdots &amp; 0 \\ 0 &amp; 1 &amp; 0 &amp; \cdots &amp; 0 \\ 0 &amp; 1 &amp; 0 &amp; \cdots &amp; 0 \\ \vdots &amp; \vdots &amp; \vdots &amp; \ddots &amp; \vdots \\ 0 &amp; 0 &amp; 0 &amp; \cdots &amp; 1 \\ 0 &amp; 0 &amp; 0 &amp; \cdots &amp; 1 \\ \vdots &amp; \vdots &amp; \vdots &amp; \ddots &amp; \vdots \\ 0 &amp; 0 &amp; 0 &amp; \cdots &amp; 1 \\ \end{bmatrix}$</object> <p>With <em>B</em> identical rows at a time, for a total of <em>TB</em> rows. Since <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/c7d9499ae5d7e1fc81bc540909deac668210911d.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial L}{\partial Y}</object> is the same as before, their matrix multiplication result has this in column <em>j</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/910c6d77b9356303c0a01896a09d62fe2963d8ac.svg" style="height: 57px;" type="image/svg+xml"> $\frac{\partial{L}}{\partial{b_j}}=\sum_{b=1}^{B}\frac{\partial L}{\partial y_{j}^{[b]}}$</object> <p>Which just means adding up the gradient effects from every batch element independently.</p> </div> <div class="section" id="addendum-gradient-w-r-t-x"> <h2>Addendum - gradient w.r.t. x</h2> <p>This post started by explaining that the parameters of a fully-connected layer we're usually looking to optimize are the weight matrix and bias. 
In most cases this is true; however, in some other cases we're actually interested in propagating a gradient through <em>x</em> - often when there are more layers before the fully-connected layer in question.</p> <p>Let's find the derivative <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/54869ab2743febebc22269d12572c77e057c816e.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{x}}</object>. The chain rule here is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/a52a21efbf7f0b992db127d495abddd618677709.svg" style="height: 38px;" type="image/svg+xml"> $\frac{\partial{L}}{\partial{x}}=D(L \circ y)(x)=DL(y(x)) \cdot Dy(x)$</object> <p>Dimensions: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2e9597ce1ffd09d94be733216c3f1c1b2ab5f33c.svg" style="height: 18px;" type="image/svg+xml">DL(y(x))</object> is [1, T] as before; <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a2bef37f23427154c47e53945043549039e36bcf.svg" style="height: 18px;" type="image/svg+xml">Dy(x)</object> has T outputs (elements of <em>y</em>) and N inputs (elements of <em>x</em>), so its dimensions are [T, N]. Therefore, the dimensions of <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/54869ab2743febebc22269d12572c77e057c816e.svg" style="height: 24px;" type="image/svg+xml">\frac{\partial{L}}{\partial{x}}</object> are [1, N].</p> <p>From:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7190e002ac69968b674aecacfd5a8531ad9cd208.svg" style="height: 55px;" type="image/svg+xml"> $y_1=\sum_{j=1}^{N}W_{1,j}x_{j}+b_1$</object> <p>We know that <object class="valign-m10" data="https://eli.thegreenplace.net/images/math/c209b0f19299fee08359f73898212bb0d0df8c30.svg" style="height: 28px;" type="image/svg+xml">\frac{\partial y_1}{\partial x_j}=W_{1,j}</object>. 
Generalizing this, we get <object class="valign-m10" data="https://eli.thegreenplace.net/images/math/6abadcb3365f09141a9cac088fdbb17418e75171.svg" style="height: 28px;" type="image/svg+xml">\frac{\partial y_i}{\partial x_j}=W_{i,j}</object>; in other words, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a2bef37f23427154c47e53945043549039e36bcf.svg" style="height: 18px;" type="image/svg+xml">Dy(x)</object> is just the weight matrix <em>W</em>. So <object class="valign-m8" data="https://eli.thegreenplace.net/images/math/e6c2d66d989f1abdb9e8b492e45f00be1ab2a21b.svg" style="height: 25px;" type="image/svg+xml">\frac{\partial{L}}{\partial{x_i}}</object> is the dot product of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2e9597ce1ffd09d94be733216c3f1c1b2ab5f33c.svg" style="height: 18px;" type="image/svg+xml">DL(y(x))</object> with the <em>i</em>-th column of <em>W</em>.</p> <p>Computationally, we can express this as follows:</p> <div class="highlight"><pre><span></span><span class="n">dx</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">dy</span><span class="o">.</span><span class="n">T</span><span class="p">,</span> <span class="n">W</span><span class="p">)</span><span class="o">.</span><span class="n">T</span> </pre></div> <p>Again, recall that our vectors are <em>column</em> vectors. Therefore, to multiply <em>dy</em> from the left by <em>W</em> we have to transpose it to a row vector first. 
The result of this matrix multiplication is a [1, N] row-vector, so we transpose it again to get a column.</p> <p>An alternative method to compute this would transpose <em>W</em> rather than <em>dy</em> and then swap the order:</p> <div class="highlight"><pre><span></span><span class="n">dx</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">W</span><span class="o">.</span><span class="n">T</span><span class="p">,</span> <span class="n">dy</span><span class="p">)</span> </pre></div> <p>These two methods produce exactly the same <em>dx</em>; it's important to be familiar with these tricks, because otherwise it may be confusing to see a transposed <em>W</em> when we expect the actual <em>W</em> from gradient computations.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id3" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td><p class="first">As explained in the <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative">softmax post</a>, we <em>linearize</em> the 2D matrix <em>W</em> into a single vector with <em>NT</em> elements using some approach like row-major, where the <em>N</em> elements of the first row go first, then the <em>N</em> elements of the second row, and so on until we have <em>NT</em> elements for all the rows.</p> <p class="last">This is a fully general approach as we can linearize any-dimensional arrays. To work with Jacobians, we're interested in <em>K</em> inputs, no matter where they came from - they could be a linearization of a 4D array. 
As long as we remember which element out of the <em>K</em> corresponds to which original element, we'll be fine.</p> </td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>In some cases you may hear about <em>averaging</em> the gradients across the batch. Averaging just means dividing the sum by <em>B</em>; it's a constant factor that can be consolidated into the learning rate.</td></tr> </tbody> </table> </div> Depthwise separable convolutions for machine learning2018-04-04T06:21:00-07:002018-04-04T06:21:00-07:00Eli Benderskytag:eli.thegreenplace.net,2018-04-04:/2018/depthwise-separable-convolutions-for-machine-learning/<p>Convolutions are an important tool in modern deep neural networks (DNNs). This post is going to discuss some common types of convolutions, specifically regular and depthwise separable convolutions. My focus will be on the implementation of these operations, showing from-scratch Numpy-based code to compute them and diagrams that explain how …</p><p>Convolutions are an important tool in modern deep neural networks (DNNs). This post is going to discuss some common types of convolutions, specifically regular and depthwise separable convolutions. My focus will be on the implementation of these operations, showing from-scratch Numpy-based code to compute them and diagrams that explain how things work.</p> <p>Note that my main goal here is to explain how depthwise separable convolutions differ from regular ones; if you're completely new to convolutions I suggest reading some more introductory resources first.</p> <p>The code here is compatible with TensorFlow's definition of convolutions in the <a class="reference external" href="https://www.tensorflow.org/api_docs/python/tf/nn">tf.nn</a> module.
After reading this post, the documentation of TensorFlow's convolution ops should be easy to decipher.</p> <div class="section" id="basic-2d-convolution"> <h2>Basic 2D convolution</h2> <p>The basic idea behind a 2D convolution is sliding a small window (usually called a &quot;filter&quot;) over a larger 2D array, and performing a dot product between the filter elements and the corresponding input array elements at every position.</p> <p>Here's a diagram demonstrating the application of a 3x3 convolution filter to a 6x6 array, in 3 different positions. <tt class="docutils literal">W</tt> is the filter, and the yellow-ish array on the right is the result; the red square shows which element in the result array is being computed.</p> <object class="align-center" data="https://eli.thegreenplace.net/images/2018/conv2d-single-block.svg" style="width: 400px;" type="image/svg+xml"> Single-channel 2D convolution</object> <p>The topmost diagram shows the important concept of <em>padding</em>: what should we do when the window goes &quot;out of bounds&quot; on the input array. There are several options, with the following two being most common in DNNs:</p> <ul class="simple"> <li><em>Valid</em> padding: in which only valid, in-bounds windows are considered. This also makes the output smaller than the input, because border elements can't be in the center of a filter (unless the filter is 1x1).</li> <li><em>Same</em> padding: in which we assume there's some constant value outside the bounds of the input (usually 0) and the filter is applied to every element. In this case the output array has the same size as the input array. The diagrams above depict same padding, which I'll keep using throughout the post.</li> </ul> <p>There are other options for the basic 2D convolution case. For example, the filter can be moving over the input in jumps of more than 1, thus not centering on all elements. This is called <em>stride</em>, and in this post I'm always using stride of 1. 
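</p> <p>As an aside (this helper is my own sketch, not from the original post), the effect of the padding mode and stride on the output's spatial size follows the standard formula <tt class="docutils literal">(n + 2p - f) // s + 1</tt>:</p>

```python
def conv2d_output_size(n, f, padding='same', stride=1):
    """Spatial output size along one dimension.

    n is the input size and f the (odd) filter size along that dimension.
    """
    assert padding in ('same', 'valid')
    pad = f // 2 if padding == 'same' else 0  # zeros added on each side
    return (n + 2 * pad - f) // stride + 1

# The 6x6 input with a 3x3 filter from the diagrams above:
print(conv2d_output_size(6, 3, 'same'))    # 6 - output has the input's size
print(conv2d_output_size(6, 3, 'valid'))   # 4 - border positions are dropped
```

<p>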
Convolutions can also be dilated (or <em>atrous</em>), wherein the filter is expanded with gaps between every element. In this post I'm not going to discuss dilated convolutions and other options - there are plenty of resources on these topics online.</p> </div> <div class="section" id="implementing-the-2d-convolution"> <h2>Implementing the 2D convolution</h2> <p>Here is a full Python implementation of the simple 2D convolution. It's called &quot;single channel&quot; to distinguish it from the more general case in which the input has more than two dimensions; we'll get to that shortly.</p> <p>This implementation is fully self-contained, and only needs Numpy to work. All the loops are fully explicit - I specifically avoided vectorizing them for efficiency to maintain clarity:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">conv2d_single_channel</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">w</span><span class="p">):</span> <span class="sd">&quot;&quot;&quot;Two-dimensional convolution of a single channel.</span> <span class="sd"> Uses SAME padding with 0s, a stride of 1 and no dilation.</span> <span class="sd"> input: input array with shape (height, width)</span> <span class="sd"> w: filter array with shape (fd, fd) with odd fd.</span> <span class="sd"> Returns a result with the same shape as input.</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="k">assert</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="ow">and</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> 
<span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">1</span> <span class="c1"># SAME padding with zeros: creating a new padded array to simplify index</span> <span class="c1"># calculations and to avoid checking boundary conditions in the inner loop.</span> <span class="c1"># padded_input is like input, but padded on all sides with</span> <span class="c1"># half-the-filter-width of zeros.</span> <span class="n">padded_input</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">pad</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">pad_width</span><span class="o">=</span><span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">//</span> <span class="mi">2</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="s1">&#39;constant&#39;</span><span class="p">,</span> <span class="n">constant_values</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> <span class="n">output</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="nb">input</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">output</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]):</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">output</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]):</span> <span class="c1"># This inner double 
loop computes every output element, by</span> <span class="c1"># multiplying the corresponding window into the input with the</span> <span class="c1"># filter.</span> <span class="k">for</span> <span class="n">fi</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]):</span> <span class="k">for</span> <span class="n">fj</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]):</span> <span class="n">output</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">+=</span> <span class="n">padded_input</span><span class="p">[</span><span class="n">i</span> <span class="o">+</span> <span class="n">fi</span><span class="p">,</span> <span class="n">j</span> <span class="o">+</span> <span class="n">fj</span><span class="p">]</span> <span class="o">*</span> <span class="n">w</span><span class="p">[</span><span class="n">fi</span><span class="p">,</span> <span class="n">fj</span><span class="p">]</span> <span class="k">return</span> <span class="n">output</span> </pre></div> </div> <div class="section" id="convolutions-in-3-and-4-dimensions"> <h2>Convolutions in 3 and 4 dimensions</h2> <p>The convolution computed above works in two dimensions; yet, most convolutions used in DNNs are 4-dimensional. For example, TensorFlow's <tt class="docutils literal">tf.nn.conv2d</tt> op takes a 4D input tensor and a 4D filter tensor. How come?</p> <p>The two additional dimensions in the input tensor are <em>channel</em> and <em>batch</em>. A canonical example of channels is color images in RGB format. 
Each pixel has a value for red, green and blue - three channels overall. So instead of seeing it as a matrix of triples, we can see it as a 3D tensor where one dimension is height, another width and another channel (also called the <em>depth</em> dimension).</p> <p>Batch is somewhat different. ML training - with stochastic gradient descent - is often done in batches for performance; we train the model not on a single sample at a time, but a &quot;batch&quot; of samples, usually some power of two. Performing all the operations in tandem on a batch of data makes it easier to leverage the SIMD capabilities of modern processors. So it doesn't have any mathematical significance here - it can be seen as an outer loop over all operations, performing them for a set of inputs and producing a corresponding set of outputs.</p> <p>For filters, the 4 dimensions are height, width, input channel and output channel. Input channel is the same as the input tensor's; output channel collects multiple filters, each of which can be different.</p> <p>This can be slightly difficult to grasp from text, so here's a diagram:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/2018/conv2d-3d.svg" style="width: 300px;" type="image/svg+xml"> Multi-channel 2D convolution</object> <p>In the diagram and the implementation I'm going to ignore the batch dimension, since it's not really mathematically interesting. So the input image has three dimensions - in this diagram height and width are 8 and depth is 3. The filter is 3x3 with depth 3. In each step, the filter is slid over the input <em>in two dimensions</em>, and all of its elements are multiplied with the corresponding elements in the input. 
That's 3x3x3=27 multiplications added into the output element.</p> <p>Note that this is different from a 3D convolution, where a filter is moved across the input in all 3 dimensions; true 3D convolutions are not widely used in DNNs at this time.</p> <p>So, to reiterate, to compute the multi-channel convolution as shown in the diagram above, we compute each of the 64 output elements by a dot-product of the filter with the relevant parts of the input tensor. This produces a single output channel. To produce additional output channels, we perform the convolution with additional filters. So if our filter has dimensions (3, 3, 3, 4) this means 4 different 3x3x3 filters. The output will thus have dimensions 8x8 for the spatials and 4 for depth.</p> <p>Here's the Numpy implementation of this algorithm:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">conv2d_multi_channel</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">w</span><span class="p">):</span> <span class="sd">&quot;&quot;&quot;Two-dimensional convolution with multiple channels.</span> <span class="sd"> Uses SAME padding with 0s, a stride of 1 and no dilation.</span> <span class="sd"> input: input array with shape (height, width, in_depth)</span> <span class="sd"> w: filter array with shape (fd, fd, in_depth, out_depth) with odd fd.</span> <span class="sd"> in_depth is the number of input channels, and has to be the same as</span> <span class="sd"> input&#39;s in_depth; out_depth is the number of output channels.</span> <span class="sd"> Returns a result with shape (height, width, out_depth).</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="k">assert</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">w</span><span class="o">.</span><span
class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="ow">and</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">1</span> <span class="n">padw</span> <span class="o">=</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">//</span> <span class="mi">2</span> <span class="n">padded_input</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">pad</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">pad_width</span><span class="o">=</span><span class="p">((</span><span class="n">padw</span><span class="p">,</span> <span class="n">padw</span><span class="p">),</span> <span class="p">(</span><span class="n">padw</span><span class="p">,</span> <span class="n">padw</span><span class="p">),</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">mode</span><span class="o">=</span><span class="s1">&#39;constant&#39;</span><span class="p">,</span> <span class="n">constant_values</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> <span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">in_depth</span> <span class="o">=</span> <span class="nb">input</span><span class="o">.</span><span class="n">shape</span> <span class="k">assert</span> <span class="n">in_depth</span> <span class="o">==</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span 
class="p">]</span> <span class="n">out_depth</span> <span class="o">=</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="n">output</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">out_depth</span><span class="p">))</span> <span class="k">for</span> <span class="n">out_c</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">out_depth</span><span class="p">):</span> <span class="c1"># For each output channel, perform 2d convolution summed across all</span> <span class="c1"># input channels.</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">height</span><span class="p">):</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">width</span><span class="p">):</span> <span class="c1"># Now the inner loop also works across all input channels.</span> <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">in_depth</span><span class="p">):</span> <span class="k">for</span> <span class="n">fi</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]):</span> <span class="k">for</span> <span class="n">fj</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">w</span><span class="o">.</span><span 
class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]):</span> <span class="n">w_element</span> <span class="o">=</span> <span class="n">w</span><span class="p">[</span><span class="n">fi</span><span class="p">,</span> <span class="n">fj</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">out_c</span><span class="p">]</span> <span class="n">output</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">out_c</span><span class="p">]</span> <span class="o">+=</span> <span class="p">(</span> <span class="n">padded_input</span><span class="p">[</span><span class="n">i</span> <span class="o">+</span> <span class="n">fi</span><span class="p">,</span> <span class="n">j</span> <span class="o">+</span> <span class="n">fj</span><span class="p">,</span> <span class="n">c</span><span class="p">]</span> <span class="o">*</span> <span class="n">w_element</span><span class="p">)</span> <span class="k">return</span> <span class="n">output</span> </pre></div> <p>An interesting point to note here w.r.t. TensorFlow's <tt class="docutils literal">tf.nn.conv2d</tt> op. If you read its semantics you'll see discussion of <em>layout</em> or <em>data format</em>, which is <tt class="docutils literal">NHWC</tt> by default. NHWC simply means the order of dimensions in a 4D tensor is:</p> <ul class="simple"> <li><strong>N</strong>: batch</li> <li><strong>H</strong>: height (spatial dimension)</li> <li><strong>W</strong>: width (spatial dimension)</li> <li><strong>C</strong>: channel (depth)</li> </ul> <p><tt class="docutils literal">NHWC</tt> is the default layout for TensorFlow; another commonly used layout is <tt class="docutils literal">NCHW</tt>, because it's the format preferred by NVIDIA's DNN libraries. 
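</p> <p>Converting between the two layouts is just an axis permutation; here's a small sketch (my addition, not from the original post) using Numpy:</p>

```python
import numpy as np

# A batch of 10 images, 8x8 pixels, 3 channels, in NHWC layout.
x_nhwc = np.zeros((10, 8, 8, 3))

# Permute the axes to get NCHW: batch, channel, height, width.
x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))
print(x_nchw.shape)  # (10, 3, 8, 8)
```

<p>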
The code samples here follow the default.</p> </div> <div class="section" id="depthwise-convolution"> <h2>Depthwise convolution</h2> <p>Depthwise convolutions are a variation on the operation discussed so far. In the regular 2D convolution performed over multiple input channels, the filter is as deep as the input and lets us freely mix channels to generate each element in the output. Depthwise convolutions don't do that - each channel is kept separate - hence the name <em>depthwise</em>. Here's a diagram to help explain how that works:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/2018/conv2d-depthwise.svg" style="width: 500px;" type="image/svg+xml"> Depthwise 2D convolution</object> <p>There are three conceptual stages here:</p> <ol class="arabic simple"> <li>Split the input into channels, and split the filter into channels (the number of channels between input and filter must match).</li> <li>For each of the channels, convolve the input with the corresponding filter, producing an output tensor (2D).</li> <li>Stack the output tensors back together.</li> </ol> <p>Here's the code implementing it:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">depthwise_conv2d</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">w</span><span class="p">):</span> <span class="sd">&quot;&quot;&quot;Two-dimensional depthwise convolution.</span> <span class="sd"> Uses SAME padding with 0s, a stride of 1 and no dilation. 
A single output</span> <span class="sd"> channel is used per input channel (channel_multiplier=1).</span> <span class="sd"> input: input array with shape (height, width, in_depth)</span> <span class="sd"> w: filter array with shape (fd, fd, in_depth)</span> <span class="sd"> Returns a result with shape (height, width, in_depth).</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="k">assert</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="ow">and</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">1</span> <span class="n">padw</span> <span class="o">=</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">//</span> <span class="mi">2</span> <span class="n">padded_input</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">pad</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">pad_width</span><span class="o">=</span><span class="p">((</span><span class="n">padw</span><span class="p">,</span> <span class="n">padw</span><span class="p">),</span> <span class="p">(</span><span class="n">padw</span><span class="p">,</span> <span class="n">padw</span><span class="p">),</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">mode</span><span class="o">=</span><span 
class="s1">&#39;constant&#39;</span><span class="p">,</span> <span class="n">constant_values</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> <span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">in_depth</span> <span class="o">=</span> <span class="nb">input</span><span class="o">.</span><span class="n">shape</span> <span class="k">assert</span> <span class="n">in_depth</span> <span class="o">==</span> <span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="n">output</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">in_depth</span><span class="p">))</span> <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">in_depth</span><span class="p">):</span> <span class="c1"># For each input channel separately, apply its corresponsing filter</span> <span class="c1"># to the input.</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">height</span><span class="p">):</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">width</span><span class="p">):</span> <span class="k">for</span> <span class="n">fi</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]):</span> <span class="k">for</span> <span 
class="n">fj</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">w</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]):</span> <span class="n">w_element</span> <span class="o">=</span> <span class="n">w</span><span class="p">[</span><span class="n">fi</span><span class="p">,</span> <span class="n">fj</span><span class="p">,</span> <span class="n">c</span><span class="p">]</span> <span class="n">output</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">c</span><span class="p">]</span> <span class="o">+=</span> <span class="p">(</span> <span class="n">padded_input</span><span class="p">[</span><span class="n">i</span> <span class="o">+</span> <span class="n">fi</span><span class="p">,</span> <span class="n">j</span> <span class="o">+</span> <span class="n">fj</span><span class="p">,</span> <span class="n">c</span><span class="p">]</span> <span class="o">*</span> <span class="n">w_element</span><span class="p">)</span> <span class="k">return</span> <span class="n">output</span> </pre></div> <p>In TensorFlow, the corresponding op is <tt class="docutils literal">tf.nn.depthwise_conv2d</tt>; this op has the notion of <em>channel multiplier</em> which lets us compute multiple outputs for each input channel (somewhat like the number of output channels concept in <tt class="docutils literal">conv2d</tt>).</p> </div> <div class="section" id="depthwise-separable-convolution"> <h2>Depthwise separable convolution</h2> <p>The depthwise convolution shown above is more commonly used in combination with an additional step to mix in the channels - <em>depthwise separable convolution</em> <a class="footnote-reference" href="#id2" id="id1"></a>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/2018/conv2d-depthwise-separable.svg" style="width: 
500px;" type="image/svg+xml"> Depthwise separable convolution</object> <p>After completing the depthwise convolution, an additional step is performed: a 1x1 convolution across channels. This is exactly the same operation as the &quot;convolution in 3 dimensions&quot; discussed earlier - just with a 1x1 spatial filter. This step can be repeated multiple times for different output channels. The output channels all take the output of the depthwise step and mix it up with different 1x1 convolutions. Here's the implementation:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">separable_conv2d</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">w_depth</span><span class="p">,</span> <span class="n">w_pointwise</span><span class="p">):</span> <span class="sd">&quot;&quot;&quot;Depthwise separable convolution.</span> <span class="sd"> Performs 2d depthwise convolution with w_depth, and then applies a pointwise</span> <span class="sd"> 1x1 convolution with w_pointwise on the result.</span> <span class="sd"> Uses SAME padding with 0s, a stride of 1 and no dilation. A single output</span> <span class="sd"> channel is used per input channel (channel_multiplier=1) in w_depth.</span> <span class="sd"> input: input array with shape (height, width, in_depth)</span> <span class="sd"> w_depth: depthwise filter array with shape (fd, fd, in_depth)</span> <span class="sd"> w_pointwise: pointwise filter array with shape (in_depth, out_depth)</span> <span class="sd"> Returns a result with shape (height, width, out_depth).</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="c1"># First run the depthwise convolution. 
Its result has the same shape as</span> <span class="c1"># input.</span> <span class="n">depthwise_result</span> <span class="o">=</span> <span class="n">depthwise_conv2d</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">w_depth</span><span class="p">)</span> <span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">in_depth</span> <span class="o">=</span> <span class="n">depthwise_result</span><span class="o">.</span><span class="n">shape</span> <span class="k">assert</span> <span class="n">in_depth</span> <span class="o">==</span> <span class="n">w_pointwise</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="n">out_depth</span> <span class="o">=</span> <span class="n">w_pointwise</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="n">output</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">out_depth</span><span class="p">))</span> <span class="k">for</span> <span class="n">out_c</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">out_depth</span><span class="p">):</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">height</span><span class="p">):</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">width</span><span class="p">):</span> <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> 
<span class="nb">range</span><span class="p">(</span><span class="n">in_depth</span><span class="p">):</span> <span class="n">w_element</span> <span class="o">=</span> <span class="n">w_pointwise</span><span class="p">[</span><span class="n">c</span><span class="p">,</span> <span class="n">out_c</span><span class="p">]</span> <span class="n">output</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">out_c</span><span class="p">]</span> <span class="o">+=</span> <span class="n">depthwise_result</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">c</span><span class="p">]</span> <span class="o">*</span> <span class="n">w_element</span> <span class="k">return</span> <span class="n">output</span> </pre></div> <p>In TensorFlow, this op is called <tt class="docutils literal">tf.nn.separable_conv2d</tt>. Similarly to our implementation, it takes two different filter parameters: <tt class="docutils literal">depthwise_filter</tt> for the depthwise step and <tt class="docutils literal">pointwise_filter</tt> for the mixing step.</p> <p>Depthwise separable convolutions have become popular in DNN models recently, for two reasons:</p> <ol class="arabic simple"> <li>They have fewer parameters than &quot;regular&quot; convolutional layers, and thus are less prone to overfitting.</li> <li>With fewer parameters, they also require fewer operations to compute, and thus are cheaper and faster.</li> </ol> <p>Let's examine the difference between the number of parameters first. 
We'll start with some definitions:</p> <ul class="simple"> <li><tt class="docutils literal">S</tt>: spatial dimension - width and height, assuming square inputs.</li> <li><tt class="docutils literal">F</tt>: filter width and height, assuming square filter.</li> <li><tt class="docutils literal">inC</tt>: number of input channels.</li> <li><tt class="docutils literal">outC</tt>: number of output channels.</li> </ul> <p>We also assume <tt class="docutils literal">SAME</tt> padding as discussed above, so that the spatial size of the output matches the input.</p> <p>In a regular convolution there are <tt class="docutils literal">F*F*inC*outC</tt> parameters, because every filter is 3D and there's one such filter per output channel.</p> <p>In depthwise separable convolutions there are <tt class="docutils literal">F*F*inC</tt> parameters for the depthwise part, and then <tt class="docutils literal">inC*outC</tt> parameters for the mixing part. It should be obvious that for a non-trivial <tt class="docutils literal">outC</tt>, the sum of these two is significantly smaller than <tt class="docutils literal">F*F*inC*outC</tt>.</p> <p>Now on to computational cost. For a regular convolution, we perform <tt class="docutils literal">F*F*inC</tt> operations at each position of the input (to compute the 2D convolution over 3 dimensions). For the whole input, the number of computations is thus <tt class="docutils literal">F*F*inC*S*S</tt> and taking all the output channels we get <tt class="docutils literal">F*F*inC*S*S*outC</tt>.</p> <p>For depthwise separable convolutions we need <tt class="docutils literal">F*F*inC*S*S</tt> operations for the depthwise part; then we need <tt class="docutils literal">S*S*inC*outC</tt> operations for the mixing part. 
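These formulas are easy to wrap in a small helper for experimenting with the counts (a sketch; the function name is ours):

```python
def conv_costs(S, F, inC, outC):
    """Parameter and operation counts for regular vs. depthwise separable
    convolution, following the definitions above (square input of side S,
    square filter of side F, SAME padding, stride 1)."""
    regular_params = F * F * inC * outC
    regular_ops = F * F * inC * S * S * outC
    separable_params = F * F * inC + inC * outC
    separable_ops = F * F * inC * S * S + S * S * inC * outC
    return regular_params, regular_ops, separable_params, separable_ops

print(conv_costs(128, 3, 3, 16))  # (432, 7077888, 75, 1228800)
```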
Let's use some real numbers to get a feel for the difference:</p> <p>We'll assume <tt class="docutils literal">S=128</tt>, <tt class="docutils literal">F=3</tt>, <tt class="docutils literal">inC=3</tt>, <tt class="docutils literal">outC=16</tt>. For regular convolution:</p> <ul class="simple"> <li>Parameters: <tt class="docutils literal">3*3*3*16 = 432</tt></li> <li>Computation cost: <tt class="docutils literal">3*3*3*128*128*16 = ~7e6</tt></li> </ul> <p>For depthwise separable convolution:</p> <ul class="simple"> <li>Parameters: <tt class="docutils literal">3*3*3+3*16 = 75</tt></li> <li>Computation cost: <tt class="docutils literal">3*3*3*128*128+128*128*3*16 = ~1.2e6</tt></li> </ul> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id2" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>The term <em>separable</em> comes from image processing, where <em>spatially separable convolutions</em> are sometimes used to save on computation resources. A spatial convolution is separable when the 2D convolution filter can be expressed as an outer product of two vectors. This lets us compute some 2D convolutions more cheaply. In the case of DNNs, the spatial filter is not necessarily separable but the channel dimension is separable from the spatial dimensions.</td></tr> </tbody> </table> </div> The Confusion Matrix in statistical tests2018-03-26T05:47:00-07:002018-03-26T05:47:00-07:00Eli Benderskytag:eli.thegreenplace.net,2018-03-26:/2018/the-confusion-matrix-in-statistical-tests/<p>This winter was one of the worst flu seasons in recent years, so I found myself curious to learn more about the diagnostic flu tests available to doctors in addition to the usual &quot;looks like bad cold but no signs of bacteria&quot; strategy. 
There's a wide array of RIDTs (Rapid …</p><p>This winter was one of the worst flu seasons in recent years, so I found myself curious to learn more about the diagnostic flu tests available to doctors in addition to the usual &quot;looks like bad cold but no signs of bacteria&quot; strategy. There's a wide array of RIDTs (Rapid Influenza Diagnostic Tests) available to doctors today <a class="footnote-reference" href="#id3" id="id1"></a>, and reading through the literature quickly has you deciphering statements like:</p> <blockquote> Overall, RIDTs had a modest <em>sensitivity</em> of 62.3% and a high <em>specificity</em> of 98.2%, corresponding to a <em>positive likelihood ratio</em> of 34.5 and a <em>negative likelihood ratio of 0.38</em>. For the clinician, this means that although <em>false-negatives</em> are frequent (occurring in nearly four out of ten negative RIDTs), a positive test is unlikely to be a <em>false-positive</em> result. A diagnosis of influenza can thus confidently be made in the presence of a positive RIDT. However, a negative RIDT result is unreliable and should be confirmed by traditional diagnostic tests if the result is likely to affect patient management.</blockquote> <p>While I had heard about statistical test quality measures like <em>sensitivity</em> before, there are too many terms here to remember for someone not dealing with these things routinely; this post is my attempt at documenting this understanding for future use.</p> <div class="section" id="a-table-of-test-outcomes"> <h2>A table of test outcomes</h2> <p>Let's say there is a condition with a binary outcome (&quot;yes&quot; vs. &quot;no&quot;, 1 vs 0, or whatever you want to call it). Suppose we conduct a test that is designed to detect this condition; the test also has a binary outcome. 
The totality of outcomes can thus be represented with a 2-by-2 table, which is also called the <a class="reference external" href="https://en.wikipedia.org/wiki/Confusion_matrix">Confusion Matrix</a>.</p> <p>Suppose 10000 patients get tested for flu; out of them, 9000 are actually healthy and 1000 are actually sick. For the sick people, a test was positive for 620 and negative for 380. For the healthy people, the same test was positive for 180 and negative for 8820. Let's summarize these results in a table:</p> <img alt="Confusion matrix with numbers only" class="align-center" src="https://eli.thegreenplace.net/images/2018/confusionmatrix.png" /> <p>Now comes our first batch of definitions.</p> <ul class="simple"> <li><strong>True Positive (TP)</strong>: positive test result matches reality - person is actually sick and tested positive.</li> <li><strong>False Positive (FP)</strong>: positive test result doesn't match reality - test is positive but the person is not actually sick.</li> <li><strong>True Negative (TN)</strong>: negative test result matches reality - person is not sick and tested negative.</li> <li><strong>False Negative (FN)</strong>: negative test result doesn't match reality - test is negative but the person is actually sick.</li> </ul> <p>Folks get confused with these often, so here's a useful heuristic: positive vs. negative reflects the test outcome; true vs. 
false reflects whether the test got it right or got it wrong.</p> <p>Since the rest of the definitions build upon these, here's the confusion matrix again now with them embedded:</p> <img alt="Confusion matrix with TP, FP, TN, FN marked" class="align-center" src="https://eli.thegreenplace.net/images/2018/confusionmatrix-tptnfpfn.png" /> </div> <div class="section" id="definition-soup"> <h2>Definition soup</h2> <p>Armed with these and <strong>N</strong> for the <em>total population</em> (10000 in our case), we are now ready to tackle the multitude of definitions statisticians have produced over the years to describe the performance of tests:</p> <ul class="simple"> <li><strong>Prevalence</strong>: how common is the actual disease in the population<ul> <li>(FN+TP)/N</li> <li>In the example: (380+620)/10000=0.1</li> </ul> </li> <li><strong>Accuracy</strong>: how often is the test correct<ul> <li>(TP+TN)/N</li> <li>In the example: (620+8820)/10000=0.944</li> </ul> </li> <li><strong>Misclassification rate</strong>: how often the test is wrong<ul> <li>1 - Accuracy = (FP+FN)/N</li> <li>In the example: (180+380)/10000=0.056</li> </ul> </li> <li><strong>Sensitivity</strong> or <strong>True Positive Rate (TPR)</strong> or <strong>Recall</strong>: when the patient is sick, how often does the test actually predict it correctly<ul> <li>TP/(TP+FN)</li> <li>In the example: 620/(620+380)=0.62</li> </ul> </li> <li><strong>Specificity</strong> or <strong>True Negative Rate (TNR)</strong>: when the patient is not sick, how often does the test actually predict it correctly<ul> <li>TN/(TN+FP)</li> <li>In the example: 8820/(8820+180)=0.98</li> </ul> </li> <li><strong>False Positive Rate (FPR)</strong>: probability of false alarm<ul> <li>1 - Specificity = FP/(TN+FP)</li> <li>In the example: 180/(8820+180)=0.02</li> </ul> </li> <li><strong>False Negative Rate (FNR)</strong>: miss rate, the probability that the test misses a sickness<ul> <li>1 - Sensitivity = FN/(TP+FN)</li> <li>In the 
example: 380/(620+380)=0.38</li> </ul> </li> <li><strong>Precision</strong> or <strong>Positive Predictive Value (PPV)</strong>: when the prediction is positive, how often is it correct<ul> <li>TP/(TP+FP)</li> <li>In the example: 620/(620+180)=0.775</li> </ul> </li> <li><strong>Negative Predictive Value (NPV)</strong>: when the prediction is negative, how often is it correct<ul> <li>TN/(TN+FN)</li> <li>In the example: 8820/(8820+380)=0.959</li> </ul> </li> <li><strong>Positive Likelihood Ratio</strong>: the ratio of the probability of a positive result for a sick person to that for a healthy person (used with odds formulations of probability)<ul> <li>TPR/FPR</li> <li>In the example: 0.62/0.02=31</li> </ul> </li> <li><strong>Negative Likelihood Ratio</strong>: the ratio of the probability of a negative result for a sick person to that for a healthy person<ul> <li>FNR/TNR</li> <li>In the example: 0.38/0.98=0.388</li> </ul> </li> </ul> <p><a class="reference external" href="https://en.wikipedia.org/wiki/Confusion_matrix">The wikipedia page</a> has even more.</p> </div> <div class="section" id="deciphering-our-example"> <h2>Deciphering our example</h2> <p>Now back to the flu test example this post began with. RIDTs are said to have sensitivity of 62.3%; this is just a clever way of saying that for a person with flu, the test will be positive 62.3% of the time. 
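The definition soup above can be double-checked mechanically from the four counts in the example's confusion matrix; here's a plain-Python sanity check (variable names are ours):

```python
# Counts from the example's confusion matrix (10000-patient sample).
TP, FP, FN, TN = 620, 180, 380, 8820
N = TP + FP + FN + TN

prevalence  = (TP + FN) / N        # 0.1
accuracy    = (TP + TN) / N        # 0.944
sensitivity = TP / (TP + FN)       # 0.62  - TPR, recall
specificity = TN / (TN + FP)       # 0.98  - TNR
fpr         = FP / (TN + FP)       # 0.02
fnr         = FN / (TP + FN)       # 0.38
precision   = TP / (TP + FP)       # 0.775 - PPV
npv         = TN / (TN + FN)       # ~0.959
plr         = sensitivity / fpr    # ~31
nlr         = fnr / specificity    # ~0.388
```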
For people who do not have the flu, the test is more accurate since its specificity is 98.2% - only 1.8% of healthy people will be flagged positive.</p> <p>The positive likelihood ratio is said to be 34.5; let's see how it was computed:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/85bd9dc7996bdc9198da6226a73cfb1e900734ea.svg" style="height: 41px;" type="image/svg+xml"> $PLR=\frac{TPR}{FPR}=\frac{Sensitivity}{1-Specificity}=\frac{0.623}{1-0.982}=35$</object> <p>This is to say - a positive result is about 35 times more likely for a sick person than for a healthy one.</p> <p>And the negative likelihood ratio is said to be 0.38:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/895921ee70c59a531ceaff02c94404e5d5316697.svg" style="height: 41px;" type="image/svg+xml"> $NLR=\frac{FNR}{TNR}=\frac{1-Sensitivity}{Specificity}=\frac{1-0.623}{0.982}=0.38$</object> <p>This is to say - a negative result is only about 0.38 times as likely for a sick person as for a healthy one.</p> <p>In other words, a positive result from these flu tests is quite trustworthy, while a negative result is much less reliable - which is exactly what the quoted paragraph at the top of the post ends up saying.</p> </div> <div class="section" id="back-to-bayes"> <h2>Back to Bayes</h2> <p>An astute reader will notice that the previous sections talk about the probability of test outcomes given sickness, when we're usually interested in the opposite - given a positive test, how likely is it that the person is actually sick.</p> <p><a class="reference external" href="https://eli.thegreenplace.net/2018/conditional-probability-and-bayes-theorem/">My previous post on the Bayes theorem</a> covered this issue in depth <a class="footnote-reference" href="#id4" id="id2"></a>. Let's recap, using the actual numbers from our example. 
The events are:</p> <ul class="simple"> <li><img alt="T" class="valign-0" src="https://eli.thegreenplace.net/images/math/c2c53d66948214258a26ca9ca845d7ac0c17f8e7.png" style="height: 12px;" />: test is positive</li> <li><object class="valign-0" data="https://eli.thegreenplace.net/images/math/0e4c77261e251cb98e8cedc2b74772ae6f14318d.svg" style="height: 15px;" type="image/svg+xml">T^C</object>: test is negative</li> <li><object class="valign-0" data="https://eli.thegreenplace.net/images/math/e69f20e9f683920d3fb4329abd951e878b1f9372.svg" style="height: 12px;" type="image/svg+xml">F</object>: person actually sick with flu</li> <li><object class="valign-0" data="https://eli.thegreenplace.net/images/math/9f37c126895eff99088c5545433d3e33692aa267.svg" style="height: 15px;" type="image/svg+xml">F^C</object>: person doesn't have flu</li> </ul> <p>Sensitivity of 0.623 means <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/27fd26004577f906dcaaefbf0553b7232c432d0a.svg" style="height: 18px;" type="image/svg+xml">P(T|F)=0.623</object>; similarly, specificity is <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/72e4f07dc70098feeb50032960ebe0478451e087.svg" style="height: 19px;" type="image/svg+xml">P(T^C|F^C)=0.982</object>. 
We're interested in finding <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b8a3711b3dc7ec3b1a73a82447c88ce067999fed.svg" style="height: 18px;" type="image/svg+xml">P(F|T)</object>, and we can use the Bayes theorem for that:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2d2d2b1a53f1b7a76155a9abe9b7b5d35198e669.svg" style="height: 42px;" type="image/svg+xml"> $P(F|T)=\frac{P(T|F)P(F)}{P(T)}$</object> <p>Recall that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ce2db3a1be8bbfefc6a7bdee94dcaca4f5426799.svg" style="height: 18px;" type="image/svg+xml">P(F)</object> is the <em>prevalence</em> of flu in the general population; for the sake of this example let's assume it's 0.1; we'll then compute <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9d3d6e3b10a97d19adafbea8cc72b8e3619a1d27.svg" style="height: 18px;" type="image/svg+xml">P(T)</object> by using the law of total probability as follows:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7b37bf70ff6ae37f99ca38df902a48adbf14a542.svg" style="height: 21px;" type="image/svg+xml"> $P(T)=P(T|F)P(F)+P(T|F^C)P(F^C)$</object> <p>Obviously, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a7828371e81491815c4498bc5dcb7407b63beeed.svg" style="height: 19px;" type="image/svg+xml">P(T|F^C)=1-P(T^C|F^C)=0.018</object>, so:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/9f16b641c00488170b04f5bb409a5d0d21c958c8.svg" style="height: 18px;" type="image/svg+xml"> $P(T)=0.623\ast0.1 + 0.018\ast0.9=0.0785$</object> <p>And then:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/41b1c0bec6fa21e5322dd66a756d5e9d9e7253f3.svg" style="height: 36px;" type="image/svg+xml"> $P(F|T)=\frac{0.623\ast 0.1}{0.0785}=0.79$</object> <p>So the probability of having flu given a positive test and a 10% flu prevalence is 79%. 
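The whole chain - Bayes' theorem plus the law of total probability - fits in a tiny function, which makes it easy to replay with other priors (a sketch; the function name and the RIDT-based defaults are ours):

```python
def p_flu_given_positive(prevalence, sensitivity=0.623, specificity=0.982):
    """P(F|T): probability of flu given a positive test.

    P(T) is expanded with the law of total probability, then Bayes'
    theorem gives P(F|T) = P(T|F)P(F) / P(T).
    """
    p_t = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    return sensitivity * prevalence / p_t

print(round(p_flu_given_positive(0.1), 2))  # 0.79
```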
The prevalence strongly affects the outcome! Let's plot <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b8a3711b3dc7ec3b1a73a82447c88ce067999fed.svg" style="height: 18px;" type="image/svg+xml">P(F|T)</object> as a function of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ce2db3a1be8bbfefc6a7bdee94dcaca4f5426799.svg" style="height: 18px;" type="image/svg+xml">P(F)</object> for some reasonable range of values:</p> <img alt="P(F|T) as function of prevalence" class="align-center" src="https://eli.thegreenplace.net/images/2018/pft-prevalence-plot.png" /> <p>Note how low the value of the test becomes with low disease prevalence - we've also observed this phenomenon in <a class="reference external" href="https://eli.thegreenplace.net/2018/conditional-probability-and-bayes-theorem/">the previous post</a>; there's a &quot;tug of war&quot; between the prevalence and the test's sensitivity and specificity. In fact, <a class="reference external" href="https://www.cdc.gov/flu/professionals/diagnosis/rapidlab.htm">the official CDC guidelines page</a> for interpreting RIDT results discusses this:</p> <blockquote> When influenza prevalence is relatively low, the positive predictive value (PPV) is low and false-positive test results are more likely. By contrast, when influenza prevalence is low, the negative predictive value (NPV) is high, and negative results are more likely to be true.</blockquote> <p>And then goes on to present a handy table for estimating PPV based on prevalence and specificity.</p> <p>Naturally, the rapid test is not the only tool in the doctor's toolbox. Flu has other symptoms, and by observing them on the patient the doctor can increase their confidence in the diagnosis. 
For example, if the probability <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b8a3711b3dc7ec3b1a73a82447c88ce067999fed.svg" style="height: 18px;" type="image/svg+xml">P(F|T)</object> given 10% prevalence is 0.79 (as computed above), the doctor may be significantly less sure of the results if flu symptoms like cough and fever are not present. The CDC discusses this in more detail with an <a class="reference external" href="https://www.cdc.gov/flu/professionals/diagnosis/algorithm-results-not-circulating.htm">algorithm for interpreting flu results</a>.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id3" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>Slower tests like full viral cultures are also available, and they are very accurate. The problem is that these tests take a long time to complete - days - so they're usually not very useful in treating the disease. Anti-viral medication is only useful in the first 48 hours after disease onset. RIDTs provide results within hours, or even minutes.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>In that post we didn't distinguish between sensitivity and specificity, but assumed they're equal at 90%. 
It's much more common for these measures to be different, but it doesn't actually complicate the computations.</td></tr> </tbody> </table> </div> Conditional probability and Bayes' theorem2018-03-13T05:32:00-07:002018-03-13T05:32:00-07:00Eli Benderskytag:eli.thegreenplace.net,2018-03-13:/2018/conditional-probability-and-bayes-theorem/<p>One morning, while seeing a mention of a disease on Hacker News, Bob decides on a whim to get tested for it; there are no other symptoms, he's just curious. He convinces his doctor to order a blood test, which is known to be 90% accurate. For 9 out of …</p><p>One morning, while seeing a mention of a disease on Hacker News, Bob decides on a whim to get tested for it; there are no other symptoms, he's just curious. He convinces his doctor to order a blood test, which is known to be 90% accurate. For 9 out of 10 sick people it will detect the disease (but for 1 out of 10 it won't); similarly, for 9 out of 10 healthy people it will report no disease (but for 1 out of 10 it will).</p> <p>Unfortunately for Bob, his test is positive; what's the probability that Bob actually has the disease?</p> <p>You might be tempted to say 90%, but this is wrong. One of the most common fallacies made in probability and statistics is mixing up conditional probabilities. Given event D - &quot;Bob has disease&quot; and event T - &quot;test was positive&quot;, we want to know what is <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/28f810c7c292ba7faa5b47a0b4e0d470f79b19d8.svg" style="height: 18px;" type="image/svg+xml">P(D|T)</object> - the conditional probability of D given T. 
But the test result is actually giving us <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7455e3a5dfb3c45f4a3b110a7906341362a50e53.svg" style="height: 18px;" type="image/svg+xml">P(T|D)</object> - which is distinct from <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/28f810c7c292ba7faa5b47a0b4e0d470f79b19d8.svg" style="height: 18px;" type="image/svg+xml">P(D|T)</object>.</p> <p>In fact, the problem doesn't provide enough details to answer the question. An important detail that's missing is the <em>prevalence</em> of the disease in the population; that is, the value of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f85da048bad405b6f81778501db99b331b689d46.svg" style="height: 18px;" type="image/svg+xml">P(D)</object> without being conditioned on anything. Let's say that it's a moderately common disease with 2% prevalence.</p> <p>To solve this without any clever probability formulae, we can resort to the basic technique of counting by cases. Let's assume there is a sample of 10,000 people <a class="footnote-reference" href="#id4" id="id1"></a>; test aside, how many of them have the disease? 2%, so 200.</p> <img alt="Bayes counting disease calculation prevalence" class="align-center" src="https://eli.thegreenplace.net/images/2018/bayes-count-disease-1.png" /> <p>Of the people who have the disease, 90% will test positive and 10% will test negative. Similarly, of the people with no disease, 90% will test negative and 10% will test positive. Graphically:</p> <img alt="Bayes counting disease calculation prevalence and test" class="align-center" src="https://eli.thegreenplace.net/images/2018/bayes-count-disease-2.png" /> <p>Now we just have to count. There are 980 + 180 = 1160 people who tested positive in the sample population. Of these people, 180 have the disease. In other words, given that Bob is in the &quot;tested positive&quot; population, his chance of having the disease is 180/1160 = 15.5%.
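The counting argument translates directly into a few lines of code; here's a sketch using the numbers from the example (10,000 people, 2% prevalence, 90% accuracy):

```python
# Counting by cases: 10,000 people, 2% prevalence, 90% accurate test.
population = 10_000
sick = int(population * 0.02)         # 200 people have the disease
healthy = population - sick           # 9800 don't

true_positives = int(sick * 0.9)      # 180 sick people test positive
false_positives = int(healthy * 0.1)  # 980 healthy people test positive

# Of everyone who tested positive, what fraction is actually sick?
p_sick_given_positive = true_positives / (true_positives + false_positives)
print(f"{p_sick_given_positive:.1%}")  # 15.5%
```
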
This is <em>far</em> lower than the 90% test accuracy; conditional probability often produces surprising results. To motivate this, consider that the number of <em>true positives</em> (people with the disease that tested positive) is 180, while the number of <em>false positives</em> (people w/o the disease that tested positive) is 980. So the chance of being in the second group is larger.</p> <div class="section" id="conditional-probability"> <h2>Conditional probability</h2> <p>As the examples shown above demonstrate, conditional probabilities involve questions like &quot;what's the chance of A happening, given that B happened&quot;, and they are far from being intuitive. Luckily, the mathematical theory of probability gives us the precise and rigorous tools necessary to reason about such problems with relative elegance.</p> <p>The conditional probability <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c8937022ac4e55a642f3bd850e6e9b17dd8fc8d3.svg" style="height: 18px;" type="image/svg+xml">P(A|B)</object> means &quot;what is the probability of event A given that we know event B occurred&quot;. Its mathematical definition is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/990efadb78ac3b3145842995f70010710da00dc2.svg" style="height: 42px;" type="image/svg+xml"> $P(A|B)=\frac{P(A\cap B)}{P(B)}$</object> <p>Notes:</p> <ul class="simple"> <li>Obviously, this is only defined when <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/89a84f08607a9c7fe798b67b1d6f7778c6b2e366.svg" style="height: 18px;" type="image/svg+xml">P(B)&gt;0</object>.</li> <li>Here <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/766349d42cbd0bbefc0311ba2e67a3c1da93625f.svg" style="height: 18px;" type="image/svg+xml">P(A\cap B)</object> is the probability that both A and B occurred.</li> </ul> <p>The first time you look at it, the definition of conditional probability looks somewhat unintuitive. 
Why is the connection made this way? Here's a visualization that I found useful:</p> <img alt="Sample space dots visualization for conditional probability" class="align-center" src="https://eli.thegreenplace.net/images/2018/samplespace.png" /> <p>The dots in the black square represent the &quot;universe&quot;, our whole sampling space (let's call it S, and then <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2db3db98100616af986bd71ff1b3b779df968b9f.svg" style="height: 18px;" type="image/svg+xml">P(S)=1</object>). A and B are events. Here <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/3a0b4d21054323cff53c2226dfc210377dbf4588.svg" style="height: 22px;" type="image/svg+xml">P(A)=\frac{30}{64}</object> and <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/424554d20e0a324069ca970290cb1f937eacb2f5.svg" style="height: 22px;" type="image/svg+xml">P(B)=\frac{18}{64}</object>. But what is <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c8937022ac4e55a642f3bd850e6e9b17dd8fc8d3.svg" style="height: 18px;" type="image/svg+xml">P(A|B)</object>? Let's figure it out graphically. We know that the outcome is one of the dots encircled in red. What is the chance we got a dot also encircled in blue? It's the number of dots that are both red and blue, divided by the total number of dots in red. 
Probabilities are calculated as these counts normalized by the size of the whole sample space; all the numbers are divided by 64, so these denominators cancel out; we'll have:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4d783b6592e32a7119193d0bad39843877f398e9.svg" style="height: 42px;" type="image/svg+xml"> $P(A|B)=\frac{P(A\cap B)}{P(B)} = \frac{9}{18} = \frac{1}{2}$</object> <p>In words - the probability that A happened, given that B happened, is 1/2, which makes sense when you eyeball the diagram, and assuming events are uniformly distributed (that is, no dot is inherently more likely to be the outcome than any other dot).</p> <p>Another explanation that always made sense to me was to multiply both sides of the definition of conditional probability by the denominator, to get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/942d1e523c4b088a86138644ceef8512fd97e877.svg" style="height: 18px;" type="image/svg+xml"> $P(A|B)P(B)=P(A\cap B)$</object> <p>In words: we know the chance that A happens given B; if we multiply this by the chance that B happens, we get the chance both A and B happened.</p> <p>Finally, since <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/526645ebc23e442dd422c3ee7956b3503578d512.svg" style="height: 18px;" type="image/svg+xml">P(A\cap B)=P(B\cap A)</object>, we can freely exchange A and B in these definitions (they're arbitrary labels, after all), to get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/d748581a273f112cf0af1add7fc7320283f0b226.svg" style="height: 18px;" type="image/svg+xml"> $\begin{equation} P(A\cap B)=P(A|B)P(B)=P(B|A)P(A) \tag{1} \end{equation}$</object> <p>This is an important equation we'll use later on.</p> </div> <div class="section" id="independence-of-events"> <h2>Independence of events</h2> <p>By definition, two events A and B are <em>independent</em> if:</p> <object class="align-center" 
data="https://eli.thegreenplace.net/images/math/39d4d03d68cb696d28d659e5ba0d3c7c1b474a44.svg" style="height: 18px;" type="image/svg+xml"> $P(A\cap B)=P(A)P(B)$</object> <p>Using conditional probability, we can provide a slightly different definition. Since:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/990efadb78ac3b3145842995f70010710da00dc2.svg" style="height: 42px;" type="image/svg+xml"> $P(A|B)=\frac{P(A\cap B)}{P(B)}$</object> <p>And <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/79ab089f069fc76a74136c5331d4655c256b1129.svg" style="height: 18px;" type="image/svg+xml">P(A\cap B)=P(A)P(B)</object>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/14ce0bcf839194d2662ca7b52f59521a39c8cabc.svg" style="height: 42px;" type="image/svg+xml"> $P(A|B)=\frac{P(A)P(B)}{P(B)}=P(A)$</object> <p>As long as <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/89a84f08607a9c7fe798b67b1d6f7778c6b2e366.svg" style="height: 18px;" type="image/svg+xml">P(B)&gt;0</object>, for independent A and B we have <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c399868e8ef09ea7c4e694fce59aad863775bdb0.svg" style="height: 18px;" type="image/svg+xml">P(A|B)=P(A)</object>; in words - B doesn't affect the probability of A in any way. Similarly we can show that for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ee817c4528261210d55c260023c288b451672b8c.svg" style="height: 18px;" type="image/svg+xml">P(A)&gt;0</object> we have <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/e86a7ac5a9f6584ffbc2ede7f26664b7bc9aeead.svg" style="height: 18px;" type="image/svg+xml">P(B|A)=P(B)</object>.</p> <p>Independence also extends to the complements of events. 
Recall that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d90bd1cf9cdf80fe34935388ba98fc83fced8881.svg" style="height: 19px;" type="image/svg+xml">P(B^C)</object> is the probability that B <em>did not</em> occur, or <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ace42eb9c0d264d434a445f8efa82a392665ed7d.svg" style="height: 18px;" type="image/svg+xml">1-P(B)</object>; since conditional probabilities obey the usual probability axioms, we have: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/58060444dee52056dd38f0178bc90d973cb4de6e.svg" style="height: 19px;" type="image/svg+xml">P(B^C|A)=1-P(B|A)</object>. Then, if A and B are independent:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2fd0e5ba3a9f72858b5289c505ee4ffa1a433047.svg" style="height: 21px;" type="image/svg+xml"> $P(B^C|A)=1-P(B)=P(B^C)$</object> <p>Therefore, <object class="valign-0" data="https://eli.thegreenplace.net/images/math/5ff671c7bd1273544cca53c173582f98ff8a099d.svg" style="height: 15px;" type="image/svg+xml">B^C</object> is independent of A. 
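As a quick numeric sanity check of this result, here's a sketch with two hypothetical independent events (the probabilities 0.5 and 0.4 are arbitrary values picked for illustration):

```python
# Two hypothetical independent events: P(A) = 0.5, P(B) = 0.4.
p_a, p_b = 0.5, 0.4
p_a_and_b = p_a * p_b  # independence: P(A ∩ B) = P(A)P(B)

# P(B^C | A) = P(A ∩ B^C) / P(A) = (P(A) - P(A ∩ B)) / P(A)
p_not_b_given_a = (p_a - p_a_and_b) / p_a

# This equals P(B^C) = 1 - P(B), so B's complement is independent of A.
print(p_not_b_given_a, 1 - p_b)  # both are 0.6
```
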
Similarly the complement of A is independent of B, and the two complements are independent of each other.</p> </div> <div class="section" id="bayes-theorem"> <h2>Bayes' theorem</h2> <p>Starting with equation (1) from above:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4682360529fab726af4caf0827dbb6edd5f9d902.svg" style="height: 18px;" type="image/svg+xml"> $P(A\cap B)=P(A|B)P(B)=P(B|A)P(A)$</object> <p>And taking the right-hand-side equality and dividing it by <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d3ecb6c1c04b6ea74af4cacdcb3f1e1bead3b66e.svg" style="height: 18px;" type="image/svg+xml">P(B)</object> (which is positive, per definition), we get Bayes' theorem:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2e91b38c9f5e26500ede19152eb669f561d9c870.svg" style="height: 42px;" type="image/svg+xml"> $P(A|B)=\frac{P(B|A)P(A)}{P(B)}$</object> <p>This is an extremely useful result, because it links <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9aa002063e1297678f0283b9e7339a73d8a7f6f6.svg" style="height: 18px;" type="image/svg+xml">P(B|A)</object> with <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c8937022ac4e55a642f3bd850e6e9b17dd8fc8d3.svg" style="height: 18px;" type="image/svg+xml">P(A|B)</object>. Recall the disease test example, where we're looking for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/28f810c7c292ba7faa5b47a0b4e0d470f79b19d8.svg" style="height: 18px;" type="image/svg+xml">P(D|T)</object>.
We can use Bayes' theorem:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7821df244a23990c03bab3a5ad8266797d2b7c4a.svg" style="height: 42px;" type="image/svg+xml"> $P(D|T)=\frac{P(T|D)P(D)}{P(T)}$</object> <p>We know <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7455e3a5dfb3c45f4a3b110a7906341362a50e53.svg" style="height: 18px;" type="image/svg+xml">P(T|D)</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f85da048bad405b6f81778501db99b331b689d46.svg" style="height: 18px;" type="image/svg+xml">P(D)</object>, but what is <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9d3d6e3b10a97d19adafbea8cc72b8e3619a1d27.svg" style="height: 18px;" type="image/svg+xml">P(T)</object>? You may be tempted to say it's 1 because &quot;well, <em>we know the test is positive</em>&quot; but that would be a mistake. To understand why, we have to dig a bit deeper into the meanings of conditional vs. unconditional probabilities.</p> </div> <div class="section" id="prior-and-posterior-probabilities"> <h2>Prior and posterior probabilities</h2> <p>Fundamentally, conditional probability helps us address the following question:</p> <blockquote> How do we update our beliefs in light of new data?</blockquote> <p><em>Prior</em> probability is our beliefs (probabilities assigned to events) before we see the new data. <em>Posterior</em> probability is our beliefs after we see the new data. In the Bayes equation, prior probabilities are simply the un-conditioned ones, while posterior probabilities are conditional.
This leads to a key distinction:</p> <ul class="simple"> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7455e3a5dfb3c45f4a3b110a7906341362a50e53.svg" style="height: 18px;" type="image/svg+xml">P(T|D)</object>: posterior probability of the test being positive when we have new data about the person - they have the disease.</li> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9d3d6e3b10a97d19adafbea8cc72b8e3619a1d27.svg" style="height: 18px;" type="image/svg+xml">P(T)</object>: prior probability of the test being positive before we know anything about the person.</li> </ul> <p>This should make it clearer why we can't just assign <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/00c3bafd4576764728524129480194f205402208.svg" style="height: 18px;" type="image/svg+xml">P(T)=1</object>. Instead, recall the &quot;counting by cases&quot; exercise we did in the first example, where we produced a tree of all possibilities; let's formalize it.</p> </div> <div class="section" id="law-of-total-probability"> <h2>Law of Total Probability</h2> <p>Suppose we have the sample space S and some event B. 
Sometimes it's easier to find the probability of B by first partitioning the space into disjoint pieces:</p> <img alt="Sample space dots visualization for conditional probability" class="align-center" src="https://eli.thegreenplace.net/images/2018/spacepartition.png" /> <p>Then, because the events <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/5aa3f2ac5ea9b6b96e13e2bd945ab77b2cce164a.svg" style="height: 15px;" type="image/svg+xml">A_n</object> are disjoint, we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/059517fb7fbb92cfe88968642a0ef5cebcf57f70.svg" style="height: 18px;" type="image/svg+xml"> $P(B)=P(B\cap A_1)+P(B\cap A_2)+P(B\cap A_3)+P(B\cap A_4)$</object> <p>Or, using equation (1):</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/615b1c4ed0b1b0e18f48953ed54e1026aaf8337a.svg" style="height: 18px;" type="image/svg+xml"> $P(B)=P(B|A_1)P(A_1)+P(B|A_2)P(A_2)+P(B|A_3)P(A_3)+P(B|A_4)P(A_4)$</object> </div> <div class="section" id="bayesian-solution-to-the-disease-test-example"> <h2>Bayesian solution to the disease test example</h2> <p>Now we have everything we need to provide a Bayesian solution to the disease test example. Recall that we already know:</p> <ul class="simple"> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c772c4b7636ec5638c2c0058c93526d105f8b659.svg" style="height: 18px;" type="image/svg+xml">P(T|D)=0.9</object>: test accuracy</li> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/6445317e3d0d9a55e4669fd041aa16e8d203ec32.svg" style="height: 18px;" type="image/svg+xml">P(D)=0.02</object>: disease prevalence in the population</li> </ul> <p>Now we want to compute <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9d3d6e3b10a97d19adafbea8cc72b8e3619a1d27.svg" style="height: 18px;" type="image/svg+xml">P(T)</object>.
We'll use the law of total probability, with the space partitioning of &quot;has disease&quot; / &quot;does not have disease&quot;:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/5474419ab67a489dc7192dce2940d5f09f25352e.svg" style="height: 21px;" type="image/svg+xml"> $P(T)=P(T|D)P(D)+P(T|D^C)P(D^C)=0.9\ast 0.02+0.1\ast 0.98=0.116$</object> <p>Finally, plugging everything into Bayes' theorem:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/05d2f7a8e457de18576450dd426ee0e3d54e2e28.svg" style="height: 206px;" type="image/svg+xml"> \begin{align*} P(D|T)&amp;=\frac{P(T|D)P(D)}{P(T)}\\ &amp;=\frac{P(T|D)P(D)}{0.116}\\ &amp;=\frac{0.9\ast 0.02}{0.116}=0.155 \end{align*}</object> <p>Which is the same result we got while working through possibilities in the example.</p> </div> <div class="section" id="conditioning-on-multiple-events"> <h2>Conditioning on multiple events</h2> <p>We've just computed <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/28f810c7c292ba7faa5b47a0b4e0d470f79b19d8.svg" style="height: 18px;" type="image/svg+xml">P(D|T)</object> - the conditional probability of event D (patient has disease) on event T (patient tested positive). An important extension of this technique is being able to reason about multiple tests, and how they affect the conditional probability.
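Before tackling multiple tests, the single-test computation above is worth sketching in code (the probabilities are the example's numbers: 90% sensitivity and specificity, 2% prevalence):

```python
# Single-test Bayes computation, with the example's numbers.
p_t_given_d = 0.9      # sensitivity: P(T|D)
p_t_given_not_d = 0.1  # false positive rate: P(T|D^C)
p_d = 0.02             # prevalence: P(D)

# Law of total probability: P(T) = P(T|D)P(D) + P(T|D^C)P(D^C)
p_t = p_t_given_d * p_d + p_t_given_not_d * (1 - p_d)

# Bayes' theorem: P(D|T) = P(T|D)P(D) / P(T)
p_d_given_t = p_t_given_d * p_d / p_t
print(round(p_t, 3), round(p_d_given_t, 3))  # 0.116 0.155
```
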
We'll want to compute <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/6caceb5e79d179c9bc5ee1f30bc94a2cc5a43f09.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1\cap T_2)</object> where <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2885fa41d340ab94bb0451308cf01996f1916011.svg" style="height: 16px;" type="image/svg+xml">T_1</object> and <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/f725afdfa00dd57660feb233ef8547c9985c924e.svg" style="height: 15px;" type="image/svg+xml">T_2</object> are two events for different tests.</p> <p>Let's assume <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2885fa41d340ab94bb0451308cf01996f1916011.svg" style="height: 16px;" type="image/svg+xml">T_1</object> is our original test. <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/f725afdfa00dd57660feb233ef8547c9985c924e.svg" style="height: 15px;" type="image/svg+xml">T_2</object> is a slightly different test that's only 80% accurate. Importantly, the tests are <em>independent</em> (they test completely different things) <a class="footnote-reference" href="#id5" id="id2"></a>.</p> <p>We'll start with a naive approach that seems reasonable. For <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2885fa41d340ab94bb0451308cf01996f1916011.svg" style="height: 16px;" type="image/svg+xml">T_1</object>, we already know that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/bf10cd55e38ac2b90c6a58763b5c7207e21112ac.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1)=0.155</object>. 
For <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/f725afdfa00dd57660feb233ef8547c9985c924e.svg" style="height: 15px;" type="image/svg+xml">T_2</object>, it's similarly simple to compute:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/0adb7aa21363b77105139ba87a7a6a3b1bb18c1f.svg" style="height: 42px;" type="image/svg+xml"> $P(D|T_2)=\frac{P(T_2|D)P(D)}{P(T_2)}$</object> <p>The disease prevalence is still 2%, and using the law of total probability we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/499e223575d2ca5776c3d7493ec72a7034b68ef8.svg" style="height: 21px;" type="image/svg+xml"> $P(T_2)=P(T_2|D)P(D)+P(T_2|D^C)P(D^C)=0.8\ast 0.02+0.2\ast 0.98=0.212$</object> <p>Therefore:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/5d76ab3755acc699caa4dd080dad975b020aeab9.svg" style="height: 42px;" type="image/svg+xml"> $P(D|T_2)=\frac{P(T_2|D)P(D)}{P(T_2)}=\frac{0.8\ast 0.02}{0.212}=0.075$</object> <p>In other words, if a person tests positive with the second test, the chance of being sick is only 7.5%. But what if they tested positive for both tests?</p> <p>Well, since the tests are independent we can do the usual probability trick of combining the complements. We'll compute the probability the person is <em>not</em> sick given positive tests, and then compute the complement of that. <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/34f3f1a1d7cf4f9bb761f5d30e30c73ae74a0d88.svg" style="height: 19px;" type="image/svg+xml">P(D^C|T_1)=1-0.155=0.845</object>, and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/bc939034f01f15365f691b44910bc5671eb66f17.svg" style="height: 19px;" type="image/svg+xml">P(D^C|T_2)=1-0.075=0.925</object>. 
Therefore:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4f6a0d5b400338982d6fccd353263ad6bf1129c6.svg" style="height: 21px;" type="image/svg+xml"> $P(D^C|T_1\cap T_2)=P(D^C|T_1)P(D^C|T_2)=0.845\ast 0.925=0.782$</object> <p>And complementing again, we get <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/fc78afd8f0ce0dc8a278906a52710401645f0f55.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1\cap T_2)=1-0.782=0.218</object>. The chance of being sick, having tested positive both times is 21.8%.</p> <p>Unfortunately, this computation is wrong, <em>very</em> wrong. Can you spot why before reading on?</p> <p>We've committed a fairly common blunder in conditional probabilities. Given the independence of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/078820d4711ac1ab1b075b3b3e452a97424174c8.svg" style="height: 18px;" type="image/svg+xml">P(T_1|D)</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8934e9ba84564c2e86c8223fa4edf8ecf142349a.svg" style="height: 18px;" type="image/svg+xml">P(T_2|D)</object>, we've assumed the independence of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9e7f4ee884ceb05aa1e7b2dd2453698971ca6689.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1)</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7e3df62da0b4a6aaf0fb78afd834a8d9842e463a.svg" style="height: 18px;" type="image/svg+xml">P(D|T_2)</object>, but this is wrong! It's even easy to see why, given our concrete example. Both of them have <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f85da048bad405b6f81778501db99b331b689d46.svg" style="height: 18px;" type="image/svg+xml">P(D)</object> - the disease prevalence - in the numerator. 
Changing the prevalence will change both <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9e7f4ee884ceb05aa1e7b2dd2453698971ca6689.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1)</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7e3df62da0b4a6aaf0fb78afd834a8d9842e463a.svg" style="height: 18px;" type="image/svg+xml">P(D|T_2)</object> in exactly the same proportion; say, increasing the prevalence 2x will increase both probabilities 2x. They're pretty strongly dependent!</p> <p>The right way of finding <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/6caceb5e79d179c9bc5ee1f30bc94a2cc5a43f09.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1\cap T_2)</object> is working from first principles. <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7bf9b64bece7c786508608f42ccb59b54c058a2c.svg" style="height: 16px;" type="image/svg+xml">T_1\cap T_2</object> is just another event, so treating it as such and using Bayes theorem we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/490d067f11ad71b9824cac46864577eaeaed6e22.svg" style="height: 42px;" type="image/svg+xml"> $P(D|T_1\cap T_2)=\frac{P(T_1\cap T_2|D)P(D)}{P(T_1\cap T_2)}$</object> <p>Here <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f85da048bad405b6f81778501db99b331b689d46.svg" style="height: 18px;" type="image/svg+xml">P(D)</object> is still 0.02; <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/234e83d1410ef9137bc8dc052b7ea2e038a81337.svg" style="height: 18px;" type="image/svg+xml">P(T_1\cap T_2|D)=0.9\ast0.8=0.72</object>. 
To compute the denominator we'll use the law of total probability again:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/97bb82b17c9ba327a72f472114e7ffe0aa8ed6a3.svg" style="height: 21px;" type="image/svg+xml"> $P(T_1\cap T_2)=P(T_1\cap T_2|D)P(D)+P(T_1\cap T_2|D^C)P(D^C)=0.72\ast 0.02+0.1\ast 0.2\ast 0.98=0.034$</object> <p>Combining them all together we'll get <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f109539a4f087faa32c6c9bf4006ce9bf5318979.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1\cap T_2)=0.42</object>; the chance of being sick, given two positive tests, is 42%, which is twice as high as our erroneous estimate <a class="footnote-reference" href="#id6" id="id3"></a>.</p> </div> <div class="section" id="bayes-theorem-with-conditioning"> <h2>Bayes' theorem with conditioning</h2> <p>Since conditional probabilities satisfy all probability axioms, many theorems remain true when adding a condition. Here's Bayes' theorem with extra conditioning on event C:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/66add8a0c5c81776826cb9a58662e30e118c71b9.svg" style="height: 42px;" type="image/svg+xml"> $P(A|B\cap C)=\frac{P(B|A\cap C)P(A|C)}{P(B|C)}$</object> <p>In other words, the connection between <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c8937022ac4e55a642f3bd850e6e9b17dd8fc8d3.svg" style="height: 18px;" type="image/svg+xml">P(A|B)</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9aa002063e1297678f0283b9e7339a73d8a7f6f6.svg" style="height: 18px;" type="image/svg+xml">P(B|A)</object> is true even when everything is conditioned on some event C.
To prove it, we can take both sides and expand the definitions of conditional probability until we reach something trivially true:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/28d63e34558ac9a513785738a4eb2c0ce82668c2.svg" style="height: 138px;" type="image/svg+xml"> \begin{align*} P(A|B\cap C)&amp;=\frac{P(B|A\cap C)P(A|C)}{P(B|C)}\\ \frac{P(A\cap B\cap C)}{P(B\cap C)}&amp;=\frac{P(A\cap B\cap C)P(A|C)}{P(A\cap C)P(B|C)}\\ \frac{P(A\cap B\cap C)}{P(B\cap C)}&amp;=\frac{P(A\cap B\cap C)P(A\cap C)}{P(A\cap C)P(B|C)P(C)}\\ \end{align*}</object> <p>Assuming that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7784945e846c26d7c35f5a323dfd18c8d359a001.svg" style="height: 18px;" type="image/svg+xml">P(A\cap C)&gt;0</object>, it cancels out (similarly for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/fd0e822b920e5f9312922a7136ac99a01db7d44e.svg" style="height: 18px;" type="image/svg+xml">P(C)&gt;0</object> in a later step):</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4d007d49999b1286d02810151a6ca8cf8e945bdf.svg" style="height: 138px;" type="image/svg+xml"> \begin{align*} \frac{P(A\cap B\cap C)}{P(B\cap C)}&amp;=\frac{P(A\cap B\cap C)}{P(B|C)P(C)}\\ \frac{P(A\cap B\cap C)}{P(B\cap C)}&amp;=\frac{P(A\cap B\cap C)P(C)}{P(B\cap C)P(C)}\\ \frac{P(A\cap B\cap C)}{P(B\cap C)}&amp;=\frac{P(A\cap B\cap C)}{P(B\cap C)} \end{align*}</object> <p><em>Q.E.D.</em></p> <p>Using this new result, we can compute our two-test disease exercise in another way. Let's say that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2885fa41d340ab94bb0451308cf01996f1916011.svg" style="height: 16px;" type="image/svg+xml">T_1</object> happens first, and we've already computed <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9e7f4ee884ceb05aa1e7b2dd2453698971ca6689.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1)</object>. 
We can now treat this as the new <em>prior</em> data, and find <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/6caceb5e79d179c9bc5ee1f30bc94a2cc5a43f09.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1\cap T_2)</object> based on the new evidence that <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/f725afdfa00dd57660feb233ef8547c9985c924e.svg" style="height: 15px;" type="image/svg+xml">T_2</object> happened. We'll use the conditioned Bayes formulation with <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2885fa41d340ab94bb0451308cf01996f1916011.svg" style="height: 16px;" type="image/svg+xml">T_1</object> being C.</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/a06ff1feb6ddc4f774eeab3db36f24bc6f8ebfb1.svg" style="height: 42px;" type="image/svg+xml"> $P(D|T_2\cap T_1)=\frac{P(T_2|D\cap T_1)P(D|T_1)}{P(T_2|T_1)}$</object> <p>We already know that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9e7f4ee884ceb05aa1e7b2dd2453698971ca6689.svg" style="height: 18px;" type="image/svg+xml">P(D|T_1)</object> is 0.155; What about <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/55700383bdf6e67d7fc2a7e98d9615982e36e546.svg" style="height: 18px;" type="image/svg+xml">P(T_2|D\cap T_1)</object>? Since the tests are independent, this is actually equivalent to <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8934e9ba84564c2e86c8223fa4edf8ecf142349a.svg" style="height: 18px;" type="image/svg+xml">P(T_2|D)</object>, which is 0.8. 
The denominator requires a bit more careful computation:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/34ba42765c1ab9f7ffd49867707bd9cc3091f1cd.svg" style="height: 42px;" type="image/svg+xml"> $P(T_2|T_1)=\frac{P(T_1\cap T_2)}{P(T_1)}$</object> <p>We've already found <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a0674ed54736ae0ea920485197320e0e23abd85a.svg" style="height: 18px;" type="image/svg+xml">P(T_1)=0.116</object> previously, using the law of total probability. Using the same law:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/8347d5e40753c64fbc86668b7b53321fa00bab65.svg" style="height: 21px;" type="image/svg+xml"> $P(T_1\cap T_2)=P(T_1\cap T_2|D)P(D)+P(T_1\cap T_2|D^C)P(D^C)=0.9\ast 0.8\ast 0.02+0.1\ast 0.2\ast 0.98=0.034$</object> <p>Therefore, <object class="valign-m7" data="https://eli.thegreenplace.net/images/math/8853f1766ae90cf8da81a56ab3fcc173baa21878.svg" style="height: 23px;" type="image/svg+xml">P(T_2|T_1)=\frac{0.034}{0.116}=0.293</object> and we now have all the ingredients:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7bd0854b518fca8ea6c42551abd0b032ce5e1708.svg" style="height: 37px;" type="image/svg+xml"> $P(D|T_2\cap T_1)=\frac{0.8\ast 0.155}{0.293}=0.42$</object> <p>We've reached the same result using two different approaches, which is reassuring. Computing with both tests taken together is a bit quicker, but taking one test at a time is also useful because it lets us <em>update our beliefs</em> over time, given new data.</p> <p>Computing conditional probabilities w.r.t.
multiple parameters is very useful in machine learning - this would be a good topic for a separate article.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>The actual number of people is arbitrary - it could be anything else; in the formulae it cancels out anyway. I picked 10,000 because it's a nice number ending with a bunch of zeros and won't produce fractional people for this particular example.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id5" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td><p class="first">You may be suspicious of this assumption - how can two tests for the same disease be independent? Being suspicious about probability independence assumptions is a good idea in general, but here the assumption is reasonable.</p> <p class="last">Note that we assume independence given D; in other words, that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/078820d4711ac1ab1b075b3b3e452a97424174c8.svg" style="height: 18px;" type="image/svg+xml">P(T_1|D)</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8934e9ba84564c2e86c8223fa4edf8ecf142349a.svg" style="height: 18px;" type="image/svg+xml">P(T_2|D)</object> are independent. We know the person is sick, and we know that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2885fa41d340ab94bb0451308cf01996f1916011.svg" style="height: 16px;" type="image/svg+xml">T_1</object> turned positive - does this affect <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/f725afdfa00dd57660feb233ef8547c9985c924e.svg" style="height: 15px;" type="image/svg+xml">T_2</object>?
That depends on the test; some tests definitely test related things, but some may test unrelated things (say the first looks for a particular by-product of sick cells while the second looks for a gene that is known to be correlated with disease prevalence). It's possible to find plausible connections between almost anything though, so all independence assumptions are &quot;best-effort&quot;.</p> </td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>My intuition for understanding why it's higher is that there's a tug of war between the test accuracy and the prevalence (the lower the prevalence, the higher the test accuracy has to be to produce reasonable predictive value). But when we recompute with two tests, we still use prevalence just once in the formula, so the two tests combine forces against it.</td></tr> </tbody> </table> </div> Computing remainders by doubling2018-02-12T05:40:00-08:002018-02-12T05:40:00-08:00Eli Benderskytag:eli.thegreenplace.net,2018-02-12:/2018/computing-remainders-by-doubling/<p>I'm going through Stepanov and Rose's <em>From Mathematics to Generic Programming</em>, and on page 48 they present a fast algorithm for computing remainders without using either division or multiplication. Unfortunately, there's not much in terms of proof <a class="footnote-reference" href="#id3" id="id1"></a>, so this post is to document my understanding of the algorithm.</p> <p>The …</p><p>I'm going through Stepanov and Rose's <em>From Mathematics to Generic Programming</em>, and on page 48 they present a fast algorithm for computing remainders without using either division or multiplication.
Unfortunately, there's not much in terms of proof <a class="footnote-reference" href="#id3" id="id1"></a>, so this post is to document my understanding of the algorithm.</p> <p>The algorithm relies on the following lemma: For <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/359f68d4e18136986adfdd925d71beba6906d6f2.svg" style="height: 17px;" type="image/svg+xml">a,b\in\mathbb{N}</object>, given <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/cea4102e00c379c659bd1ac0ec828b107ef8e191.svg" style="height: 18px;" type="image/svg+xml">t=remainder(a,2b)</object>, we have:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/cb7a84e3fdaea31591f54385a69365a074dbc16e.svg" style="height: 43px;" type="image/svg+xml"> $remainder(a,b)=\left\{\begin{matrix} t &amp; t &lt; b\\ t-b &amp; t \geq b \end{matrix}\right.$</object> <p>To prove this, consider the standard quotient-and-remainder representation relating <em>a</em> and <em>b</em>: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/e467ce1c3e2bed6e3c59d96fb444ec0f18012b2b.svg" style="height: 17px;" type="image/svg+xml">a=qb+r</object>, with <em>q</em> the quotient and <em>r</em> the remainder. <em>q</em> can be either even or odd.
If it's even, we can say that there exists <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/fcd7fa67ec4934ffd698ba002f505cf4cb93cb4f.svg" style="height: 14px;" type="image/svg+xml">k\in\mathbb{N}</object> such that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/70e471804d82f9d0c2b34ebae8386daaf4b7c163.svg" style="height: 17px;" type="image/svg+xml">q=2k</object>, so:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7b35830ec6cf002f68272e4c031182d2b69d98e9.svg" style="height: 15px;" type="image/svg+xml"> $a=2kb+r$</object> <p>In this case, the remainder of <em>a</em> divided by <em>2b</em> is trivially <em>r</em> (the same as the remainder of dividing by <em>b</em>). If <em>q</em> is odd, we can say that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/b832c21a1f7af5db87bdd07c9881f58078a6ca77.svg" style="height: 17px;" type="image/svg+xml">q=2k+1</object>, so:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/d4d371e1aa19719ea652108e2443ce6b4684aa2a.svg" style="height: 18px;" type="image/svg+xml"> $a=(2k+1)b+r=2kb+b+r$</object> <p>In this case, the remainder of <em>a</em> divided by <em>2b</em> is <em>b+r</em>. Now it's obvious why the lemma is true. Without explicitly distinguishing <em>q</em> as even or odd, it just examines the remainder of <em>a</em> divided by <em>2b</em>. If this remainder is smaller than <em>b</em>, then that's also the remainder of dividing by <em>b</em> because <em>q</em> must be even.
On the other hand, if the remainder is larger than <em>b</em>, <em>q</em> must be odd and we have <em>b+r</em> as the remainder, in which case we subtract <em>b</em> to get to <em>r</em>.</p> <p>Now, the algorithm itself, as Python code <a class="footnote-reference" href="#id4" id="id2"></a>:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">fast_remainder</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span> <span class="k">if</span> <span class="n">a</span> <span class="o">&lt;</span> <span class="n">b</span><span class="p">:</span> <span class="k">return</span> <span class="n">a</span> <span class="k">if</span> <span class="n">a</span> <span class="o">-</span> <span class="n">b</span> <span class="o">&lt;</span> <span class="n">b</span><span class="p">:</span> <span class="k">return</span> <span class="n">a</span> <span class="o">-</span> <span class="n">b</span> <span class="n">r</span> <span class="o">=</span> <span class="n">fast_remainder</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span> <span class="k">if</span> <span class="n">r</span> <span class="o">&lt;</span> <span class="n">b</span><span class="p">:</span> <span class="k">return</span> <span class="n">r</span> <span class="k">return</span> <span class="n">r</span> <span class="o">-</span> <span class="n">b</span> </pre></div> <p>It starts by covering base cases of <em>a</em> being up to <em>2b</em>. Then it recurses to find the remainder of <em>a</em> divided by <em>2b</em>. This is a curious recursive pattern, as the parameters grow rather than shrink! 
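</p>

<p>Before proving termination, here's a quick empirical check that the function indeed computes remainders, comparing it against Python's built-in <tt class="docutils literal">%</tt> operator (the definition from above is repeated so the snippet runs standalone):</p>

```python
# fast_remainder as defined above, checked against Python's % operator.
def fast_remainder(a, b):
    if a < b:
        return a
    if a - b < b:
        return a - b
    # Recurse with a doubled divisor, then apply the lemma to the result.
    r = fast_remainder(a, b + b)
    if r < b:
        return r
    return r - b

for a in range(500):
    for b in range(1, 40):
        assert fast_remainder(a, b) == a % b
print("all remainders match")
```

<p>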
Therefore, it's important to prove that this recursion terminates (if it does, its correctness stems from the lemma).</p> <p>We keep doubling <em>b</em> in every recursive invocation, and the base cases break the recursive cycle once <em>b</em> outgrows <em>a</em>. It will take at most <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/a65d0750d3247b90609ed8fdb790dcf3ac93a463.svg" style="height: 19px;" type="image/svg+xml">\left \lceil log_{2}a\right \rceil</object> steps to reach that point. Therefore, the recursion terminates.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id3" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>Which is a bit disappointing for a book that was written to show the beauty of math to programmers and is full of proofs for other stuff. For this algorithm the authors just mention &quot;It's not obvious where the work is done, but it works&quot; and then provide a single extended example.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>This is a slightly adapted version of the algorithm, which also works when <em>a</em> is a multiple of <em>b</em>, such that the remainder is 0.</td></tr> </tbody> </table> Affine transformations2018-01-09T05:15:00-08:002018-01-09T05:15:00-08:00Eli Benderskytag:eli.thegreenplace.net,2018-01-09:/2018/affine-transformations/<p>This is a brief article on affine mappings and their relation to linear mappings, with some applications.</p> <div class="section" id="linear-vs-affine"> <h2>Linear vs. 
Affine</h2> <p>To start discussing affine mappings, we have to first address a common confusion around what it means for a function to be linear.</p> <p>According to <a class="reference external" href="https://en.wikipedia.org/wiki/Linear_function">Wikipedia</a> the term <em>linear function …</em></p></div><p>This is a brief article on affine mappings and their relation to linear mappings, with some applications.</p> <div class="section" id="linear-vs-affine"> <h2>Linear vs. Affine</h2> <p>To start discussing affine mappings, we have to first address a common confusion around what it means for a function to be linear.</p> <p>According to <a class="reference external" href="https://en.wikipedia.org/wiki/Linear_function">Wikipedia</a> the term <em>linear function</em> can refer to two distinct concepts, based on the context:</p> <ol class="arabic simple"> <li>In Calculus, a linear function is a polynomial function of degree zero or one; in other words, a function of the form <img alt="f(x)=ax+b" class="valign-m4" src="https://eli.thegreenplace.net/images/math/a85393d5068f5c4bc36ff7efed535a8f1a686848.png" style="height: 18px;" /> for some constants <tt class="docutils literal">a</tt> and <tt class="docutils literal">b</tt>.</li> <li>In Linear Algebra, a linear function is a linear mapping, or linear <em>transformation</em>.</li> </ol> <p>In this article we're going to be using (2) as the definition of <em>linear</em>, and it will soon become obvious why (1) is confusing when talking about transformations. 
To avoid some of the jumble going forward, I'm going to be using the term <em>mapping</em> instead of <em>function</em>, but in linear algebra the two are interchangeable (<em>transformation</em> is another synonym, which I'm going to be making less effort to avoid since it's not as overloaded <a class="footnote-reference" href="#id8" id="id1"></a>).</p> </div> <div class="section" id="linear-transformations"> <h2>Linear transformations</h2> <p>Since we're talking about linear algebra, let's use the domain of vector spaces for the definitions. A transformation (or mapping) <tt class="docutils literal">f</tt> is linear when for any two vectors <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> and <object class="valign-0" data="https://eli.thegreenplace.net/images/math/d45128696127d3ae74860c6f8b14ce6ca20d15e7.svg" style="height: 13px;" type="image/svg+xml">\vec{w}</object> (assuming the vectors are in the same vector space, say <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" />):</p> <ul class="simple"> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c77fa5b7073e6b81e5b431b6e383a7414858cea0.svg" style="height: 18px;" type="image/svg+xml">f(\vec{v}+\vec{w})=f(\vec{v})+f(\vec{w})</object></li> <li><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d48c4c3abf0c65851d92030c7f40d799156f5871.svg" style="height: 18px;" type="image/svg+xml">f(k\vec{v})=kf(\vec{v})</object> for any scalar <tt class="docutils literal">k</tt></li> </ul> <p>For example, the mapping <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/6ebc8ee559ec27b734f8f10214bd0a5fd6fc6c54.svg" style="height: 19px;" type="image/svg+xml">f(\vec{v})=\langle 3v_1-4v_2,v_2 \rangle</object> - where <object class="valign-m4"
data="https://eli.thegreenplace.net/images/math/9b12bbf79036cb3e904f971fd86838db1dade1aa.svg" style="height: 12px;" type="image/svg+xml">v_1</object> and <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/2e84f52c0f54659a1f533b25591adb924f2a4131.svg" style="height: 11px;" type="image/svg+xml">v_2</object> are the components of <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> - is linear. The mapping <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/f86fed5746a1646abc0377fbbf9002231177b0fa.svg" style="height: 19px;" type="image/svg+xml">g(\vec{v})=\langle v_2,2v_{1}v_{2} \rangle</object> is <em>not</em> linear.</p> <p>In fact, it can be shown that for the kind of vector spaces we're mostly interested in <a class="footnote-reference" href="#id9" id="id2"></a>, any linear mapping can be represented by a matrix that is multiplied by the input vector. This is because we can represent any vector in terms of the standard basis vectors: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/dce46a4dda3d1b14590f131161880969b7998cce.svg" style="height: 17px;" type="image/svg+xml">\vec{v}=v_1\vec{e}_1+...+v_n\vec{e}_n</object>. 
Then, since <tt class="docutils literal">f</tt> is linear:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/cac8ee862d974540e4c17cb4f2c4309db0f00193.svg" style="height: 50px;" type="image/svg+xml"> $f(\vec{v})=f(\sum_{i=1}^{n}v_i\vec{e}_i)=\sum_{i=1}^{n}v_if(\vec{e}_i)$</object> <p>If we think of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c37ade9231729dad728ad612e88916fc118f8f24.svg" style="height: 18px;" type="image/svg+xml">f(\vec{e}_i)</object> as column vectors, this is precisely the multiplication of a matrix by <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" />:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/cd0deb0c6714d83a6fca594b4c755d789d34d8a9.svg" style="height: 86px;" type="image/svg+xml"> $f(\vec{v}) = \begin{pmatrix} \mid &amp; \mid &amp; &amp; \mid \\ f(\vec{e}_1) &amp; f(\vec{e}_2) &amp; \cdots &amp; f(\vec{e}_n) \\ \mid &amp; \mid &amp; &amp; \mid \\ \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \\ ... \\ v_n \end{pmatrix}$</object> <p>This multiplication by a matrix can also be seen as a <em>change of basis</em> for <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> from the standard base to a base defined by <tt class="docutils literal">f</tt>. If you want a refresher on how changes of basis work, take a look at my <a class="reference external" href="http://eli.thegreenplace.net/2015/change-of-basis-in-linear-algebra/">older post on this topic</a>.</p> <p>Let's get back to our earlier example of the mapping <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/6ebc8ee559ec27b734f8f10214bd0a5fd6fc6c54.svg" style="height: 19px;" type="image/svg+xml">f(\vec{v})=\langle 3v_1-4v_2,v_2 \rangle</object>. 
We can represent this mapping with the following matrix:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/f54bea9f647bcffea53eaa9de5831b086cea8987.svg" style="height: 43px;" type="image/svg+xml"> $\begin{pmatrix} 3 &amp; -4 \\ 0 &amp; 1 \end{pmatrix}$</object> <p>Meaning that:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/d5f35dc6342d25f9c4bf13e84986f90a870f3728.svg" style="height: 43px;" type="image/svg+xml"> $f(\vec{v})=\begin{pmatrix} 3 &amp; -4 \\ 0 &amp; 1 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$</object> <p>Representing linear mappings this way gives us a number of interesting tools for working with them. For example, the associativity of matrix multiplication means that we can represent compositions of mappings by simply multiplying the mapping matrices together.</p> <p>Consider the following mapping:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/1677a0280bf92fc8725c0a71e5a2705eaabebde8.svg" style="height: 43px;" type="image/svg+xml"> $S=\begin{pmatrix} 2 &amp; 0\\ 0 &amp; 2 \end{pmatrix}$</object> <p>In equational form: <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/3f53317f06530c0cd6af66868f708b09cc719eaa.svg" style="height: 19px;" type="image/svg+xml">S(\vec{v})=\langle 2v_1,2v_2 \rangle</object>. This mapping <em>stretches</em> the input vector 2x in both dimensions. To visualize a mapping, it's useful to examine its effects on some standard vectors. Let's use the vectors <tt class="docutils literal">(0,0)</tt>, <tt class="docutils literal">(0,1)</tt>, <tt class="docutils literal">(1,0)</tt>, <tt class="docutils literal">(1,1)</tt> (the &quot;unit square&quot;). 
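</p>

<p>These matrix representations are easy to experiment with in code. Here's a sketch applying the matrix of <tt class="docutils literal">f</tt> and the stretch <tt class="docutils literal">S</tt> to sample vectors; the <tt class="docutils literal">apply</tt> helper is a hand-rolled 2x2 matrix-by-vector multiplication, and its name is made up for illustration:</p>

```python
# Sketch: apply a 2x2 mapping matrix to a 2-vector by plain multiplication.
def apply(m, v):
    # m is a 2x2 matrix as nested lists; v is a 2-vector (tuple).
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

F = [[3, -4], [0, 1]]  # the matrix of f(v) = <3*v1 - 4*v2, v2>
S = [[2, 0], [0, 2]]   # the 2x stretch

print(apply(F, (1, 1)))  # (-1, 1): 3*1 - 4*1 = -1, and v2 stays 1
unit_square = [(0, 0), (0, 1), (1, 0), (1, 1)]
print([apply(S, p) for p in unit_square])  # every point doubled
```

<p>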
In <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" /> they represent four points that can be connected together as follows <a class="footnote-reference" href="#id10" id="id3"></a>:</p> <img alt="Unit vectors as points on the plane" class="align-center" src="https://eli.thegreenplace.net/images/2018/points-unit-vectors.png" /> <p>It's easy to see that when transformed with <object class="valign-0" data="https://eli.thegreenplace.net/images/math/02aa629c8b16cd17a44f3a0efec2feed43937642.svg" style="height: 12px;" type="image/svg+xml">S</object>, we'll get:</p> <img alt="Unit vectors transformed with 2x stretch" class="align-center" src="https://eli.thegreenplace.net/images/2018/points-stretch.png" /> <p>It's also well known that rotation (relative to the origin) can be modeled with the following mapping, where <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> is in radians:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/5783a1d0f95db66e5dc6e8499cbb5853d43a2a60.svg" style="height: 43px;" type="image/svg+xml"> $R=\begin{pmatrix} cos\theta &amp; sin\theta \\ -sin\theta &amp; cos\theta \end{pmatrix}$</object> <p>Transforming our unit square with this matrix we get:</p> <img alt="Unit vectors transformed with rotation by one radian" class="align-center" src="https://eli.thegreenplace.net/images/2018/points-rotate.png" /> <p>Finally, let's say we want to combine these transformations. To stretch and then rotate a vector, we would do: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/40412ae6b5ed1bafeb85baada5ab732975419037.svg" style="height: 18px;" type="image/svg+xml">f(\vec{v})=R(Sv)</object>.
Since matrix multiplication is associative, this can also be rewritten as: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/fd0a61053256cb3481d5ce29828291712b9b5a8e.svg" style="height: 18px;" type="image/svg+xml">f(\vec{v})=(RS)v</object>. In other words, we can find a matrix <object class="valign-0" data="https://eli.thegreenplace.net/images/math/7b0ecef9a260b7e055cb6c5ab4d53ca3b236a621.svg" style="height: 12px;" type="image/svg+xml">A=RS</object> which represents the combined transformation, and we &quot;find&quot; it by simply multiplying <tt class="docutils literal">R</tt> and <tt class="docutils literal">S</tt> together <a class="footnote-reference" href="#id11" id="id4"></a>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7c7b6d5faacf808b4a6551e3bfbee40bdf0dd158.svg" style="height: 43px;" type="image/svg+xml"> $A=\begin{pmatrix} cos\theta &amp; sin\theta \\ -sin\theta &amp; cos\theta \end{pmatrix}\begin{pmatrix} 2 &amp; 0 \\ 0 &amp; 2 \end{pmatrix}=\begin{pmatrix} 2cos\theta &amp; 2sin\theta \\ -2sin\theta &amp; 2cos\theta \end{pmatrix}$</object> <p>And when we multiply our unit square by this matrix we get:</p> <img alt="Unit vectors transformed with rotation and stretch" class="align-center" src="https://eli.thegreenplace.net/images/2018/points-rotate-and-stretch.png" /> </div> <div class="section" id="id5"> <h2>Affine transformations</h2> <p>Now that we have some good context on linear transformations, it's time to get to the main topic of this post - affine transformations.</p> <p>For an affine space (we'll talk about what this is exactly in a later section), every affine transformation is of the form <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9de6170fb616cd269a8c53f567e94d08b9b813a0.svg" style="height: 18px;" type="image/svg+xml">g(\vec{v})=Av+b</object> where <img alt="A" class="valign-0"
src="https://eli.thegreenplace.net/images/math/6dcd4ce23d88e2ee9568ba546c007c63d9131c1b.png" style="height: 12px;" /> is a matrix representing a linear transformation and <object class="valign-0" data="https://eli.thegreenplace.net/images/math/e9d71f5ee7c92d6dc9e92ffdad17b8bd49418f98.svg" style="height: 13px;" type="image/svg+xml">b</object> is a vector. In other words, an affine transformation combines a linear transformation with a <em>translation</em>.</p> <p>Quite obviously, every linear transformation is affine (just set <object class="valign-0" data="https://eli.thegreenplace.net/images/math/e9d71f5ee7c92d6dc9e92ffdad17b8bd49418f98.svg" style="height: 13px;" type="image/svg+xml">b</object> to the zero vector). However, not every affine transformation is linear. For a non-zero <object class="valign-0" data="https://eli.thegreenplace.net/images/math/e9d71f5ee7c92d6dc9e92ffdad17b8bd49418f98.svg" style="height: 13px;" type="image/svg+xml">b</object>, the linearity rules don't check out. 
Let's say that:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/8aa781f9271cdf38440d17a01899275112fe143a.svg" style="height: 52px;" type="image/svg+xml"> \begin{align*} f(\vec{v})&amp;=A\vec{v}+\vec{b} \\ f(\vec{w})&amp;=A\vec{w}+\vec{b} \end{align*}</object> <p>Then if we try to add these together, we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7550749852d8b7f5ad72f26dd850cc94e16135dd.svg" style="height: 22px;" type="image/svg+xml"> $f(\vec{v}+\vec{w})=A(\vec{v}+\vec{w})+\vec{b}$</object> <p>Whereas:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/594f1da3f413dafcad4090111f4e7886df1b1afe.svg" style="height: 18px;" type="image/svg+xml"> $f(\vec{v})+f(\vec{w})=A\vec{v}+b+A\vec{w}+b=A(\vec{v}+\vec{w})+2b$</object> <p>The violation of the scalar multiplication rule can be checked similarly.</p> <p>Let's examine the affine transformation that stretches a vector by a factor of two (similarly to the <tt class="docutils literal">S</tt> transformation we've discussed before) and translates it by 0.5 for both dimensions:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/b73e9c882baeb6d742f05acc82bf6130520a634c.svg" style="height: 43px;" type="image/svg+xml"> $f(\vec{v})=\begin{pmatrix} 2 &amp; 0 \\ 0 &amp; 2 \end{pmatrix}\vec{v}+\begin{pmatrix} 0.5 \\ 0.5\end{pmatrix}$</object> <p>Here is this transformation visualized:</p> <img alt="Unit vectors translated and stretched" class="align-center" src="https://eli.thegreenplace.net/images/2018/points-translate.png" /> <p>With some clever augmentation, we can represent affine transformations as a multiplication by a single matrix, if we add another dimension to the vectors <a class="footnote-reference" href="#id12" id="id6"></a>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/5dda35f4a28bb65c45da251170ae5cd17138b8ec.svg" style="height: 65px;" 
type="image/svg+xml"> $f(\vec{v})=T\vec{v}=\begin{pmatrix} 2 &amp; 0 &amp; 0.5 \\ 0 &amp; 2 &amp; 0.5 \\ 0 &amp; 0 &amp; 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ 1 \end{pmatrix}$</object> <p>The translation vector is tacked on the right-hand side of the transform matrix, with a 1 for the extra dimension (the matrix gets 0s in that dimension). The result will always have a 1 in the final dimension, which we can ignore.</p> <p>Affine transforms can be composed similarly to linear transforms, using matrix multiplication. This also makes them associative. As an example, let's compose the scaling+translation transform discussed most recently with the rotation transform mentioned earlier. This is the augmented matrix for the rotation:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/dc7c578ad6e6d7d50aa9ce8ebd8024829d072695.svg" style="height: 65px;" type="image/svg+xml"> $R=\begin{pmatrix} cos\theta &amp; sin\theta &amp; 0 \\ -sin\theta &amp; cos\theta &amp; 0 \\ 0 &amp; 0 &amp; 1 \end{pmatrix}$</object> <p>The composed transform will be <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/0b6f50c9b5f2a9431f500810ae71a40ec939d943.svg" style="height: 18px;" type="image/svg+xml">f(\vec{v})=T(R(\vec{v}))=(TR)\vec{v}</object>. 
Its matrix is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/1ad583a63663ce317112e10234e7ef065ef97252.svg" style="height: 65px;" type="image/svg+xml"> $TR=\begin{pmatrix} 2 &amp; 0 &amp; 0.5 \\ 0 &amp; 2 &amp; 0.5 \\ 0 &amp; 0 &amp; 1 \end{pmatrix}\begin{pmatrix} cos\theta &amp; sin\theta &amp; 0 \\ -sin\theta &amp; cos\theta &amp; 0 \\ 0 &amp; 0 &amp; 1 \end{pmatrix}=\begin{pmatrix} 2cos\theta &amp; 2sin\theta &amp; 0.5 \\ -2sin\theta &amp; 2cos\theta &amp; 0.5 \\ 0 &amp; 0 &amp; 1 \end{pmatrix}$</object> <p>The visualization is:</p> <img alt="Translation after rotation" class="align-center" src="https://eli.thegreenplace.net/images/2018/points-rotate-translate.png" /> </div> <div class="section" id="affine-subspaces"> <h2>Affine subspaces</h2> <p>The previous section defined affine transformation w.r.t. the concept of <em>affine space</em>, and now it's time to pay the rigor debt. According <a class="reference external" href="https://en.wikipedia.org/wiki/Affine_space">to Wikipedia</a>, an affine space:</p> <blockquote> ... is a geometric structure that generalizes the properties of Euclidean spaces in such a way that these are independent of the concepts of distance and measure of angles, keeping only the properties related to parallelism and ratio of lengths for parallel line segments.</blockquote> <p>Since we've been using vectors and vector spaces so far in the article, let's see the relation between vector spaces and affine spaces. 
The best explanation I found online is the following.</p> <p>Consider the vector space <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" />, with two lines:</p> <img alt="Lines for subspace and affine space of R2" class="align-center" src="https://eli.thegreenplace.net/images/2018/subspace-lines.png" /> <p>The blue line can be seen as a vector subspace (also known as <em>linear subspace</em>) of <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" />. On the other hand, the green line is not a vector subspace because it doesn't contain the zero vector. The green line is an <em>affine subspace</em>. This leads us to a definition:</p> <blockquote> A subset <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/deec434246ee4364d506b710d495a68faae6cb99.svg" style="height: 13px;" type="image/svg+xml">U \subset V</object> of a vector space <img alt="V" class="valign-0" src="https://eli.thegreenplace.net/images/math/c9ee5681d3c59f7541c27a38b67edf46259e187b.png" style="height: 12px;" /> is an affine space if there exists a <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/66d9cae10caefdd28dcb23fed51b0bb194c40cff.svg" style="height: 13px;" type="image/svg+xml">u \in U</object> such that <object class="valign-m5" data="https://eli.thegreenplace.net/images/math/93f362965ba8f75b9f3cc491918201ef91811888.svg" style="height: 19px;" type="image/svg+xml">U - u = \{x-u \mid x \in U\}</object> is a vector subspace of <img alt="V" class="valign-0" src="https://eli.thegreenplace.net/images/math/c9ee5681d3c59f7541c27a38b67edf46259e187b.png" style="height: 12px;" />.</blockquote> <p>If you recall the definition of affine transformations from earlier on, this should seem familiar - linear and affine subspaces are related by using a 
translation vector. It can also be said that an affine space is a generalization of a linear space, in that it doesn't require a specific origin point. From Wikipedia, again:</p> <blockquote> Any vector space may be considered as an affine space, and this amounts to forgetting the special role played by the zero vector. In this case, the elements of the vector space may be viewed either as points of the affine space or as displacement vectors or translations. When considered as a point, the zero vector is called the origin. Adding a fixed vector to the elements of a linear subspace of a vector space produces an affine subspace. One commonly says that this affine subspace has been obtained by translating (away from the origin) the linear subspace by the translation vector.</blockquote> <p>When mathematicians define new algebraic structures, they don't do it just for fun (well, sometimes they do) but because such structures have some properties which can lead to useful generalizations. Affine spaces and transformations also have interesting properties, which make them useful. For example, an affine transformation always maps a line to a line (and not to, say, a parabola). Any two triangles can be converted one to the other using an affine transform, and so on. This leads to interesting applications in computational geometry and 3D graphics.</p> </div> <div class="section" id="affine-functions-in-linear-regression-and-neural-networks"> <h2>Affine functions in linear regression and neural networks</h2> <p>Here I want to touch upon the linear vs. affine confusion again, in the context of machine learning. 
Recall that <a class="reference external" href="http://eli.thegreenplace.net/2016/linear-regression/">Linear Regression</a> attempts to fit a line onto data in an optimal way, the line being defined as the function:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/0e60e25963ba73aa9e55f1ebb41a3bf2460b7f28.svg" style="height: 18px;" type="image/svg+xml"> $y(x) = mx + b$</object> <p>But as this article explained, <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/370b21bb4fe6d65ddec7d4c585f09a5e49b55652.svg" style="height: 18px;" type="image/svg+xml">y(x)</object> is not actually a linear function; it's an affine function (because of the constant term <object class="valign-0" data="https://eli.thegreenplace.net/images/math/e9d71f5ee7c92d6dc9e92ffdad17b8bd49418f98.svg" style="height: 13px;" type="image/svg+xml">b</object>). Should linear regression be renamed to <em>affine regression</em>? It's probably too late for that :-), but it's good to get the terminology right.</p> <p>Similarly, a single fully connected layer in a neural network is often expressed mathematically as:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/9f626c0ce605723e39bd0dae81451b0cddee09b0.svg" style="height: 22px;" type="image/svg+xml"> $y(\vec{x})=W\vec{x}+\vec{b}$</object> <p>Where <object class="valign-0" data="https://eli.thegreenplace.net/images/math/f8914399eadbd8be3c3196100658870e03c61fee.svg" style="height: 13px;" type="image/svg+xml">\vec{x}</object> is the input vector, <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> is the weight matrix and <object class="valign-0" data="https://eli.thegreenplace.net/images/math/71fa108edb785ca9f729fa3cd5ad18556dd682e4.svg" style="height: 18px;" type="image/svg+xml">\vec{b}</object> is the bias vector.
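</p>

<p>The affine nature of such a layer is easy to demonstrate numerically. The sketch below uses made-up weights and bias, and shows the additivity failure derived earlier: the two results differ by exactly one extra copy of the bias.</p>

```python
# Sketch: a "linear" (actually affine) layer y = Wx + b, with made-up
# numbers, demonstrating that additivity fails when b is non-zero.
def layer(x, W=((1.0, 2.0), (3.0, 4.0)), b=(0.5, 0.5)):
    # Matrix-vector product Wx, plus the bias b, done by hand.
    return tuple(sum(wi * xi for wi, xi in zip(row, x)) + bi
                 for row, bi in zip(W, b))

def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

v, w = (1.0, 0.0), (0.0, 1.0)
print(layer(add(v, w)))         # f(v + w)
print(add(layer(v), layer(w)))  # f(v) + f(w): larger by exactly b
```

<p>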
This function is also usually referred to as <em>linear</em> although it's actually <em>affine</em>.</p> </div> <div class="section" id="affine-expressions-and-array-accesses"> <h2>Affine expressions and array accesses</h2> <p>Pivoting from algebra to programming, affine functions have a use when discussing one of the most fundamental building blocks of computer science: accessing arrays.</p> <p>Let's start by defining an <em>affine expression</em>:</p> <blockquote> An expression is affine w.r.t. variables <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3bca803fa0f8dd4ba421a15cbf1a2547ae0285b7.svg" style="height: 12px;" type="image/svg+xml">v_1,v_2,...,v_n</object> if it can be expressed as <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/258cc23dfefcbb9c4cf7ffbe169028181113b5a2.svg" style="height: 15px;" type="image/svg+xml">c_0+c_{1}v_1+...+c_{n}v_n</object> where <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9fa86460c3375a0934ab62697483f4692cdfb0a2.svg" style="height: 12px;" type="image/svg+xml">c_0,c_1,...,c_n</object> are constants.</blockquote> <p>Affine expressions are interesting because they are often used to index arrays in loops. 
Consider the following loop in C that copies all elements in an MxN matrix &quot;one to the left&quot;:</p> <div class="highlight"><pre><span></span><span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">arr</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">arr</span><span class="p">[</span><span class="n">i</span><span class="p">][</span><span class="n">j</span><span class="p">];</span>
  <span class="p">}</span>
<span class="p">}</span>
</pre></div> <p>Since C's memory layout for multi-dimensional arrays is <a class="reference external" href="http://eli.thegreenplace.net/2015/memory-layout-of-multi-dimensional-arrays">row-major</a>, the statement in the loop assigns a value to <tt class="docutils literal">arr[i*N + j - 1]</tt> at every iteration. <tt class="docutils literal">i*N + j - 1</tt> is an <em>affine expression</em> w.r.t.
variables <tt class="docutils literal">i</tt> and <tt class="docutils literal">j</tt> <a class="footnote-reference" href="#id13" id="id7"></a>.</p> <p>When all expressions in a loop are affine, the loop is amenable to some advanced analyses and optimizations, but this is a topic for another post.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>Though it's also not entirely precise. Generally speaking, transformations are more limited than functions. A transformation is defined on a set as a bijection of the set to itself, whereas functions are more general (they can map between different sets, for example).</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id9" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>Finite-dimensional vector spaces with a defined basis.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>Tossing a bit of rigor aside, we can imagine points and vectors to be isomorphic since both are represented by pairs of numbers on the <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" /> plane.
Some resources will mention the <em>Euclidean plane</em> - <object class="valign-0" data="https://eli.thegreenplace.net/images/math/49853b597499c984c2d89848a19153d282da202c.svg" style="height: 15px;" type="image/svg+xml">\mathbb{E}^2</object> when talking about points and lines, but the Euclidean plane can be modeled by a same-dimensional real plane so I'll just be using <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" />.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id11" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>I'll admit this result looks fairly obvious. But longer chains of transforms work in exactly the same way, and the fact that we can represent such chains with a single matrix is very useful.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id12" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id6"></a></td><td>This trick has a geometrical explanation: translation in 2D can be modeled as adding a dimension and performing a 3D <em>shear</em> operation, then projecting the resulting object onto a 2D plane again. 
The object will appear shifted.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id13" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id7"></a></td><td>It's actually only affine if <tt class="docutils literal">N</tt> is a compile-time constant or can be proven to be constant throughout the loop.</td></tr> </tbody> </table> </div> Logistic regression2016-11-02T05:45:00-07:002016-11-02T05:45:00-07:00Eli Benderskytag:eli.thegreenplace.net,2016-11-02:/2016/logistic-regression/<p>This article covers logistic regression - arguably the simplest classification model in machine learning; it starts with basic binary classification, and ends up with some techniques for multinomial classification (selecting between multiple possibilities). The final examples using the softmax function can also be viewed as an example of a single-layer fully …</p><p>This article covers logistic regression - arguably the simplest classification model in machine learning; it starts with basic binary classification, and ends up with some techniques for multinomial classification (selecting between multiple possibilities). The final examples using the softmax function can also be viewed as an example of a single-layer fully connected neural network.</p> <p>This article is the theoretical part; in addition, there's quite a bit of accompanying code <a class="reference external" href="https://github.com/eliben/deep-learning-samples/tree/master/logistic-regression">here</a>. 
All the models discussed in the article are implemented from scratch in Python using only Numpy.</p> <div class="section" id="linear-model-for-binary-classification"> <h2>Linear model for binary classification</h2> <p>Using a linear model for binary classification is very similar to <a class="reference external" href="http://eli.thegreenplace.net/2016/linear-regression/">linear regression</a>, except that we expect a binary (yes/no) answer rather than a numeric answer.</p> <p>We want to come up with a parameter vector <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />, such that for every data vector <strong>x</strong> we can compute <a class="footnote-reference" href="#id10" id="id1"></a>:</p> <img alt="$\hat{y}(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n$" class="align-center" src="https://eli.thegreenplace.net/images/math/ae682f9fda97c28c8e100c87aecad635c7c1d96c.png" style="height: 18px;" /> <p>And then make a binary decision based on the value of <img alt="\hat{y}(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/11533fb1b0218620907f5859e6e22aeb65c12cd8.png" style="height: 18px;" />. A simple way to make a decision is to say &quot;yes&quot; if <img alt="\hat{y}(x)\geq 0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/c30aad52f5af131a89f1a8805e25aa8e354795dc.png" style="height: 18px;" /> and &quot;no&quot; otherwise. Note that this is arbitrary, as we could flip the condition for &quot;yes&quot; and for &quot;no&quot;. 
We could also compare <img alt="\hat{y}(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/11533fb1b0218620907f5859e6e22aeb65c12cd8.png" style="height: 18px;" /> to some value other than zero, and the model would learn equally well <a class="footnote-reference" href="#id12" id="id2"></a>.</p> <p>Let's make this more concrete, also assigning numeric values to &quot;yes&quot; and &quot;no&quot;, which will make some computations simpler later on. For &quot;yes&quot; we'll (again, arbitrarily) select +1, and for &quot;no&quot; we'll go with -1. So, a linear model for binary classification is parameterized by some <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />, such that:</p> <img alt="$\hat{y}(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n$" class="align-center" src="https://eli.thegreenplace.net/images/math/ae682f9fda97c28c8e100c87aecad635c7c1d96c.png" style="height: 18px;" /> <p>And:</p> <img alt="$class(x)=\left\{\begin{matrix} +1 &amp;amp; \operatorname{if}\ \hat{y}(x)\geq 0\\ -1 &amp;amp; \operatorname{if}\ \hat{y}(x)&amp;lt; 0 \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/092debeba72a26bd76603bd3ce140fc798e5f692.png" style="height: 43px;" /> <p>It helps to see a graphical example of how this looks in practice. As usual, we'll have to stick to low dimensionality if we want to visualize things, so let's use 2D data points.</p> <p>Since our data is in 2D, we need a 3D <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> (<img alt="\theta_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/ba6201ddbe2fd0bb66e0704ad8b3c6bdb36f37aa.png" style="height: 15px;" /> for the bias).
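In code, this decision rule is just a dot product followed by a sign check. A quick NumPy sketch (the parameters and sample points here are made up for illustration, with the convention that x[0] = 1 is the bias coordinate):

```python
import numpy as np

def classify(theta, x):
    """Classify a data vector x (with x[0] == 1 for the bias) as +1 or -1."""
    y_hat = np.dot(theta, x)
    return 1 if y_hat >= 0 else -1

theta = np.array([1.0, 2.0, -1.0])                  # arbitrary parameters
print(classify(theta, np.array([1.0, 0.2, 0.3])))   # 1 + 0.4 - 0.3 >= 0 -> 1
print(classify(theta, np.array([1.0, -2.0, 0.5])))  # 1 - 4 - 0.5 < 0   -> -1
```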
Let's pick <img alt="\theta=(4,-0.5, -1)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/6cb259a86870d3bd0a5ad2f839d0515bfc70f0d7.png" style="height: 18px;" />. Plotting <img alt="\hat{y}(x)=\theta \cdot x" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e0a45fd444b0526e19a0f22fb3c264b026fb3bcf.png" style="height: 18px;" /> will give us a plane in 3D, but what we're really interested in is just to know whether <img alt="\hat{y}(x) \geq 0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/39ad82f3252b80454caa343952948440827f2961.png" style="height: 18px;" />. So we can draw this plane's intersection with the x/y axis:</p> <img alt="Line for binary classification" class="align-center" src="https://eli.thegreenplace.net/images/2016/binary-classification-line.png" /> <p>We can play with some sample points to see that everything &quot;to the right&quot; of the line gives us <img alt="\hat{y}(x) &amp;gt; 0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d686dc49d4c08e21f67c22cbb42aab2a1f3d3875.png" style="height: 18px;" />, and everything &quot;to the left&quot; of it gives us <img alt="\hat{y}(x) &amp;lt; 0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d8a7e77c45cecd8e4ba7c8f7d1f02944e9b55ecf.png" style="height: 18px;" /> <a class="footnote-reference" href="#id13" id="id3"></a>.</p> </div> <div class="section" id="loss-functions-for-binary-classification"> <h2>Loss functions for binary classification</h2> <p>How do we find the right <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> for a classification problem? Similarly to linear regression, we're going to define a &quot;loss function&quot; and then train a classifier by minimizing this loss with gradient descent. 
However, here picking a good loss function is not as simple - it turns out square loss doesn't work very well, as we'll see soon.</p> <p>Let's start by considering the most logical loss function to use for classification - the number of misclassified data samples. This is called the 0/1 loss, and it's the true measure of how well a classifier works. Say we have 1000 samples, our classifier placed 960 of them in the right category, and got the wrong answer for the other 40 samples. So the loss would be 40. A better classifier may get it wrong only 35 times, so its loss would be smaller.</p> <p>It will be helpful to plot loss functions, so let's add another definition we're going to be using a lot here: the <em>margin</em>. For a given sample <strong>x</strong>, and its correct classification <em>y</em>, the margin of classification is <img alt="m=\hat{y}(x)y" class="valign-m4" src="https://eli.thegreenplace.net/images/math/fc8c312b137c8aafaaebd881836e4332cc14e61f.png" style="height: 18px;" />. Recall that <em>y</em> is either +1 or -1, so the margin is either <img alt="\hat{y}(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/11533fb1b0218620907f5859e6e22aeb65c12cd8.png" style="height: 18px;" /> or its negation, depending on the correct answer. Note that the margin is positive when our guess is correct (both <img alt="\hat{y}(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/11533fb1b0218620907f5859e6e22aeb65c12cd8.png" style="height: 18px;" /> and y have the same sign) and negative when our guess is wrong. 
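The margin is a one-liner to compute; here's a small sketch with made-up predictions and labels:

```python
import numpy as np

# Raw linear outputs y_hat and the corresponding correct labels y (+1/-1).
y_hat = np.array([2.3, -0.7, 0.4, -1.5])
y = np.array([1, -1, -1, 1])

margin = y_hat * y   # elementwise: 2.3, 0.7, -0.4, -1.5
# The first two samples were classified correctly (positive margin),
# the last two incorrectly (negative margin).
print(margin > 0)
```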
With this in hand, we define 0/1 loss as:</p> <img alt="$L_{01}(m) = \mathbb{I}(m \leq 0)$" class="align-center" src="https://eli.thegreenplace.net/images/math/e9731883ade0db9b166741b2ff53a8167a8e3ffd.png" style="height: 18px;" /> <p>Where <img alt="\mathbb{I}" class="valign-0" src="https://eli.thegreenplace.net/images/math/3dcdffb11a6b55b62a0c9e29d85dd9120f5945f4.png" style="height: 12px;" /> is an <em>indicator function</em> taking the value 1 when its condition is true and the value 0 otherwise. Here is the plot of <img alt="L_{01}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3ed6799c7063de4663bdeab8fa126196f41bcd0f.png" style="height: 16px;" /> as a function of margin:</p> <img alt="0/1 loss for binary classification" class="align-center" src="https://eli.thegreenplace.net/images/2016/binary-01-loss.png" /> <p>Unfortunately, the 0/1 loss is fairly hostile to gradient descent optimization, since it's not convex. This is easy to see intuitively. Suppose we have some <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> that gives us a margin of -1.5. The 0/1 loss for this margin is 1, but how can we improve it? Small nudges to <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> will still give us a margin very close to -1.5, which results in exactly the same loss. We don't know which way to nudge <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> since either way we get the same outcome. In other words, there's no slope to follow here.</p> <p>That's not to say all is lost. Some work is being done with optimizing 0/1 losses for classification, but this is a bit outside the mainstream of machine learning. 
Here's an <a class="reference external" href="http://jmlr.org/proceedings/papers/v28/nguyen13a.pdf">interesting paper</a> that discusses some approaches. It's fascinating for computer science geeks since it uses combinatorial search techniques. The rest of this post, however, will use 0/1 loss only as an idealized limit, trying other kinds of loss we can actually run gradient descent with.</p> <p>The first such loss that comes to mind is square loss, the same one we use in linear regression. We'll define the square loss as a function of margin:</p> <img alt="$L_2(m) = (m - 1)^2$" class="align-center" src="https://eli.thegreenplace.net/images/math/ea06356db44999485977e3a7e6ff5e97e617b1bb.png" style="height: 21px;" /> <p>The reason we do this is to get two desired outcomes at important points: at <img alt="m=1" class="valign-m1" src="https://eli.thegreenplace.net/images/math/002d212eace214d48ccf82c7bc33021b1d9cdb91.png" style="height: 13px;" /> we want the loss to be 0, since this is actually the correct classification: we only get <img alt="m=1" class="valign-m1" src="https://eli.thegreenplace.net/images/math/002d212eace214d48ccf82c7bc33021b1d9cdb91.png" style="height: 13px;" /> when either both <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/45f0241f56d9823eb2d24a228d7ffe62c5fdcdc2.svg" style="height: 16px;" type="image/svg+xml">y=1</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/c5f34fb4e66b84bde15d596cf76efd468983c4d5.svg" style="height: 17px;" type="image/svg+xml">\hat{y}=1</object> or when both <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ad8ddd3de86ba8af8476af79d20b151a251ec117.svg" style="height: 16px;" type="image/svg+xml">y=-1</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/4ae2d248963bac1702c9e5e1f1d0769126f0c479.svg" style="height: 17px;" type="image/svg+xml">\hat{y}=-1</object>.</p> <p>Furthermore, to approximate the 0/1 
loss, we want our loss at <img alt="m=0" class="valign-0" src="https://eli.thegreenplace.net/images/math/5e49227d625a223efeaa8d7bc48bb0b87f878bff.png" style="height: 12px;" /> to be 1. Here's a plot of the square loss together with 0/1 loss:</p> <img alt="0/1 loss and square loss for binary classification" class="align-center" src="https://eli.thegreenplace.net/images/2016/binary-01-with-square-loss.png" /> <p>A couple of problems are immediately apparent with the square loss:</p> <ol class="arabic simple"> <li>It penalizes correct classification as well, in case the margin is very positive. This is not something we want! Ideally, we want the loss to be 0 starting with <img alt="m=1" class="valign-m1" src="https://eli.thegreenplace.net/images/math/002d212eace214d48ccf82c7bc33021b1d9cdb91.png" style="height: 13px;" /> and for all subsequent values of <em>m</em>.</li> <li>It very strongly penalizes outliers. One sample that we misclassified badly can shift the training too much.</li> </ol> <p>We could try to fix these problems by using clamping of some sort, but there is another loss function which serves as a much better approximation to 0/1 loss. 
It's called &quot;hinge loss&quot;:</p> <img alt="$L_h(m) = max(0, 1-m)$" class="align-center" src="https://eli.thegreenplace.net/images/math/dd883f12c7f609fe9256e0e6bb4cfdf319d07844.png" style="height: 18px;" /> <p>And its plot, along with the previously shown losses:</p> <img alt="0/1 loss, square loss and hinge loss for binary classification" class="align-center" src="https://eli.thegreenplace.net/images/2016/binary-01-with-square-and-hinge-loss.png" /> <p>Note that the hinge loss also matches 0/1 loss on the two important points: <img alt="m=0" class="valign-0" src="https://eli.thegreenplace.net/images/math/5e49227d625a223efeaa8d7bc48bb0b87f878bff.png" style="height: 12px;" /> and <img alt="m=1" class="valign-m1" src="https://eli.thegreenplace.net/images/math/002d212eace214d48ccf82c7bc33021b1d9cdb91.png" style="height: 13px;" />. It also has some nice properties:</p> <ol class="arabic simple"> <li>It doesn't penalize correct classification after <img alt="m=1" class="valign-m1" src="https://eli.thegreenplace.net/images/math/002d212eace214d48ccf82c7bc33021b1d9cdb91.png" style="height: 13px;" />.</li> <li>It penalizes incorrect classifications, but not as much as square loss.</li> <li>It's convex (at least where it matters - where the loss is nonzero)! If we get <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/5abbd129a48c53a04b0caa6eef4d760329f02149.svg" style="height: 14px;" type="image/svg+xml">m=-1.5</object> we can actually examine the loss in its very close vicinity and find a slope we can use to improve the loss. 
So, unlike 0/1 loss, it's amenable to gradient descent optimization.</li> </ol> <p>There are other loss functions used to train binary classifiers, such as log loss, but I will leave them out of this post.</p> <p>This is a good place to mention that hinge loss leads naturally to <a class="reference external" href="https://en.wikipedia.org/wiki/Support_vector_machine#SVM_and_the_hinge_loss">SVMs</a> (support vector machines), an interesting technique I'll leave for some other time.</p> </div> <div class="section" id="finding-a-classifier-with-gradient-descent"> <h2>Finding a classifier with gradient descent</h2> <p>With a loss function in hand, we can use <a class="reference external" href="http://eli.thegreenplace.net/2016/understanding-gradient-descent/">gradient descent</a> to find a good classifier for some data. The procedure is very similar to what we've been doing for linear regression:</p> <p>Given a loss function, we compute the loss gradient with respect to each <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> and update <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> for the next step:</p> <img alt="$\theta_{j}=\theta_{j}-\eta\frac{\partial L}{\partial \theta_{j}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/561a940034503fe1bb00e86c90ac130cb351d73b.png" style="height: 42px;" /> <p>Where <img alt="\eta" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2899aeb886ad0fa72652bffd5511e452aaf084ab.png" style="height: 12px;" /> is the learning rate.</p> </div> <div class="section" id="computing-gradients-for-our-loss-functions-with-regularization"> <h2>Computing gradients for our loss functions, with regularization</h2> <p>The only remaining part is computing the gradients for the square and hinge loss
functions we've defined. In addition, I'm going to add &quot;<img alt="L_2" class="valign-m3" src="https://eli.thegreenplace.net/images/math/0d2398f5890edff3f40f1686fc3b51528209bf9b.png" style="height: 15px;" /> regularization&quot; to the loss as a means to prevent overfitting for the training data. <a class="reference external" href="https://en.wikipedia.org/wiki/Regularization_(mathematics)">Regularization</a> is an important component of the learning algorithm. <img alt="L_2" class="valign-m3" src="https://eli.thegreenplace.net/images/math/0d2398f5890edff3f40f1686fc3b51528209bf9b.png" style="height: 15px;" /> regularization adds the sum of the squares of all parameters to the loss, and thus &quot;tries&quot; to keep parameters low. This way, we don't end up over-emphasizing one or a group of parameters over the others.</p> <p>Here is square loss with regularization <a class="footnote-reference" href="#id14" id="id4"></a>:</p> <img alt="$L_2=\frac{1}{k}\sum_{i=1}^{k}(m^{(i)}-1)^2+\frac{\beta}{2}\sum_{j=0}^{n}\theta_{j}^2$" class="align-center" src="https://eli.thegreenplace.net/images/math/a9735ff6606b3ad3454c3dfefc541c21b926d541.png" style="height: 56px;" /> <p>This is assuming we have <em>k</em> data points (<em>n+1</em> dimensional) and <em>n+1</em> parameters (including the special 0th parameter representing the bias). The total loss is the square loss averaged over all data points, plus the regularization loss. <img alt="\beta" class="valign-m4" src="https://eli.thegreenplace.net/images/math/6499d503bfc00cadae1440b191c52a8632e2f8c4.png" style="height: 16px;" /> is the regularization &quot;strength&quot; (another hyper-parameter in the learning algorithm).</p> <p>Let's start by computing the derivative of the margin. 
Using superscripts for indexing data items, recall that:</p> <img alt="$m^{(i)}=\hat{y}^{(i)}y^{(i)}=(\theta_0 x_0^{(i)}+\cdots + \theta_n x_n^{(i)})y^{(i)}$" class="align-center" src="https://eli.thegreenplace.net/images/math/bce48f26ac61cbfd37c8bfbaad0004e5c30ccbbc.png" style="height: 26px;" /> <p>Therefore:</p> <img alt="$\frac{\partial m^{(i)}}{\partial \theta_j}=x_j^{(i)}y^{(i)}$" class="align-center" src="https://eli.thegreenplace.net/images/math/fd79e2321a3ee607dbf3840535d1a8a2327e2117.png" style="height: 47px;" /> <p>With this in hand, it's easy to compute the gradient of <img alt="L_2" class="valign-m3" src="https://eli.thegreenplace.net/images/math/0d2398f5890edff3f40f1686fc3b51528209bf9b.png" style="height: 15px;" /> loss.</p> <img alt="$\frac{\partial L_2}{\partial \theta_j}=\frac{2}{k}\sum_{i=1}^{k}(m^{(i)}-1)x_{j}^{(i)}y^{(i)}+\beta\theta_j$" class="align-center" src="https://eli.thegreenplace.net/images/math/2340ff828a85ab17aa5067b4985cf9da4fd5fae7.png" style="height: 54px;" /> <p>Now let's turn to hinge loss. The total loss for the data set with regularization is:</p> <img alt="$L_h=\frac{1}{k}\sum_{i=1}^{k}max(0, 1-m^{(i)})+\frac{\beta}{2}\sum_{j=0}^{n}\theta_{j}^2$" class="align-center" src="https://eli.thegreenplace.net/images/math/2ce4a6debf2650ea4c8a1ff24ce8e42f3d370a6e.png" style="height: 56px;" /> <p>The tricky part here is finding the derivative of the <img alt="max" class="valign-0" src="https://eli.thegreenplace.net/images/math/0706025b2bbcec1ed8d64822f4eccd96314938d0.png" style="height: 8px;" /> function with respect to <img alt="\theta_j" class="valign-m6" src="https://eli.thegreenplace.net/images/math/56adcea6f10a3cd4a439536412c7fb690f803bc9.png" style="height: 18px;" />. 
I find it easier to reason about functions like <img alt="max" class="valign-0" src="https://eli.thegreenplace.net/images/math/0706025b2bbcec1ed8d64822f4eccd96314938d0.png" style="height: 8px;" /> when the different cases are cleanly separated:</p> <img alt="$max(0,1-m^{(i)})=\left\{\begin{matrix} 1-m^{(i)} &amp;amp; \operatorname{if}\ m^{(i)}&amp;lt; 1\\ 0 &amp;amp; \operatorname{if}\ m^{(i)}\geq 1 \end{matrix}\right.$" class="align-center" src="https://eli.thegreenplace.net/images/math/884d533e1ff8dd51ae43a229bc2f86bc72e82c2a.png" style="height: 46px;" /> <p>We already know the derivative of <img alt="m^{(i)}" class="valign-0" src="https://eli.thegreenplace.net/images/math/0971cbdfca7ab3d5c094d8a8e75c77ccf66e4715.png" style="height: 17px;" /> with respect to <img alt="\theta_j" class="valign-m6" src="https://eli.thegreenplace.net/images/math/56adcea6f10a3cd4a439536412c7fb690f803bc9.png" style="height: 18px;" />. So it's easy to derive this expression case-by-case:</p> <img alt="$\frac{\partial max(0,1-m^{(i)})}{\partial \theta_j}=\left\{\begin{matrix} -x_j^{(i)}y^{(i)} &amp;amp; \operatorname{if}\ m^{(i)}&amp;lt; 1\\ 0 &amp;amp; \operatorname{if}\ m^{(i)}\geq 1 \end{matrix}\right.$" class="align-center" src="https://eli.thegreenplace.net/images/math/4feb3f18ab008352c513de8508c4e8f877510167.png" style="height: 54px;" /> <p>And the overall gradient of the hinge loss is:</p> <img alt="$\frac{\partial L_h}{\partial \theta_j}=\frac{1}{k}\sum_{i=1}^{k}\frac{\partial max(0,1-m^{(i)})}{\partial \theta_j}+\beta\theta_j$" class="align-center" src="https://eli.thegreenplace.net/images/math/d3113e543be93630457f9501379fe0b6956d9342.png" style="height: 54px;" /> </div> <div class="section" id="experiments-with-synthetic-data"> <h2>Experiments with synthetic data</h2> <p>Let's see an example of learning a binary classifier in action.
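Before turning to the real code, here's a compact, self-contained sketch of this training loop - gradient descent with the hinge loss gradient derived above - on a tiny made-up data set (the data and hyper-parameters are illustrative only, not from the linked sample):

```python
import numpy as np

def train_hinge(X, y, eta=0.1, beta=0.0, steps=2000):
    """Gradient descent on hinge loss with L2 regularization strength beta.

    Rows of X are data items with a leading 1 for the bias; y holds +1/-1.
    """
    k, n = X.shape
    theta = np.zeros(n)
    for _ in range(steps):
        m = (X @ theta) * y                    # margin of every data item
        active = (m < 1).astype(float)         # items where hinge is nonzero
        # d max(0, 1-m_i)/d theta_j is -x_j*y_i for active items, 0 otherwise
        grad = -(X * (y * active)[:, None]).sum(axis=0) / k + beta * theta
        theta -= eta * grad
    return theta

# Tiny separable data set: the label is +1 when x1 + x2 > 1.
X = np.array([[1, 0.2, 0.1], [1, 0.9, 0.8], [1, 0.1, 0.4], [1, 0.7, 0.9]])
y = np.array([-1, 1, -1, 1])
theta = train_hinge(X, y)
print(np.sign(X @ theta) == y)   # all items should end up classified correctly
```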
<a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/logistic-regression/simple_binary_classifier.py">This code sample</a> generates some synthetic data in two dimensions and then uses the approach described so far in the post to train a binary classifier. Here's a sample data set:</p> <img alt="Synthetic data for binary classification" class="align-center" src="https://eli.thegreenplace.net/images/2016/synthetic-data.png" /> <p>The data points for which the correct answer is positive (<em>y=1</em>) are the green crosses; the ones for which the correct answer is negative (<em>y=-1</em>) are the red dots. Note that I include a small number of negative outliers (red dots where we'd expect only green crosses to be) to test the classifier on realistic, imperfect data.</p> <p>The sample code can use combinatorial search to find a &quot;best&quot; set of parameters that results in the lowest 0/1 loss - the lowest number of misclassified data items. Note that misclassifying some items in this data set is inevitable (with a linear classifier), because of the outliers. Here is the contour line showing how the classification decision is made with parameters found by doing the combinatorial search:</p> <img alt="Synthetic data for binary classification with only 0/1 loss" class="align-center" src="https://eli.thegreenplace.net/images/2016/synthetic-data-only-01-loss.png" /> <p>The 0/1 loss - number of misclassified data items - for this set of parameters is 20 out of 400 data items (95% correct prediction rate).</p> <p>Next, the code trains a classifier using square loss, and another using hinge loss. 
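Counting misclassified items - the 0/1 loss used as the yardstick above - takes only a couple of lines. A sketch with hypothetical data, reusing the earlier example's parameters:

```python
import numpy as np

def zero_one_loss(theta, X, y):
    """Number of misclassified items; rows of X carry a leading 1 for bias."""
    predictions = np.where(X @ theta >= 0, 1, -1)
    return int((predictions != y).sum())

# Hypothetical 2D data (plus the bias column of ones) and labels.
X = np.array([[1, 0.5, 2.0], [1, 3.0, 1.0], [1, 6.0, 0.5], [1, 7.0, 4.0]])
y = np.array([1, 1, -1, -1])
theta = np.array([4.0, -0.5, -1.0])
print(zero_one_loss(theta, X, y))   # 1: only the third item is misclassified
```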
I'm not using regularization for this data set, since with only 3 parameters there can't be too much selective bias between them; in other words, <img alt="\beta=0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3bb1ac87ba8d8d0c95fd43b91640c0b96f8e72d9.png" style="height: 16px;" />.</p> <p>A classifier trained with square loss misclassifies 32 items (92% success rate). A classifier trained with hinge loss misclassifies 26 items (93.5% success rate, much closer to the &quot;perfect&quot; rate). This is to be expected from the earlier discussion - square loss very strongly penalizes outliers, which makes it more skewed on this data <a class="footnote-reference" href="#id15" id="id5"></a>. Here are the contour plots for all losses that demonstrate this graphically:</p> <img alt="Synthetic data for binary classification with all losses" class="align-center" src="https://eli.thegreenplace.net/images/2016/synthetic-data-all-losses.png" /> </div> <div class="section" id="binary-classification-of-mnist-digits"> <h2>Binary classification of MNIST digits</h2> <p>The <a class="reference external" href="https://en.wikipedia.org/wiki/MNIST_database">MNIST dataset</a> is the &quot;hello world&quot; of machine learning these days. It's a database of grayscale images representing handwritten digits, with a correct label for each of these images.</p> <p>MNIST is usually employed for the more general multinomial classification problem - classifying a given data item into one of multiple classes (0 to 9 in the case of MNIST). We'll address this in a later section.</p> <p>Here, however, we can experiment with training a binary classifier on MNIST. The idea is to train a classifier that recognizes some single label. For example, a classifier answering the question &quot;is this an image of the digit 4&quot;. 
This is a binary classification problem, since there are only two answers - &quot;yes&quot; and &quot;no&quot;.</p> <p><a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/logistic-regression/mnist_binary_classifier.py">Here's a code sample</a> that trains such a classifier, using the hinge loss function (since we've already determined it gives better results than square loss for classification problems).</p> <p>It starts by converting the correct labels of MNIST from the numeric range 0-9 to +1 or -1 based on whether the label is 4:</p> <div class="highlight"><pre><span></span>     0        -1
     1        -1
     4         1
     9        -1
y =  3  ==&gt;  -1
     8        -1
     5        -1
    ...
     4         1
</pre></div> <p>Then all we have is a binary classification problem, albeit one that is 785-dimensional (784 dimensions for each of the 28x28 pixels in the input images, plus one for bias). Visualizing the separating contours would be quite challenging here, but we can now trust the math to know what's going on. Other than this, the code for gradient descent is <em>exactly the same</em> as for the simple 2D synthetic data shown earlier.</p> <p>My goal here is not to design a state-of-the-art machine learning architecture, but to explain how the main parts work. So I didn't tune the model too much, but it's possible to get 98% accuracy on this binary formulation of MNIST by tuning the code a bit. While 98% sounds great, recall that we could get 90% just by saying &quot;no&quot; to every digit :-) Feel free to play with the code to see if you can get even higher numbers; I don't really expect record-beating numbers from this model, though, since it's so simple.</p>
For example, &quot;what is the chance of rain tomorrow?&quot; rather than &quot;will there be rain, yes or no?&quot;. A probability carries extra information: &quot;90% chance of rain&quot; and &quot;56% chance of rain&quot; both map to the same binary &quot;yes&quot; (assuming a 50% cutoff), yet they tell us very different things.</p> <p>Moreover, note that the linear model we've trained actually provides more information already, giving a numerical answer. We choose to cut it off at 0, saying yes for positive and no for negative numbers. But some numbers are more positive (or negative) than others!</p> <p>Quick thought experiment: can we somehow interpret the response before the cutoff as a probability? The main problem here is that probabilities must be in the range [0, 1], while the linear model gives us an arbitrary real number. We may end up with negative probabilities or probabilities over 1, neither of which makes much sense. So we'll want to find some mathematical way to &quot;squish&quot; the result into the valid [0, 1] range.
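As a quick numeric check that such a &quot;squishing&quot; function exists, here's a small sketch applying one common choice (defined formally in a moment) to some made-up model outputs:

```python
import math

def sigmoid(z):
    # Maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Raw linear-model outputs: arbitrary reals, useless as probabilities...
raw_outputs = [-7.2, -0.5, 0.0, 3.1, 8.0]

# ...but after squashing, every value lands strictly inside (0, 1),
# and the ordering of the outputs is preserved.
probs = [sigmoid(z) for z in raw_outputs]
```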
A common way to do this is to use the logistic function:</p> <img alt="$S(z) = \frac{1}{1 + e^{-z}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/62429be191903e2433ba80f92aaf1044568b831d.png" style="height: 38px;" /> <p>It's also known as the &quot;sigmoid&quot; function because of its S-like shape:</p> <img alt="Sigmoid function" class="align-center" src="https://eli.thegreenplace.net/images/2016/sigmoid.png" /> <p>We're going to assign <img alt="\hat{y}(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/11533fb1b0218620907f5859e6e22aeb65c12cd8.png" style="height: 18px;" /> into the <em>z</em> variable of the sigmoid, to get the function:</p> <img alt="$S(x) = \frac{1}{1 + e^{-(\theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n)}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/2b9f6770ff23ed08c38a9ab5c3b5972f5d002ddb.png" style="height: 39px;" /> <p>And now, the answer we get can be interpreted as a probability between 0 and 1 (without actually touching either asymptote) <a class="footnote-reference" href="#id16" id="id6"></a>. We can train a model to get as close to 1 as possible for training samples where the true answer is &quot;yes&quot; and as close to 0 as possible for training samples where the true answer is &quot;no&quot;. This is called &quot;logistic regression&quot; due to the use of the logistic function.</p> </div> <div class="section" id="training-logistic-regression-with-the-cross-entropy-loss"> <h2>Training logistic regression with the cross-entropy loss</h2> <p>Earlier in this post, we've seen how a number of loss functions fare for the binary classifier problem. It turns out that for logistic regression, a very natural loss function exists that's called <a class="reference external" href="https://en.wikipedia.org/wiki/Cross_entropy#Cross-entropy_error_function_and_logistic_regression">cross-entropy</a> (also sometimes &quot;logistic loss&quot; or &quot;log loss&quot;). 
This loss function is derived from probability and information theory, and its derivation is outside the scope of this post (check out <a class="reference external" href="http://neuralnetworksanddeeplearning.com/chap3.html">Chapter 3 of Michael Nielsen's online book</a> for a nice intuitive explanation for why this loss function makes sense).</p> <p>The formulation of cross-entropy we're going to use here starts from the most general:</p> <img alt="$C(x^{(i)})=-\sum_{t} p^{(i)}_t log(p(y^{(i)}=t|\theta))$" class="align-center" src="https://eli.thegreenplace.net/images/math/a689c6537836933fae93c80a71cd52ff88703a78.png" style="height: 41px;" /> <p>Let's unravel this definition, step by step. The parenthesized superscript <img alt="x^{(i)}" class="valign-0" src="https://eli.thegreenplace.net/images/math/233014006c0adbee71ec71ba3a70f22ad1b906a1.png" style="height: 17px;" /> denotes, as usual, the <em>ith</em> input sample. <em>t</em> runs over all the possible outcomes; <img alt="p_t" class="valign-m4" src="https://eli.thegreenplace.net/images/math/aaf082725869f54161f39f7d9c39fff25c52ac94.png" style="height: 12px;" /> is the actual probability of outcome <em>t</em> and inside the <em>log</em> we have the conditional probability of this outcome given the regression parameters - in other words, this is the model's prediction <a class="footnote-reference" href="#id17" id="id7"></a>.</p> <p>To make this more concrete, in our case we have two possible outcomes in the training data: either <img alt="y^{(i)}=+1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e3495884df85359610f062a6a6428fba7891bb8.png" style="height: 21px;" /> or <img alt="y^{(i)}=-1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/8465f16030efd8eab0982f1e60b8ff292317cdbe.png" style="height: 21px;" />. Given any such outcome, its &quot;actual&quot; probability is either 1 (when we get this outcome in the training data) or 0 (when we don't). 
So for any given sample, one of the two possible values of <em>t</em> has <img alt="p^{(i)}_t=0" class="valign-m5" src="https://eli.thegreenplace.net/images/math/eedcbf364060646a9b6abfccb8e9dda67a645ff0.png" style="height: 25px;" /> and the other has <img alt="p^{(i)}=1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e44b2858aeb1c845d09a851cbea5fdc9c465199e.png" style="height: 21px;" />. Therefore, we get <a class="footnote-reference" href="#id18" id="id8"></a>:</p> <img alt="$C(x^{(i)})=\left\{ \begin{matrix} -log(S(x^{(i)}) &amp;amp; \operatorname{if}\ y^{(i)}=+1 \\ -log(1-S(x^{(i)})) &amp;amp; \operatorname{if}\ y^{(i)}=-1 \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/97e3fd44d870673c7a74047b82e30c993a9bec59.png" style="height: 46px;" /> <p>The second possibility has <img alt="-log(1-S(x^{(i)}))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0b34c17378147a8a82db655998c07649ca71ed39.png" style="height: 21px;" /> because we define <img alt="S(z)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/61bc9efb9d2c99669df519617ee7daee7670e156.png" style="height: 18px;" /> to predict the probability of the answer being +1; therefore, the probability of the answer being -1 is <img alt="1-S(z)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d006e787dd01f802c9c5cb570e39a44cb133b2ce.png" style="height: 18px;" />.</p> <p>This is the cross-entropy loss for a single sample <img alt="x^{(i)}" class="valign-0" src="https://eli.thegreenplace.net/images/math/233014006c0adbee71ec71ba3a70f22ad1b906a1.png" style="height: 17px;" />. 
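The piecewise per-sample loss translates directly into code. A minimal sketch, where <tt class="docutils literal">s</tt> stands for S(x^(i)), the model's predicted probability of the +1 outcome:

```python
import math

def cross_entropy_loss(s, y):
    # s: predicted probability that the answer is +1, strictly in (0, 1).
    # y: the true label, +1 or -1.
    if y == 1:
        return -math.log(s)        # low cost when s is close to 1
    else:
        return -math.log(1.0 - s)  # low cost when s is close to 0
```

A confident correct prediction (s = 0.99, y = +1) costs about 0.01, while the same prediction with y = -1 costs about 4.6; the loss grows without bound as the model gets more certain and more wrong.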
To get the total loss over a data set, we take the average sample loss, as usual:</p> <img alt="$C = \frac{1}{k}\sum_{i=1}^{k} C(x^{(i)})$" class="align-center" src="https://eli.thegreenplace.net/images/math/642351dc03ee1f11eca503f558971282d5c700e7.png" style="height: 54px;" /> <p>Now let's compute the gradient of this loss function, so we can use it to train a model. Starting with the +1 case, we have:</p> <img alt="$C_{+1} = -log(S(x^{(i)}))$" class="align-center" src="https://eli.thegreenplace.net/images/math/986418ed0bf4c05742c9a412a0918ed00108d93d.png" style="height: 23px;" /> <p>Then:</p> <img alt="$\frac{\partial C_{+1}}{\partial \theta_j} = \frac{-1}{S(x^{(i)})}\frac{\partial S(x^{(i)})}{\partial \theta_j}$" class="align-center" src="https://eli.thegreenplace.net/images/math/c91fbcf0bb4630112d1efa5adbb8756c25512c68.png" style="height: 47px;" /> <p>Here it will be helpful to use the following identity, which can be easily verified by going through the math <a class="footnote-reference" href="#id19" id="id9"></a>:</p> <img alt="$S&amp;#x27;(z)=S(z)(1-S(z))$" class="align-center" src="https://eli.thegreenplace.net/images/math/3d880e07d60096518b916e877cd6a8496c39bc37.png" style="height: 20px;" /> <p>Since in our case <img alt="S(x^{(i)})" class="valign-m4" src="https://eli.thegreenplace.net/images/math/8a85ab5b49ac41fe751ac8b29e2f2e76f34650bb.png" style="height: 21px;" /> is actually <img alt="S(\hat{y}(x^{(i})))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/915144b3b0a3b41ff5d71f88e798c702386cfea8.png" style="height: 21px;" /> where <img alt="\hat{y}(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n" class="valign-m4" src="https://eli.thegreenplace.net/images/math/7ad144258d3d91e1ada8fd7f94a7d0b0538faa2d.png" style="height: 18px;" />, we can apply the chain rule:</p> <img alt="$\frac{\partial S(x^{(i)})}{\partial \theta_j}=S(x^{(i)})(1-S(x^{(i)}))x^{(i)}_j$" class="align-center" 
src="https://eli.thegreenplace.net/images/math/6bb3f809b570699a74428c137ea715d97b08b58d.png" style="height: 47px;" /> <p>Substituting back into <img alt="\frac{\partial C_{+1}}{\partial \theta_j}" class="valign-m10" src="https://eli.thegreenplace.net/images/math/cc23ed0ff22b532e2ab3fec04117c8c968318629.png" style="height: 29px;" />, we get:</p> <img alt="\begin{align*} \frac{\partial C_{+1}}{\partial \theta_j} &amp;amp;= \frac{-1}{S(x^{(i)})}S(x^{(i)})(1-S(x^{(i)}))x^{(i)}_j \\ &amp;amp;= (S(x^{(i)})-1)x^{(i)}_j \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/dc2a24be9f61f7066fbaeb48805bb59c51e445c0.png" style="height: 76px;" /> <p>Similarly, for <img alt="C_{-1}=-log(1-S(x^{(i)}))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/91460ff9e118d56fbbce2f0557bb9208a4d438a4.png" style="height: 21px;" /> we can compute:</p> <img alt="$\frac{\partial C_{-1}}{\partial \theta_j} = S(x^{(i)})x^{(i)}_j$" class="align-center" src="https://eli.thegreenplace.net/images/math/5e5fd85290c89289aacb1486d9f706bd9fca8fdc.png" style="height: 42px;" /> <p>Putting it all together, we find that the contribution of <img alt="x^{(i)}" class="valign-0" src="https://eli.thegreenplace.net/images/math/233014006c0adbee71ec71ba3a70f22ad1b906a1.png" style="height: 17px;" /> to the gradient of <img alt="\theta_j" class="valign-m6" src="https://eli.thegreenplace.net/images/math/56adcea6f10a3cd4a439536412c7fb690f803bc9.png" style="height: 18px;" /> is:</p> <img alt="$\frac{\partial C(x^{(i)})}{\partial \theta_j}=\left\{ \begin{matrix} (S(x^{(i)})-1)x^{(i)}_j &amp;amp; \operatorname{if}\ y^{(i)}=+1 \\ S(x^{(i)})x^{(i)}_j &amp;amp; \operatorname{if}\ y^{(i)}=-1 \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/21cf7ba3c128242b99272a3e47b5ab5c09cb24bf.png" style="height: 56px;" /> <p>Using these formulae, we can train a binary logistic classifier for MNIST that gives us a probability of some input image being a 4, rather 
than a yes/no answer. The <a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/logistic-regression/mnist_binary_classifier.py">binary MNIST code sample</a> trains either a binary or a logistic classifier using a lot of shared infrastructure.</p> <p>The probability gives us more information than just a yes/no answer. Consider, for example, the following image from the MNIST database. When I trained a binary classifier with hinge loss to recognize the digit 4 for 1200 steps, it wrongly predicted that this image (actually a 9) is a 4:</p> <img alt="Image of a 9 from MNIST" class="align-center" src="https://eli.thegreenplace.net/images/2016/mnist-test-9740.png" /> <p>The model clearly made a mistake here, but can we know <em>how</em> wrong it was? It would be hard to know with a binary classifier that only gives us a yes/no answer. However, when I run a logistic regression model on the same image, it tells me it is 53% confident this is a 4. Since our cutoff for yes/no is 50%, this is quite close to the threshold, so I'd say the model didn't make a huge mistake here.
There are many ways to do this; here I'll focus on two: one-vs-all classification and softmax.</p> </div> <div class="section" id="one-vs-all-classification"> <h2>One-vs-all classification</h2> <p>The One-vs-all (OvA), also known as one-vs-rest (OvR) approach is a natural extension of binary classification:</p> <ol class="arabic simple"> <li>For each class <img alt="t\in[0..T-1]" class="valign-m5" src="https://eli.thegreenplace.net/images/math/6eda0dcb5f9805e0e0e4c3d0af82aacdf1295efd.png" style="height: 18px;" /> we train a logistic classifier where we set <em>t</em> as the &quot;correct&quot; answer, and the other classes as the &quot;incorrect&quot; answers (+1 and -1 respectively).</li> <li>The result of each such classifier is the probability that an input sample belongs to class <em>t</em>.</li> <li>Given a new input, we run all <em>T</em> classifiers on it and the one that gives us the highest probability is chosen as the true class of the input.</li> </ol> <p>As a completely synthetic example to make this clearer, suppose that <em>T=3</em>. We take the training data and train 3 logistic regressions. In the first - <img alt="C_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/33e4cbb170d6026eb67de894c0d01e8702fb065d.png" style="height: 15px;" />, we set 0 as the right answer, 1 and 2 as the wrong answers. In the second - <img alt="C_1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/c538a6221da718dd38230dcbb6e1a8fb40561f7a.png" style="height: 16px;" /> we set 1 as the right answer, 0 and 2 as the wrong answers. 
Finally in the third - <img alt="C_2" class="valign-m3" src="https://eli.thegreenplace.net/images/math/e65b6ebf7cbd7ef19069cc4837331af9d119cfe6.png" style="height: 15px;" /> we set 2 as the right answer, 0 and 1 the wrong answers.</p> <p>Now, given a new input vector <strong>x</strong> we run <img alt="C_0(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/5ed83eb3961cbf4855ce46814719658cdc79e5f2.png" style="height: 18px;" />, <img alt="C_1(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/47b77ff17810fb0a0d4f6b86f50d403e8a59a7a7.png" style="height: 18px;" /> and <img alt="C_2(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/c063fddc4bcdd77e1131dc70ec5b578b5ec887ef.png" style="height: 18px;" />. Each of these gives us the probability of <strong>x</strong> belonging to the respective class. If we put all the classifiers in a vector, we get:</p> <img alt="$C(x)=[C_0(x), C_1(x), C_2(x)]$" class="align-center" src="https://eli.thegreenplace.net/images/math/9e4c42a11867dda976b1f7b1ac6aaa46b6625ee9.png" style="height: 19px;" /> <p>We pick the class where the probability is highest. Mathematically, we can use the <a class="reference external" href="https://en.wikipedia.org/wiki/Arg_max">argmax function</a> for this purpose. <em>argmax</em> returns the index of the maximal element in the given vector. For example, given:</p> <img alt="$C(x)=[0.45, 0.42, 0.09]$" class="align-center" src="https://eli.thegreenplace.net/images/math/983ab6a6770f41c06b3eb32f811678aab7f6fb5b.png" style="height: 19px;" /> <p>We get:</p> <img alt="$\underset{t \in [0..2]}{argmax}(C(x))=0$" class="align-center" src="https://eli.thegreenplace.net/images/math/1fa779c771c2d0abcaca9a759ab2e99608842f82.png" style="height: 34px;" /> <p>Therefore, the chosen class is 0. These class/index numbers are just labels of course. 
They can stand for anything depending on the problem domain: medical condition names, digits and so on.</p> <p>This approach doesn't require any additional math over what we've already covered in this post. <a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/logistic-regression/mnist_multinomial_classifier.py">This multinomial MNIST classifier code sample</a> implements it. The error rate it achieves is ~11%, similar to what <a class="reference external" href="http://yann.lecun.com/exdb/publis/index.html#lecun-98">LeCun's 1998 paper</a> achieved with a simple linear classifier. Much better than 11% can be done for MNIST, even with a single-layer linear model. However, my model is very far from the state of the art - there's no preprocessing, no artificially-enlarged training set, no adaptive learning rate; I didn't even spend time tuning the hyperparameters (regularization type and constants, learning rate, batch size, etc.). The goal here was just to demonstrate the basics of logistic regression, not to compete for the state of the art in MNIST.</p> </div> <div class="section" id="softmax"> <h2>Softmax</h2> <p>An alternative to OvA is to use the softmax function.
I covered softmax <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/">in some detail</a> previously; just briefly, softmax is a function <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1dd52a52398e38c9549b289449de49ba5fbb98b7.svg" style="height: 19px;" type="image/svg+xml">S(\mathbf{a}):\mathbb{R}^{N}\rightarrow \mathbb{R}^{N}</object> such that:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/5470218612381816a8c9a897d43201757560e646.svg" style="height: 46px;" type="image/svg+xml"> $S_j=\frac{e^{a_j}}{\sum_{k=1}^{N}e^{a_k}} \qquad \forall j \in 1..N$</object> <p>It is very useful for multiclass classification, since it lets us generate probabilities of the input belonging to one of <em>N</em> classes. Similarly to the OvA case, here we have to train 10 different parameter vectors, one for each digit. However, unlike OvA, this training doesn't happen separately but occurs at the same time. Instead of training a model to find a single parameter vector each time, we train a parameter <em>matrix</em> once.</p> <p>The model structure is as follows:</p> <img alt="Model of softmax logistic regression" class="align-center" src="https://eli.thegreenplace.net/images/2016/softmax-logistic-model.png" /> <p>I've chosen the number of classes to be 10 to reflect MNIST where we have 10 possible digits to assign to every input. In MNIST <em>N</em> is 785 (784 for each of 28x28 pixels in the image, plus one for bias). &quot;Logits&quot; is a common name to assign to the output of a fully connected layer (which is what we have with the matrix-vector multiplication in the first stage); the logits are arbitrary real numbers. The softmax function is responsible for squeezing them into the range of probabilities (0, 1) and making sure they all add up to 1.</p> <p>This diagram shows what happens to a single input as it goes through the model.
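The single-input flow in the diagram can be sketched in a few lines of NumPy. This is a toy version: the parameter matrix below is random, standing in for a trained one, and only the shapes and the softmax stage match the diagram.

```python
import numpy as np

def softmax(z):
    # Subtracting the max is a standard numerical-stability trick;
    # it doesn't change the result.
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)
N, T = 785, 10                           # 784 pixels + bias input; 10 classes
W = rng.normal(scale=0.01, size=(T, N))  # stand-in for a trained matrix
x = rng.normal(size=N)                   # one (fake) input vector

logits = W.dot(x)          # fully connected layer: arbitrary reals
probs = softmax(logits)    # probabilities in (0, 1), summing to 1
predicted_class = int(np.argmax(probs))
```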
In a realistic program, there will be another dimension - the batch dimension, used to vectorize the computation over a whole batch of inputs.</p> <p>For training this model, we need a loss function. It turns out cross-entropy is a very popular loss function to use for softmax. In the <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/">softmax post</a> I also covered how to compute the gradient of cross-entropy on a softmax, so we're all set to write some code: the <a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/logistic-regression/mnist_softmax_classifier.py">full sample is here</a>. Running it on MNIST for a couple of minutes produces a 9.5% error rate - slightly better than the OvA approach, but very close. This is to be expected, since OvA and softmax compute very similar results (finding the maximal probability from a set of probabilities), just in a different way. Softmax regression is much faster, however, since we can vectorize the training for all 10 digits in the same run.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>In this post I'm following many of the conventions established in my post on <a class="reference external" href="http://eli.thegreenplace.net/2016/linear-regression/">linear regression</a>. 
In particular, by construction <img alt="x_0=1" class="valign-m3" src="https://eli.thegreenplace.net/images/math/0c1d7f319728a07a57d000f2379b5215e4130147.png" style="height: 15px;" /> so that <img alt="\theta_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/ba6201ddbe2fd0bb66e0704ad8b3c6bdb36f37aa.png" style="height: 15px;" /> is the bias.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id12" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>Why? Because we have the bias as part of the model, so any constant offset can be absorbed into the learned bias.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id13" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>Note that this outcome is, once again, somewhat arbitrary. We could find another plane that intersects the x/y axis on the same line, and get a different classification. For example, if we flip the sign of all the elements of <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />, we get the same intersection line. In that case, however, values &quot;to the right&quot; of the line give us <img alt="\hat{y}(x) &amp;lt; 0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d8a7e77c45cecd8e4ba7c8f7d1f02944e9b55ecf.png" style="height: 18px;" />. Since the labels we attach are arbitrary, this really makes no difference. 
The only important thing is that we find a line that separates &quot;true&quot; from &quot;false&quot; samples, and that we remain consistent with our signs and labels throughout the process.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id14" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>Note that both the loss and the regularization are called <img alt="L_2" class="valign-m3" src="https://eli.thegreenplace.net/images/math/0d2398f5890edff3f40f1686fc3b51528209bf9b.png" style="height: 15px;" />. This is a bit confusing, but both are essentially 2-norms. It's best to ignore the name of the regularization factor and just refer to it as &quot;regularization&quot;. I thought it important to mention up front, though, because other kinds of regularization are used for machine-learning algorithms and I wanted to make it clear which one is being used here.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id15" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id5"></a></td><td>As an exercise, play with the code to increase or decrease the number of outliers (the code makes it easily controllable), and observe the effects on the misclassification rates of the different loss functions.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id16" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id6"></a></td><td>Note that using the logistic function on the model's output is strictly a generalization of the binary classifier.
We can still make a binary interpretation of the result if we're so inclined, interpreting <img alt="S(z) \geq 0.5" class="valign-m4" src="https://eli.thegreenplace.net/images/math/763035b41ff594d664c57d9fcc03c85808d0ccce.png" style="height: 18px;" /> as &quot;yes&quot; and otherwise as &quot;no&quot;. In terms of the input to <img alt="S(z)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/61bc9efb9d2c99669df519617ee7daee7670e156.png" style="height: 18px;" />, this means &quot;yes&quot; for <img alt="z=\hat{y}(x) \geq 0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2599fe02308a2d43e5b29b2f9387ee45c5c67a1b.png" style="height: 18px;" /> which is exactly the formulation we've been using for the binary classifier.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id17" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id7"></a></td><td>In essence, cross entropy is computed between two probability distributions. Here, one of them is the &quot;real&quot; distribution observed in the <em>y</em> data. The other is what we predict given <em>X</em> data and our regression parameters <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />. The observed real probability is either 0 or 1 for any given data item, and the corresponding predicted probability is our model's output. 
I also discussed cross-entropy in the <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/">post about softmax</a>.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id18" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id8"></a></td><td>Many resources online condense this formula to a single line without the condition: <img alt="C(x)=-ylog(S(x))-(1-y)log(1-S(x))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0f9309397d7ef59c72cdb2d861e5532292978ca6.png" style="height: 18px;" />. I'm avoiding this formulation on purpose, because it requires the possible values of <em>y</em> to be 0 and 1, not -1 and +1. Although it's possible to play with constants a bit to reformulate the -1/+1 case in a similarly condensed fashion, I find the version with the condition more explicit and thus easier to follow, even if it requires a bit more typing.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id19" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id9"></a></td><td>See also <a class="reference external" href="http://eli.thegreenplace.net/2016/the-chain-rule-of-calculus/">my post</a> about the chain rule, where this derivation is shown.</td></tr> </tbody> </table> </div> The Softmax function and its derivative (2016-10-18, by Eli Bendersky)<p>The softmax function takes an N-dimensional vector of arbitrary real values and produces another N-dimensional vector with real values in the range (0, 1) that add up to 1.0. It maps <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1dd52a52398e38c9549b289449de49ba5fbb98b7.svg" style="height: 19px;" type="image/svg+xml">S(\mathbf{a}):\mathbb{R}^{N}\rightarrow \mathbb{R}^{N}</object>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/cd593d87595e496072aebf5100dd87c37c889f25.svg" style="height: 86px;" type="image/svg+xml"> \[S(\mathbf{a}):\begin{bmatrix} a_1\\ a_2\\ \cdots\\ a_N \end{bmatrix} \rightarrow \begin{bmatrix} S_1\\ S_2\\ \cdots\\ S_N \end{bmatrix}</object> <p>And the actual per-element formula is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/5470218612381816a8c9a897d43201757560e646.svg" style="height: 46px;" type="image/svg+xml"> $S_j=\frac{e^{a_j}}{\sum_{k=1}^{N}e^{a_k}} \qquad \forall j \in 1..N$</object> <p>It's easy to see that <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/cb8b5683be866b4c177c0c319e14085f25bec523.svg" style="height: 18px;" type="image/svg+xml">S_j</object> is always positive (because of the exponents); moreover, since the numerator appears in the denominator summed up with some other positive numbers, <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/5a34de9dd188a5a6f758bb0f7daabb58e03045ec.svg" style="height: 18px;"
type="image/svg+xml">S_j&lt;1</object>. Therefore, it's in the range (0, 1).</p> <p>For example, the 3-element vector <tt class="docutils literal">[1.0, 2.0, 3.0]</tt> gets transformed into <tt class="docutils literal">[0.09, 0.24, 0.67]</tt>. The order of elements by relative size is preserved, and they add up to 1.0. Let's tweak this vector slightly into <tt class="docutils literal">[1.0, 2.0, 5.0]</tt>. We get the output <tt class="docutils literal">[0.02, 0.05, 0.93]</tt>, which still preserves these properties. Note that as the last element is farther away from the first two, its softmax value dominates the overall slice of size 1.0 in the output. Intuitively, the softmax function is a &quot;soft&quot; version of the maximum function. Instead of just selecting one maximal element, softmax breaks the vector up into parts of a whole (1.0), with the maximal input element getting a proportionally larger chunk, but the other elements getting some of it as well <a class="footnote-reference" href="#id3" id="id1"></a>.</p> <div class="section" id="probabilistic-interpretation"> <h2>Probabilistic interpretation</h2> <p>The properties of softmax (all output values are in the range (0, 1) and sum up to 1.0) make it suitable for a probabilistic interpretation that's very useful in machine learning.
In particular, in multiclass classification tasks, we often want to assign probabilities that our input belongs to one of a set of output classes.</p> <p>If we have N output classes, we're looking for an N-vector of probabilities that sum up to 1; sounds familiar?</p> <p>We can interpret softmax as follows:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/4510f717b770547b90526c714355f4c81d1b4a50.svg" style="height: 19px;" type="image/svg+xml"> $S_j=P(y=j|a)$</object> <p>Where <em>y</em> is the output class numbered <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/310debdf2f7fe03ad7888e95000c78a0efae5500.svg" style="height: 13px;" type="image/svg+xml">1..N</object>. <em>a</em> is any N-vector. The most basic example is <a class="reference external" href="http://eli.thegreenplace.net/2016/logistic-regression/">multiclass logistic regression</a>, where an input vector <em>x</em> is multiplied by a weight matrix <em>W</em>, and the result of this dot product is fed into a softmax function to produce probabilities. This architecture is explored in detail later in the post.</p> <p>It turns out that - from a probabilistic point of view - softmax is optimal for <a class="reference external" href="https://en.wikipedia.org/wiki/Maximum_likelihood_estimation">maximum-likelihood estimation</a> of the model's parameters. This is beyond the scope of this post, though. See chapter 5 of the <a class="reference external" href="http://www.deeplearningbook.org/">&quot;Deep Learning&quot; book</a> for more details.</p> </div> <div class="section" id="some-preliminaries-from-vector-calculus"> <h2>Some preliminaries from vector calculus</h2> <p>Before diving into computing the derivative of softmax, let's start with some preliminaries from vector calculus.</p> <p>Softmax is fundamentally a vector function. It takes a vector as input and produces a vector as output; in other words, it has multiple inputs and multiple outputs. 
Therefore, we cannot just ask for &quot;the derivative of softmax&quot;; we should instead specify:</p> <ol class="arabic simple"> <li>Which component (output element) of softmax we're seeking to find the derivative of.</li> <li>Since softmax has multiple inputs, with respect to which input element the partial derivative is computed.</li> </ol> <p>If this sounds complicated, don't worry. This is exactly why the notation of vector calculus was developed. What we're looking for are the partial derivatives:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2eae0a040f9eb82a2cf0a596c926aca49a3cdb66.svg" style="height: 42px;" type="image/svg+xml"> $\frac{\partial S_i}{\partial a_j}$</object> <p>This is the partial derivative of the i-th output w.r.t. the j-th input. A shorter way to write it that we'll be using going forward is: <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/ca95d97dc85a733a280ccaab680d01727376e383.svg" style="height: 18px;" type="image/svg+xml">D_{j}S_i</object>.</p> <p>Since softmax is a <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/91b745aec8f7c3a5501975b040a4aef477c31412.svg" style="height: 16px;" type="image/svg+xml">\mathbb{R}^{N}\rightarrow \mathbb{R}^{N}</object> function, the most general derivative we compute for it is the Jacobian matrix:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7af5ba48ed18f62f0fa31b60ba35e8e94054931c.svg" style="height: 76px;" type="image/svg+xml"> $DS=\begin{bmatrix} D_1 S_1 &amp; \cdots &amp; D_N S_1 \\ \vdots &amp; \ddots &amp; \vdots \\ D_1 S_N &amp; \cdots &amp; D_N S_N \end{bmatrix}$</object> <p>In ML literature, the term &quot;gradient&quot; is commonly used to stand in for the derivative.
Strictly speaking, gradients are only defined for scalar functions (such as loss functions in ML); for vector functions like softmax it's imprecise to talk about a &quot;gradient&quot;. The Jacobian is the fully general derivative of a vector function, but in most places I'll just be saying &quot;derivative&quot;.</p> </div> <div class="section" id="derivative-of-softmax"> <h2>Derivative of softmax</h2> <p>Let's compute <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/166e309484516e7fea86d27f36f42639ab73b471.svg" style="height: 18px;" type="image/svg+xml">D_j S_i</object> for arbitrary <em>i</em> and <em>j</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/dbee1c4ac839a1eef7202f447f754341eec98904.svg" style="height: 53px;" type="image/svg+xml"> $D_j S_i=\frac{\partial S_i}{\partial a_j}= \frac{\partial \frac{e^{a_i}}{\sum_{k=1}^{N}e^{a_k}}}{\partial a_j}$</object> <p>We'll be using the quotient rule of derivatives. For <object class="valign-m9" data="https://eli.thegreenplace.net/images/math/25ee22368ab19a6e8608ac7417cf62e235794e54.svg" style="height: 29px;" type="image/svg+xml">f(x) = \frac{g(x)}{h(x)}</object>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/c0fd805caf8b7d8336e8c52f2759b3ce73295315.svg" style="height: 43px;" type="image/svg+xml"> $f&#x27;(x) = \frac{g&#x27;(x)h(x) - h&#x27;(x)g(x)}{[h(x)]^2}$</object> <p>In our case, we have:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/167b7392a9d51fbc4016901d48995f091f627e3a.svg" style="height: 82px;" type="image/svg+xml"> \begin{align*} g_i&amp;=e^{a_i} \\ h_i&amp;=\sum_{k=1}^{N}e^{a_k} \end{align*}</object> <p>Note that no matter which <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/c2d2e987a5cb0df2f497d2dba0da0960fb6fbcc0.svg" style="height: 14px;" type="image/svg+xml">a_j</object> we compute the derivative of <object class="valign-m3"
data="https://eli.thegreenplace.net/images/math/969951984c96d748d949ee5e5322f4c2dbb75087.svg" style="height: 16px;" type="image/svg+xml">h_i</object> for, the answer will always be <object class="valign-0" data="https://eli.thegreenplace.net/images/math/a4c5fca09246e4e7c55473070976f788e032c514.svg" style="height: 12px;" type="image/svg+xml">e^{a_j}</object>. This is not the case for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d141c63d6e5b4ff91ec2936c9b320454461258a0.svg" style="height: 12px;" type="image/svg+xml">g_i</object>, however. The derivative of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d141c63d6e5b4ff91ec2936c9b320454461258a0.svg" style="height: 12px;" type="image/svg+xml">g_i</object> w.r.t. <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/c2d2e987a5cb0df2f497d2dba0da0960fb6fbcc0.svg" style="height: 14px;" type="image/svg+xml">a_j</object> is <object class="valign-0" data="https://eli.thegreenplace.net/images/math/a4c5fca09246e4e7c55473070976f788e032c514.svg" style="height: 12px;" type="image/svg+xml">e^{a_j}</object> only if <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8e4587fc82ce6377530643c5622b41e53cdf3dd3.svg" style="height: 16px;" type="image/svg+xml">i=j</object>, because only then <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d141c63d6e5b4ff91ec2936c9b320454461258a0.svg" style="height: 12px;" type="image/svg+xml">g_i</object> has <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/c2d2e987a5cb0df2f497d2dba0da0960fb6fbcc0.svg" style="height: 14px;" type="image/svg+xml">a_j</object> anywhere in it.
Otherwise, the derivative is 0.</p> <p>Going back to our <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/166e309484516e7fea86d27f36f42639ab73b471.svg" style="height: 18px;" type="image/svg+xml">D_j S_i</object>; we'll start with the <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8e4587fc82ce6377530643c5622b41e53cdf3dd3.svg" style="height: 16px;" type="image/svg+xml">i=j</object> case. Then, using the quotient rule we have:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/d7489693552878c00ad6788a0c8987416cbb0796.svg" style="height: 53px;" type="image/svg+xml"> $\frac{\partial \frac{e^{a_i}}{\sum_{k=1}^{N}e^{a_k}}}{\partial a_j}= \frac{{}e^{a_i}\Sigma-e^{a_j}e^{a_i}}{\Sigma^2}$</object> <p>For simplicity <object class="valign-0" data="https://eli.thegreenplace.net/images/math/cb5615b3fcee824f137c372e351ccca3ff3a3292.svg" style="height: 12px;" type="image/svg+xml">\Sigma</object> stands for <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/2c3662fbb97e3b5c528e8b1cdf89e108bfeed206.svg" style="height: 23px;" type="image/svg+xml">\sum_{k=1}^{N}e^{a_k}</object>. 
Reordering a bit:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2634d0ab6532983a88a1f55a33cf6a6719a291ee.svg" style="height: 123px;" type="image/svg+xml"> \begin{align*} \frac{\partial \frac{e^{a_i}}{\sum_{k=1}^{N}e^{a_k}}}{\partial a_j}&amp;= \frac{e^{a_i}\Sigma-e^{a_j}e^{a_i}}{\Sigma^2}\\ &amp;=\frac{e^{a_i}}{\Sigma}\frac{\Sigma - e^{a_j}}{\Sigma}\\ &amp;=S_i(1-S_j) \end{align*}</object> <p>The final formula expresses the derivative in terms of <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/3e218c43050832e5df45f69fb2c8b8a01f7f5a52.svg" style="height: 15px;" type="image/svg+xml">S_i</object> itself - a common trick when functions with exponents are involved.</p> <p>Similarly, we can do the <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/09eca402f8bc6311cca3a98625e29e75cc336d31.svg" style="height: 17px;" type="image/svg+xml">i\ne j</object> case:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/d788a4ff0e07827862aaf0ded5befbf1665d90cc.svg" style="height: 123px;" type="image/svg+xml"> \begin{align*} \frac{\partial \frac{e^{a_i}}{\sum_{k=1}^{N}e^{a_k}}}{\partial a_j}&amp;= \frac{0-e^{a_j}e^{a_i}}{\Sigma^2}\\ &amp;=-\frac{e^{a_j}}{\Sigma}\frac{e^{a_i}}{\Sigma}\\ &amp;=-S_j S_i \end{align*}</object> <p>To summarize:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/f776365373202f727625c0be825d55a2fde47882.svg" style="height: 43px;" type="image/svg+xml"> $D_j S_i=\left\{\begin{matrix} S_i(1-S_j) &amp; i=j\\ -S_j S_i &amp; i\ne j \end{matrix}\right$</object> <p>I like seeing this explicit breakdown by cases, but if anyone is taking more pride in being concise and clever than programmers, it's mathematicians. This is why you'll find various &quot;condensed&quot; formulations of the same equation in the literature. 
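</p>

<p>Before looking at those condensed forms, here's a quick numerical sanity check of the two-case formula - a short NumPy sketch (the helper name <tt class="docutils literal">softmax_jacobian</tt> is mine) that builds the full Jacobian and compares every entry against a central finite difference:</p>

```python
import numpy as np

def softmax(a):
    exps = np.exp(a - np.max(a))
    return exps / np.sum(exps)

def softmax_jacobian(a):
    """DS with entries S_i * (1 - S_j) on the diagonal (i = j)
    and -S_j * S_i off the diagonal (i != j)."""
    S = softmax(a)
    J = -np.outer(S, S)   # the i != j case, everywhere...
    J += np.diag(S)       # ...then the diagonal becomes S_i * (1 - S_i)
    return J

# Check every entry against a central finite difference.
a = np.array([1.0, 2.0, 3.0])
J = softmax_jacobian(a)
eps = 1e-6
for i in range(3):
    for j in range(3):
        da = np.zeros_like(a)
        da[j] = eps
        numeric = (softmax(a + da)[i] - softmax(a - da)[i]) / (2 * eps)
        assert abs(J[i, j] - numeric) < 1e-8
```

<p>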
One of the most common ones is using the Kronecker delta function:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/ff38cb90472289e31bd7f79c1c85c455d7962cbb.svg" style="height: 43px;" type="image/svg+xml"> $\delta_{ij}=\left\{\begin{matrix} 1 &amp; i=j\\ 0 &amp; i\ne j \end{matrix}\right$</object> <p>To write:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/6e4b626a68faabba991f9d1e83a12c74fcec0e63.svg" style="height: 19px;" type="image/svg+xml"> $D_j S_i = S_i (\delta_{ij}-S_j)$</object> <p>Which is, of course, the same thing. There are a couple of other formulations one sees in the literature:</p> <ol class="arabic simple"> <li>Using the matrix formulation of the Jacobian directly to replace <object class="valign-0" data="https://eli.thegreenplace.net/images/math/3a6a16552e246af497720ffdfe6091b42d2f8938.svg" style="height: 12px;" type="image/svg+xml">\delta</object> with <object class="valign-0" data="https://eli.thegreenplace.net/images/math/ca73ab65568cd125c2d27a22bbd9e863c10b675d.svg" style="height: 12px;" type="image/svg+xml">I</object> - the identity matrix, whose elements express <object class="valign-0" data="https://eli.thegreenplace.net/images/math/3a6a16552e246af497720ffdfe6091b42d2f8938.svg" style="height: 12px;" type="image/svg+xml">\delta</object> in matrix form.</li> <li>Using &quot;1&quot; as the function name instead of the Kronecker delta, as follows: <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/a4fa3293a004c9dc1f5171ddb590ac9cb7178102.svg" style="height: 20px;" type="image/svg+xml">D_j S_i = S_i (1(i=j)-S_j)</object>.
Here <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/d9e260212cd116b69ffa42e9c9f824b2bcf6a217.svg" style="height: 18px;" type="image/svg+xml">1(i=j)</object> means the value 1 when <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/8e4587fc82ce6377530643c5622b41e53cdf3dd3.svg" style="height: 16px;" type="image/svg+xml">i=j</object> and the value 0 otherwise.</li> </ol> <p>The condensed notation comes in handy when we want to compute more complex derivatives that depend on the softmax derivative; otherwise we'd have to propagate the condition everywhere.</p> </div> <div class="section" id="computing-softmax-and-numerical-stability"> <h2>Computing softmax and numerical stability</h2> <p>A simple way of computing the softmax function on a given vector in Python is:</p> <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>

<span class="k">def</span> <span class="nf">softmax</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;Compute the softmax of vector x.&quot;&quot;&quot;</span>
    <span class="n">exps</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">exps</span> <span class="o">/</span> <span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">exps</span><span class="p">)</span>
</pre></div> <p>Let's try it with the sample 3-element vector we've used as an example earlier:</p> <div class="highlight"><pre><span></span>In : softmax([1, 2, 3])
Out: array([ 0.09003057, 0.24472847, 0.66524096])
</pre></div> <p>However, if we run this function with larger numbers (or large negative numbers) we have a problem:</p> <div class="highlight"><pre><span></span>In : softmax([1000, 2000, 3000])
Out: array([ nan, nan, nan])
</pre></div> <p>The numerical range
of the floating-point numbers used by Numpy is limited. For <tt class="docutils literal">float64</tt>, the maximal representable number is on the order of <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/91d9772e2d01d53580c14ba9801ea3303f45cac7.svg" style="height: 16px;" type="image/svg+xml">10^{308}</object>. Exponentiation in the softmax function makes it possible to easily overshoot this number, even for fairly modest-sized inputs.</p> <p>A nice way to avoid this problem is by normalizing the inputs to be not too large or too small, by observing that we can use an arbitrary constant <em>C</em> as follows:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/21c627f153906b6de2c2723f4a20629a610945ba.svg" style="height: 46px;" type="image/svg+xml"> $S_j=\frac{e^{a_j}}{\sum_{k=1}^{N}e^{a_k}}=\frac{Ce^{a_j}}{\sum_{k=1}^{N}Ce^{a_k}}$</object> <p>And then pushing the constant into the exponent, we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/c5b631159b49e84338269e0943e00da2fb7f5d21.svg" style="height: 51px;" type="image/svg+xml"> $S_j=\frac{e^{a_j+log(C)}}{\sum_{k=1}^{N}e^{a_k+log(C)}}$</object> <p>Since <em>C</em> is just an arbitrary constant, we can instead write:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7ae51c811f1348f4762e3eee1a3cc9e8aad1890c.svg" style="height: 49px;" type="image/svg+xml"> $S_j=\frac{e^{a_j+D}}{\sum_{k=1}^{N}e^{a_k+D}}$</object> <p>Where <em>D</em> is also an arbitrary constant. This formula is equivalent to the original <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/cb8b5683be866b4c177c0c319e14085f25bec523.svg" style="height: 18px;" type="image/svg+xml">S_j</object> for any <em>D</em>, so we're free to choose a <em>D</em> that will make our computation better numerically. 
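</p>

<p>Before picking a particular <em>D</em>, it's easy to confirm this shift-invariance numerically with the naive formula, for shifts small enough to avoid overflow:</p>

```python
import numpy as np

def softmax(x):
    exps = np.exp(x)
    return exps / np.sum(exps)

# Adding any constant D to all inputs leaves the output unchanged,
# as long as we stay far from overflow.
a = np.array([1.0, 2.0, 3.0])
for D in (-3.0, 0.5, 10.0):
    assert np.allclose(softmax(a), softmax(a + D))
```

<p>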
A good choice is the maximum between all inputs, negated:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/0433b741304b0b54a6e11be1602b63d4b6326e98.svg" style="height: 18px;" type="image/svg+xml"> $D=-max(a_1, a_2, \cdots, a_N)$</object> <p>This will shift the inputs to a range close to zero, assuming the inputs themselves are not too far from each other. Crucially, it shifts them all to be negative (except the maximal <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/c2d2e987a5cb0df2f497d2dba0da0960fb6fbcc0.svg" style="height: 14px;" type="image/svg+xml">a_j</object> which turns into a zero). Negatives with large exponents &quot;saturate&quot; to zero rather than infinity, so we have a better chance of avoiding NaNs.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">stablesoftmax</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;Compute the softmax of vector x in a numerically stable way.&quot;&quot;&quot;</span>
    <span class="n">shiftx</span> <span class="o">=</span> <span class="n">x</span> <span class="o">-</span> <span class="n">np</span><span class="o">.</span><span class="n">max</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
    <span class="n">exps</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="n">shiftx</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">exps</span> <span class="o">/</span> <span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">exps</span><span class="p">)</span>
</pre></div> <p>And now:</p> <div class="highlight"><pre><span></span>In : stablesoftmax([1000, 2000, 3000])
Out: array([ 0., 0., 1.])
</pre></div> <p>Note that this is still imperfect, since mathematically softmax
would never really produce a zero, but this is much better than NaNs, and since the distance between the inputs is very large it's expected to get a result extremely close to zero anyway.</p> </div> <div class="section" id="the-softmax-layer-and-its-derivative"> <h2>The softmax layer and its derivative</h2> <p>A common use of softmax appears in machine learning, in particular in logistic regression: the softmax &quot;layer&quot;, wherein we apply softmax to the output of a fully-connected layer (matrix multiplication):</p> <img alt="Generic softmax layer diagram" class="align-center" src="https://eli.thegreenplace.net/images/2016/softmax-layer-generic.png" /> <p>In this diagram, we have an input <em>x</em> with N features, and T possible output classes. The weight matrix <em>W</em> is used to transform <em>x</em> into a vector with T elements (called &quot;logits&quot; in ML folklore), and the softmax function is used to &quot;collapse&quot; the logits into a vector of probabilities denoting the probability of <em>x</em> belonging to each one of the T output classes.</p> <p>How do we compute the derivative of this &quot;softmax layer&quot; (fully-connected matrix multiplication followed by softmax)? Using the chain rule, of course! You'll find any number of derivations of this derivative online, but I want to approach it from first principles, by carefully applying the <a class="reference external" href="http://eli.thegreenplace.net/2016/the-chain-rule-of-calculus/">multivariate chain rule</a> to the Jacobians of the functions involved.</p> <p>An important point before we get started: you may think that <em>x</em> is a natural variable to compute the derivative for. But it's not. In fact, in machine learning we usually want to find the best weight matrix <em>W</em>, and thus it is <em>W</em> we want to update with every step of <a class="reference external" href="http://eli.thegreenplace.net/2016/understanding-gradient-descent">gradient descent</a>. 
Therefore, we'll be computing the derivative of this layer w.r.t. <em>W</em>.</p> <p>Let's start by rewriting this diagram as a composition of vector functions. First, we have the matrix multiplication, which we denote <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a0e38e0d2b015bcbf88c39139b08982ae8b9529d.svg" style="height: 18px;" type="image/svg+xml">g(W)</object>. It maps <object class="valign-m1" data="https://eli.thegreenplace.net/images/math/41cbe7438e5529bcab383579b09d611cd97f0444.svg" style="height: 16px;" type="image/svg+xml">\mathbb{R}^{NT}\rightarrow \mathbb{R}^{T}</object>, because the input (matrix <em>W</em>) has <em>N times T</em> elements, and the output has T elements.</p> <p>Next we have the softmax. If we denote the vector of logits as <object class="valign-0" data="https://eli.thegreenplace.net/images/math/b3931f1ce298c536432fd324b3a1ab4337120689.svg" style="height: 12px;" type="image/svg+xml">\lambda</object>, we have <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9af2279e6f8c350d3e301ff7ed97ff2d23d2b478.svg" style="height: 19px;" type="image/svg+xml">S(\lambda):\mathbb{R}^{T}\rightarrow \mathbb{R}^{T}</object>. 
Overall, we have the function composition:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/10e8a3123f66fe60ae76a3fe83b2a9b73ea3fa57.svg" style="height: 45px;" type="image/svg+xml"> \begin{align*} P(W)&amp;=S(g(W)) \\ &amp;=(S\circ g)(W) \end{align*}</object> <p>By applying the multivariate chain rule, the Jacobian of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/f6dd867bfc20ac609f598f54ed834172e0985b0b.svg" style="height: 18px;" type="image/svg+xml">P(W)</object> is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/80f6a3c715eb405a68968e4c579d3a2b562cfab0.svg" style="height: 18px;" type="image/svg+xml"> $DP(W)=D(S\circ g)(W)=DS(g(W))\cdot Dg(W)$</object> <p>We've computed the Jacobian of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/7f3a73c41d966d0cade30c5b1fadd35290358a15.svg" style="height: 18px;" type="image/svg+xml">S(a)</object> earlier in this post; what's remaining is the Jacobian of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/a0e38e0d2b015bcbf88c39139b08982ae8b9529d.svg" style="height: 18px;" type="image/svg+xml">g(W)</object>. Since <em>g</em> is a very simple function, computing its Jacobian is easy; the only complication is dealing with the indices correctly. We have to keep track of which weight each derivative is for. 
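</p>

<p>Before deriving <em>Dg</em>, here is the forward computation of the whole layer in code - a minimal sketch with assumed sizes (T = 2 classes, N = 3 features; the names mirror the diagram, not code from elsewhere in the post):</p>

```python
import numpy as np

def softmax(a):
    exps = np.exp(a - np.max(a))  # stable variant
    return exps / np.sum(exps)

def softmax_layer(W, x):
    """Fully-connected layer followed by softmax: P(W) = S(g(W))."""
    logits = W.dot(x)        # g(W): a vector with T elements
    return softmax(logits)   # probabilities over the T classes

# Hypothetical sizes: T = 2 classes, N = 3 input features.
W = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6]])
x = np.array([1.0, -1.0, 2.0])
P = softmax_layer(W, x)
assert np.isclose(P.sum(), 1.0)
```

<p>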
Since <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/50dd3f482e6e8490b6b54b110c2b8e9018c6a607.svg" style="height: 19px;" type="image/svg+xml">g(W):\mathbb{R}^{NT}\rightarrow \mathbb{R}^{T}</object>, its Jacobian has <em>T</em> rows and <em>NT</em> columns:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/0d59698eb2307932fdb5a94b7f089da40688f368.svg" style="height: 76px;" type="image/svg+xml"> $Dg=\begin{bmatrix} D_1 g_1 &amp; \cdots &amp; D_{NT} g_1 \\ \vdots &amp; \ddots &amp; \vdots \\ D_1 g_T &amp; \cdots &amp; D_{NT} g_T \end{bmatrix}$</object> <p>In a sense, the weight matrix <em>W</em> is &quot;linearized&quot; to a vector of length <em>NT</em>. If you're familiar with the <a class="reference external" href="http://eli.thegreenplace.net/2015/memory-layout-of-multi-dimensional-arrays">memory layout of multi-dimensional arrays</a>, it should be easy to understand how it's done. In our case, one simple thing we can do is linearize it in row-major order, where the first row is consecutive, followed by the second row, etc. Mathematically, <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/14147644eaa95a20bf61a81af56045475f386a83.svg" style="height: 18px;" type="image/svg+xml">W_{ij}</object> will get column number <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ef7b2d987af3c0ceb75381d096c35e8c19085642.svg" style="height: 18px;" type="image/svg+xml">(i-1)N+j</object> in the Jacobian. 
To populate <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/38b655437da0880bd70168fcbadb50ebdbf46ca5.svg" style="height: 16px;" type="image/svg+xml">Dg</object>, let's recall what <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/434575851c19a9826fb6be1ca130ffa3243a2a34.svg" style="height: 12px;" type="image/svg+xml">g_1</object> is:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/64a7924d431e1a8e82f753f1f04943ddd619fedb.svg" style="height: 16px;" type="image/svg+xml"> $g_1=W_{11}x_1+W_{12}x_2+\cdots +W_{1N}x_N$</object> <p>Therefore:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2a64f2f7fdb74ca1e0e3bf86da7e9874e8855928.svg" style="height: 177px;" type="image/svg+xml"> \begin{align*} D_1g_1&amp;=x_1 \\ D_2g_1&amp;=x_2 \\ \cdots \\ D_Ng_1&amp;=x_N \\ D_{N+1}g_1&amp;=0 \\ \cdots \\ D_{NT}g_1&amp;=0 \end{align*}</object> <p>If we follow the same approach to compute <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/eeb76bb8cb07245435e01abcd03dec71f9c051df.svg" style="height: 12px;" type="image/svg+xml">g_2...g_T</object>, we'll get the Jacobian matrix:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/5b0d880f118ea950dd4c676a9aad2e481d83b0bf.svg" style="height: 76px;" type="image/svg+xml"> $Dg=\begin{bmatrix} x_1 &amp; x_2 &amp; \cdots &amp; x_N &amp; \cdots &amp; 0 &amp; 0 &amp; \cdots &amp; 0 \\ \vdots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \ddots &amp; \vdots \\ 0 &amp; 0 &amp; \cdots &amp; 0 &amp; \cdots &amp; x_1 &amp; x_2 &amp; \cdots &amp; x_N \end{bmatrix}$</object> <p>Looking at it differently, if we split the index of <em>W</em> to <em>i</em> and <em>j</em>, we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/3ca9791a8734377178476d2069bbb072b7e345ac.svg" style="height: 44px;" type="image/svg+xml"> 
\begin{align*} D_{ij}g_t&amp;=\frac{\partial(W_{t1}x_1+W_{t2}x_2+\cdots+W_{tN}x_N)}{\partial W_{ij}} &amp;= \left\{\begin{matrix} x_j &amp; i = t\\ 0 &amp; i \ne t \end{matrix}\right. \end{align*}</object> <p>This goes into row <em>t</em>, column <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ef7b2d987af3c0ceb75381d096c35e8c19085642.svg" style="height: 18px;" type="image/svg+xml">(i-1)N+j</object> in the Jacobian matrix.</p> <p>Finally, to compute the full Jacobian of the softmax layer, we just do a dot product between <object class="valign-0" data="https://eli.thegreenplace.net/images/math/2ee0d2dca289c3eb54f4cc5e98db8d63e9b0794b.svg" style="height: 12px;" type="image/svg+xml">DS</object> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/38b655437da0880bd70168fcbadb50ebdbf46ca5.svg" style="height: 16px;" type="image/svg+xml">Dg</object>. Note that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/be12618361f03651d2f459ce0fa3ac82aad3b766.svg" style="height: 19px;" type="image/svg+xml">P(W):\mathbb{R}^{NT}\rightarrow \mathbb{R}^{T}</object>, so the Jacobian dimensions work out. Since <object class="valign-0" data="https://eli.thegreenplace.net/images/math/2ee0d2dca289c3eb54f4cc5e98db8d63e9b0794b.svg" style="height: 12px;" type="image/svg+xml">DS</object> is <em>TxT</em> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/38b655437da0880bd70168fcbadb50ebdbf46ca5.svg" style="height: 16px;" type="image/svg+xml">Dg</object> is <em>TxNT</em>, their dot product <object class="valign-0" data="https://eli.thegreenplace.net/images/math/9f2059fa4172536236c9acfa22a911f918547e55.svg" style="height: 12px;" type="image/svg+xml">DP</object> is <em>TxNT</em>.</p> <p>In the literature you'll see a much shortened derivation of the derivative of the softmax layer. That's fine, since the two functions involved are simple and well known.
If we carefully compute a dot product between a row in <object class="valign-0" data="https://eli.thegreenplace.net/images/math/2ee0d2dca289c3eb54f4cc5e98db8d63e9b0794b.svg" style="height: 12px;" type="image/svg+xml">DS</object> and a column in <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/38b655437da0880bd70168fcbadb50ebdbf46ca5.svg" style="height: 16px;" type="image/svg+xml">Dg</object>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/699151b941880c8adf5d363048a97c6731482ed6.svg" style="height: 54px;" type="image/svg+xml"> $D_{ij}P_t=\sum_{k=1}^{T}D_kS_t\cdot D_{ij}g_k$</object> <p><object class="valign-m4" data="https://eli.thegreenplace.net/images/math/38b655437da0880bd70168fcbadb50ebdbf46ca5.svg" style="height: 16px;" type="image/svg+xml">Dg</object> is mostly zeros, so the end result is simpler. The only <em>k</em> for which <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/fca24bbbbf8cac80ccc0253802b13d2749770585.svg" style="height: 18px;" type="image/svg+xml">D_{ij}g_k</object> is nonzero is when <object class="valign-0" data="https://eli.thegreenplace.net/images/math/f4b7e42a4b8c52f40eb9458e68e81c74d70c1c61.svg" style="height: 13px;" type="image/svg+xml">i=k</object>; then it's equal to <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/73058e43db0f4edc791b10f27f913cbc5d361ab6.svg" style="height: 14px;" type="image/svg+xml">x_j</object>. Therefore:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/7f5cbb15243987230b4fa5741769938a78c9c2f2.svg" style="height: 44px;" type="image/svg+xml"> \begin{align*} D_{ij}P_t&amp;=D_iS_tx_j \\ &amp;=S_t(\delta_{ti}-S_i)x_j \end{align*}</object> <p>So it's entirely possible to compute the derivative of the softmax layer without actual Jacobian matrix multiplication; and that's good, because matrix multiplication is expensive! 
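</p>

<p>Here's a sketch (with small assumed sizes: T = 3 classes, N = 2 features) checking this closed form entry by entry against a finite-difference derivative of the whole layer w.r.t. each weight:</p>

```python
import numpy as np

def softmax(a):
    exps = np.exp(a - np.max(a))
    return exps / np.sum(exps)

# Hypothetical sizes: T = 3 classes, N = 2 features.
W = np.array([[0.1, -0.3],
              [0.8,  0.2],
              [-0.5, 0.7]])
x = np.array([1.5, -2.0])
S = softmax(W.dot(x))

eps = 1e-6
for t in range(3):          # output index
    for i in range(3):      # row of W
        for j in range(2):  # column of W
            # The closed form: S_t * (delta_ti - S_i) * x_j.
            closed = S[t] * ((1.0 if t == i else 0.0) - S[i]) * x[j]
            Wp = W.copy(); Wp[i, j] += eps
            Wm = W.copy(); Wm[i, j] -= eps
            numeric = (softmax(Wp.dot(x))[t] - softmax(Wm.dot(x))[t]) / (2 * eps)
            assert abs(closed - numeric) < 1e-7
```

<p>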
The reason we can avoid most computation is that the Jacobian of the fully-connected layer is <em>sparse</em>.</p> <p>That said, I still felt it's important to show how this derivative comes to life from first principles based on the composition of Jacobians for the functions involved. The advantage of this approach is that it works exactly the same for more complex compositions of functions, where the &quot;closed form&quot; of the derivative for each element is much harder to compute otherwise.</p> </div> <div class="section" id="softmax-and-cross-entropy-loss"> <h2>Softmax and cross-entropy loss</h2> <p>We've just seen how the softmax function is used as part of a machine learning network, and how to compute its derivative using the multivariate chain rule. While we're at it, it's worth taking a look at a loss function that's commonly used along with softmax for training a network: cross-entropy.</p> <p><a class="reference external" href="https://en.wikipedia.org/wiki/Cross_entropy">Cross-entropy</a> has an interesting probabilistic and information-theoretic interpretation, but here I'll just focus on the mechanics. For two discrete probability distributions <em>p</em> and <em>q</em>, the cross-entropy function is defined as:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/b26f68a12667ba254facf9815252f52ebf2238d9.svg" style="height: 38px;" type="image/svg+xml"> $xent(p,q)=-\sum_{k}p(k)log(q(k))$</object> <p>Where <em>k</em> goes over all the possible values of the random variable the distributions are defined for. Specifically, in our case there are <em>T</em> output classes, so <em>k</em> would go from 1 to <em>T</em>.</p> <p>We start from the softmax output <em>P</em> - this is one probability distribution <a class="footnote-reference" href="#id4" id="id2"></a>. The other probability distribution is the &quot;correct&quot; classification output, usually denoted by <em>Y</em>.
This is a one-hot encoded vector of size <em>T</em>, where all elements except one are 0.0, and one element is 1.0 - this element marks the correct class for the data being classified. Let's rephrase the cross-entropy loss formula for our domain:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/b02b400caa1de3f720f3c51b4891204a85a0d482.svg" style="height: 54px;" type="image/svg+xml"> $xent(Y, P)=-\sum_{k=1}^{T}Y(k)log(P(k))$</object> <p><em>k</em> goes over all the output classes. <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/1801d6549d7f256091d8d687062875facf870a80.svg" style="height: 18px;" type="image/svg+xml">P(k)</object> is the probability of the class as predicted by the model. <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/369b88be91e9aecb20f084f95946d171096ec2ad.svg" style="height: 18px;" type="image/svg+xml">Y(k)</object> is the &quot;true&quot; probability of the class as provided by the data. Let's mark the sole index where <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/bf2a1a90dbf5ee8f3e1240a2aff2b64220f3e876.svg" style="height: 18px;" type="image/svg+xml">Y(k)=1.0</object> by <em>y</em>. Since for all <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/e0e4ad3507e9dde8cc37658b436305ef9eb14ca0.svg" style="height: 17px;" type="image/svg+xml">k\ne y</object> we have <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/9d1a77958eb2fd853cb41001e41efcfa46a099d3.svg" style="height: 18px;" type="image/svg+xml">Y(k)=0</object>, the cross-entropy formula can be simplified to:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/ca79e575abc3ff07571f9b7bd9ee477c4cac1b7a.svg" style="height: 18px;" type="image/svg+xml"> $xent(Y, P)=-log(P(y))$</object> <p>Actually, let's make it a function of just <em>P</em>, treating <em>y</em> as a constant. 
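In code, the simplification is a one-liner; the probabilities and class index below are made up for illustration:

```python
import numpy as np

P = np.array([0.1, 0.7, 0.2])  # softmax output: a probability distribution
y = 1                          # index of the correct class
Y = np.zeros_like(P)
Y[y] = 1.0                     # one-hot "true" distribution

full = -np.sum(Y * np.log(P))  # -sum_k Y(k) * log(P(k))
simplified = -np.log(P[y])     # -log(P_y): all other terms vanish
assert np.isclose(full, simplified)
```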
Moreover, since in our case <em>P</em> is a vector, we can express <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/033e08901a43a52bb55ac6d36bcb0cebb8781a4e.svg" style="height: 18px;" type="image/svg+xml">P(y)</object> as the <em>y</em>-th element of <em>P</em>, or <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/12b5ad2733328bc7191f23d13e05c4e246bb8e26.svg" style="height: 18px;" type="image/svg+xml">P_y</object>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/e659a9fdd830a347c3aae214b31013eb52c59dc7.svg" style="height: 19px;" type="image/svg+xml"> $xent(P)=-log(P_y)$</object> <p>The Jacobian of <em>xent</em> is a <em>1xT</em> matrix (a row vector), since the output is a scalar and we have <em>T</em> inputs (the vector <em>P</em> has <em>T</em> elements):</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/2e515cd8235b0385a95e5cbfff5fbcca9a78c631.svg" style="height: 22px;" type="image/svg+xml"> $Dxent=\begin{bmatrix} D_1xent &amp; D_2xent &amp; \cdots &amp; D_Txent \end{bmatrix}$</object> <p>Now recall that <em>P</em> can be expressed as a function of input weights: <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ad179bfd313d392ad156b509370b8f407e7bd20a.svg" style="height: 18px;" type="image/svg+xml">P(W)=S(g(W))</object>. So we have another function composition:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/ab2f487f02c386d5f532900ffd0927c28ed23b7c.svg" style="height: 18px;" type="image/svg+xml"> $xent(W)=(xent\circ P)(W)=xent(P(W))$</object> <p>And we can, once again, use the multivariate chain rule to find the gradient of <em>xent</em> w.r.t. 
<em>W</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/ef938751c387283a7be6461ab0c244ac09db85be.svg" style="height: 18px;" type="image/svg+xml"> $Dxent(W)=D(xent\circ P)(W)=Dxent(P(W))\cdot DP(W)$</object> <p>Let's check that the dimensions of the Jacobian matrices work out. We already computed <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3f90f5becd4cc377e50cd6885718feb039eabcc9.svg" style="height: 18px;" type="image/svg+xml">DP(W)</object>; it's <em>TxNT</em>. <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/676107d1b425649d04d82c75a37b391aa99edcf1.svg" style="height: 18px;" type="image/svg+xml">Dxent(P(W))</object> is <em>1xT</em>, so the resulting Jacobian <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/86f6a5ad8eb3128d2d86c826df3d8831403e64ac.svg" style="height: 18px;" type="image/svg+xml">Dxent(W)</object> is <em>1xNT</em>, which makes sense because the whole network has one output (the cross-entropy loss - a scalar value) and <em>NT</em> inputs (the weights).</p> <p>Here again, there's a straightforward way to find a simple formula for <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/86f6a5ad8eb3128d2d86c826df3d8831403e64ac.svg" style="height: 18px;" type="image/svg+xml">Dxent(W)</object>, since many elements in the matrix multiplication end up cancelling out. Note that <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/bb805dc98dfe8b48ded94e4f27a90e74b64371e4.svg" style="height: 18px;" type="image/svg+xml">xent(P)</object> depends only on the <em>y</em>-th element of <em>P</em>. 
Therefore, only <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/fb396ced0aaf5ee006e13bb7b0925ba833e01a12.svg" style="height: 18px;" type="image/svg+xml">D_{y}xent</object> is non-zero in the Jacobian:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/c2b2a7aa200023fd2988991212edc5053a85731e.svg" style="height: 22px;" type="image/svg+xml"> $Dxent=\begin{bmatrix} 0 &amp; 0 &amp; D_{y}xent &amp; \cdots &amp; 0 \end{bmatrix}$</object> <p>And <object class="valign-m10" data="https://eli.thegreenplace.net/images/math/ba6bd8869680cb3dab4a5138b909d4f4155ae6a8.svg" style="height: 26px;" type="image/svg+xml">D_{y}xent=-\frac{1}{P_y}</object>. Going back to the full Jacobian <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/86f6a5ad8eb3128d2d86c826df3d8831403e64ac.svg" style="height: 18px;" type="image/svg+xml">Dxent(W)</object>, we multiply <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/3845e2788792dc92a7072833fa019ce1182f4dbc.svg" style="height: 18px;" type="image/svg+xml">Dxent(P)</object> by each column of <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/31bc3dde97a870d7b85f78efe4d178d38eae0fdb.svg" style="height: 18px;" type="image/svg+xml">D(P(W))</object> to get each element in the resulting row-vector. Recall that the row vector represents the whole weight matrix <em>W</em> &quot;linearized&quot; in row-major order. 
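Before doing this multiplication symbolically, here's a numerical sketch of it (the sizes, seed, and correct-class index are invented for the example): build the sparse <em>1xT</em> row vector Dxent(P), multiply it by the <em>TxNT</em> Jacobian DP(W), and compare one entry of the result against a finite difference.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    e = np.exp(z - np.max(z))
    return e / e.sum()

T, N = 3, 4
rng = np.random.default_rng(42)
W = rng.standard_normal((T, N))
x = rng.standard_normal(N)
y = 2  # index of the correct class

S = softmax(W @ x)

# Dxent(P): 1xT row vector, nonzero only at position y (value -1/P_y).
Dxent_P = np.zeros((1, T))
Dxent_P[0, y] = -1.0 / S[y]

# DP(W): TxNT Jacobian with D_ij P_t = S_t * (delta(t, i) - S_i) * x_j,
# weights linearized in row-major order.
DP = np.zeros((T, T * N))
for t in range(T):
    for i in range(T):
        DP[t, i * N : (i + 1) * N] = S[t] * ((t == i) - S[i]) * x

# Chain rule: Dxent(W) = Dxent(P) . DP(W), a 1xNT row; reshape like W.
Dxent_W = (Dxent_P @ DP).reshape(T, N)

# Finite-difference check of one weight's effect on -log(P_y).
eps = 1e-6
i, j = 0, 3
Wp, Wm = W.copy(), W.copy()
Wp[i, j] += eps
Wm[i, j] -= eps
numeric = (-np.log(softmax(Wp @ x)[y]) + np.log(softmax(Wm @ x)[y])) / (2 * eps)
assert abs(Dxent_W[i, j] - numeric) < 1e-6
```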
We'll index into it with <em>i</em> and <em>j</em> for clarity (<object class="valign-m6" data="https://eli.thegreenplace.net/images/math/d82e04a1bce5f5f685c8b6ac356997c847fa95a5.svg" style="height: 18px;" type="image/svg+xml">D_{ij}</object> points to element number <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/ef7b2d987af3c0ceb75381d096c35e8c19085642.svg" style="height: 18px;" type="image/svg+xml">(i-1)N+j</object> in the row vector):</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/b61c7d91efebf65b53f3dada643d86b63d06b6b5.svg" style="height: 54px;" type="image/svg+xml"> $D_{ij}xent(W)=\sum_{k=1}^{T}D_{k}xent(P)\cdot D_{ij}P_k(W)$</object> <p>Since only the <em>y</em>-th element in <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/2b703a6ad534070bbe698f8d8a3a1261b5bb4549.svg" style="height: 18px;" type="image/svg+xml">D_{k}xent(P)</object> is non-zero, we get the following, also substituting the derivative of the softmax layer from earlier in the post:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/d7823846aecfb3673906d65e8da6b290b7b2f608.svg" style="height: 68px;" type="image/svg+xml"> \begin{align*} D_{ij}xent(W)&amp;=D_{y}xent(P)\cdot D_{ij}P_y(W) \\ &amp;=-\frac{1}{P_y}\cdot S_y(\delta_{yi}-S_i)x_j \end{align*}</object> <p>By our definition, <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/2ec0ba51607b94096ad077ab55cc181698494e1a.svg" style="height: 18px;" type="image/svg+xml">P_y=S_y</object>, so we get:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/e417398a544821300668c777d55ad489934d744c.svg" style="height: 96px;" type="image/svg+xml"> \begin{align*} D_{ij}xent(W)&amp;=-\frac{1}{S_y}\cdot S_y(\delta_{yi}-S_i)x_j \\ &amp;=-(\delta_{yi}-S_i)x_j \\ &amp;=(S_i-\delta_{yi})x_j \end{align*}</object> <p>Once again, even though in this case the end result is nice and clean, it didn't 
necessarily have to be so. The formula for <object class="valign-m6" data="https://eli.thegreenplace.net/images/math/b0cfb602e63642cc6146ca57731821d6a9866a1e.svg" style="height: 20px;" type="image/svg+xml">D_{ij}xent(W)</object> could end up being a fairly involved sum (or sum of sums). The technique of multiplying Jacobian matrices is oblivious to all this, as the computer can do all the sums for us. All we have to do is compute the individual Jacobians, which is usually easier because they are for simpler, non-composed functions. This is the beauty and utility of the multivariate chain rule.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id3" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>To play more with sample inputs and Softmax outputs, Michael Nielsen's online book has a <a class="reference external" href="http://neuralnetworksanddeeplearning.com/chap3.html#softmax">nice interactive Javascript visualization</a> - check it out.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>Take a moment to recall that, by definition, the output of the softmax function is indeed a valid discrete probability distribution.</td></tr> </tbody> </table> </div> The Chain Rule of Calculus2016-10-10T06:24:00-07:002016-10-10T06:24:00-07:00Eli Benderskytag:eli.thegreenplace.net,2016-10-10:/2016/the-chain-rule-of-calculus/<p>The chain rule of derivatives is, in my opinion, the most important formula in differential calculus.
In this post I want to explain how the chain rule works for single-variable and multivariate functions, with some interesting examples along the way.</p> <div class="section" id="preliminaries-composition-of-functions-and-differentiability"> <h2>Preliminaries: composition of functions and differentiability</h2> <p>We denote a function …</p></div><p>The chain rule of derivatives is, in my opinion, the most important formula in differential calculus. In this post I want to explain how the chain rule works for single-variable and multivariate functions, with some interesting examples along the way.</p> <div class="section" id="preliminaries-composition-of-functions-and-differentiability"> <h2>Preliminaries: composition of functions and differentiability</h2> <p>We denote a function <em>f</em> that maps from the domain <em>X</em> to the codomain <em>Y</em> as <img alt="f:X \rightarrow Y" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e2f7fcdddf5b36735350a805eeb7cae36895ab1e.png" style="height: 16px;" />. With this <em>f</em> and given <img alt="g:Y \rightarrow Z" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e664f090a7bb62573ae65c910ef7c81e5f086cf6.png" style="height: 16px;" />, we can define <img alt="g \circ f:X \rightarrow Z" class="valign-m4" src="https://eli.thegreenplace.net/images/math/7f049e7749d289236edefaeb6399795a11afeb44.png" style="height: 16px;" /> as the composition of <em>g</em> and <em>f</em>. It's defined for <img alt="\forall x \in X" class="valign-m1" src="https://eli.thegreenplace.net/images/math/76545a2a780098fe8c8d581192fa77deccae0848.png" style="height: 14px;" /> as:</p> <img alt="$(g \circ f)(x)=g(f(x))$" class="align-center" src="https://eli.thegreenplace.net/images/math/8b9c8e67c9d2ec7fd3eefce043f380512f1230d3.png" style="height: 18px;" /> <p>In calculus we are usually concerned with the real number domain of some dimensionality. 
In the single-variable case, we can think of <img alt="f" class="valign-m4" src="https://eli.thegreenplace.net/images/math/4a0a19218e082a343a1b17e5333409af9d98f0f5.png" style="height: 16px;" /> and <img alt="g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/54fd1711209fb1c0781092374132c66e79e2241b.png" style="height: 12px;" /> as two regular real-valued functions: <img alt="f:\mathbb{R} \rightarrow \mathbb{R}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/62ac71ec4fa066b12854a09cddef9ba062924d68.png" style="height: 16px;" /> and <img alt="g:\mathbb{R} \rightarrow \mathbb{R}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/974c68b8e4454d31c7a2eb389c94bbbfd11ac9da.png" style="height: 16px;" />.</p> <p>As an example, say <img alt="f(x)=x+1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/027c36c348c172740dd168c66fbfe75d8a8da0c3.png" style="height: 18px;" /> and <img alt="g(x)=x^2" class="valign-m4" src="https://eli.thegreenplace.net/images/math/9b74ba3074b06d93dacb65e40b0082897aa85b3d.png" style="height: 19px;" />. Then:</p> <img alt="$(g \circ f)(x)=g(f(x))=g(x+1)=(x+1)^2$" class="align-center" src="https://eli.thegreenplace.net/images/math/f80635cd447f9f82452529c9289d16811394ea6c.png" style="height: 21px;" /> <p>We can compose the functions the other way around as well:</p> <img alt="$(f \circ g)(x)=f(g(x))=f(x^2)=x^2+1$" class="align-center" src="https://eli.thegreenplace.net/images/math/13c07f9e990c72b1edaf651fccec5c4ad7c0f155.png" style="height: 21px;" /> <p>Obviously, we shouldn't expect composition to be commutative. It is, however, associative. 
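In Python, composition and its non-commutativity are easy to demonstrate with these same two functions:

```python
def f(x):
    return x + 1

def g(x):
    return x * x

def compose(outer, inner):
    # Returns the function outer ∘ inner.
    return lambda x: outer(inner(x))

g_after_f = compose(g, f)  # (g ∘ f)(x) = (x + 1)^2
f_after_g = compose(f, g)  # (f ∘ g)(x) = x^2 + 1

assert g_after_f(3) == 16
assert f_after_g(3) == 10  # composition is not commutative
```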
<img alt="h \circ (g \circ f)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ef6897e4aad0050d8f69248de3ecd8aaa3ad51de.png" style="height: 18px;" /> and <img alt="(h \circ g) \circ f" class="valign-m4" src="https://eli.thegreenplace.net/images/math/03ac0c8bb4a409ff1ec1badfee9693280bb2f241.png" style="height: 18px;" /> are equivalent, and both end up being <img alt="h(g(f(x)))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/bc1a23574da8c77a4fc40d5cbbad2c5e1e95da86.png" style="height: 18px;" /> for <img alt="\forall x \in X" class="valign-m1" src="https://eli.thegreenplace.net/images/math/76545a2a780098fe8c8d581192fa77deccae0848.png" style="height: 14px;" />.</p> <p>To better handle compositions in one's head it sometimes helps to denote the independent variable of the outer function (<em>g</em> in our case) by a different letter (such as <img alt="g(a)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e7373233d49e18a0882e0dce41d9d6aa26964d6b.png" style="height: 18px;" />). For simple cases it doesn't matter, but I'll be using this technique occasionally throughout the article. The important thing to remember here is that the name of the independent variable is completely arbitrary, and we should always be able to replace it by another name throughout the formula without any semantic change.</p> <p>The other preliminary I want to mention is <em>differentiability</em>. 
The function <em>f</em> is differentiable at some point <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> if the following limit exists:</p> <img alt="$\lim_{h \to 0}\frac{f(x_0+h)-f(x_0)}{h}$" class="align-center" src="https://eli.thegreenplace.net/images/math/34b3ce83a20775cf99b8d204d2b845dfde5727cc.png" style="height: 39px;" /> <p>This limit is then the derivative of <em>f</em> at the point <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />, or <img alt="{f}&amp;#x27;(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e9b2a1134fcdc276843ee4b522039359117026ee.png" style="height: 18px;" />. Another way to express this is <img alt="\frac{d}{dx}f(x_0)" class="valign-m6" src="https://eli.thegreenplace.net/images/math/b0d6f765abf215972d5dbb982f77f1a83c233066.png" style="height: 22px;" />. Note that <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> can be any arbitrary point on the real line. I sometimes say something like &quot;<em>f</em> is differentiable at <img alt="g(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/9d8c0deeca951fab05e474395fbb9fab226cf1f2.png" style="height: 18px;" />&quot;. 
Here too, <img alt="g(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/9d8c0deeca951fab05e474395fbb9fab226cf1f2.png" style="height: 18px;" /> is just a real value that happens to be the value of the function <em>g</em> at <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />.</p> </div> <div class="section" id="the-single-variable-chain-rule"> <h2>The single-variable chain rule</h2> <p>The chain rule for single-variable functions states: if <em>g</em> is differentiable at <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> and <em>f</em> is differentiable at <img alt="g(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/9d8c0deeca951fab05e474395fbb9fab226cf1f2.png" style="height: 18px;" />, then <img alt="f \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/1247a6ac0bc07bfdbd790831aa70b0b000bad2e4.png" style="height: 16px;" /> is differentiable at <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> and its derivative is:</p> <img alt="$(f \circ g)&amp;#x27;(x_0)={f}&amp;#x27;(g(x_0)){g}&amp;#x27;(x_0)$" class="align-center" src="https://eli.thegreenplace.net/images/math/77fb8b77b35d687c20379179b0178ebdd9b2cee1.png" style="height: 20px;" /> <p>The proof of the chain rule is a bit tricky - I left it for the appendix. However, we can get a better feel for it using some intuition and a couple of examples.</p> <p>First, the intuition.
By definition:</p> <img alt="${g}&amp;#x27;(x_0)=\lim_{h \to 0}\frac{g(x_0+h)-g(x_0)}{h}$" class="align-center" src="https://eli.thegreenplace.net/images/math/cdc3e4a3bced3a7527a15cd76a688d5cc1c06aab.png" style="height: 39px;" /> <p>Multiplying both sides by <em>h</em> we get <a class="footnote-reference" href="#id6" id="id1"></a>:</p> <img alt="${g}&amp;#x27;(x_0)h=\lim_{h \to 0}g(x_0+h)-g(x_0)$" class="align-center" src="https://eli.thegreenplace.net/images/math/daf52cabed3806986d4c8c29dd60e4ce4fa9247d.png" style="height: 29px;" /> <p>Therefore we can say that when <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> changes by some very small amount, <img alt="g(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/9d8c0deeca951fab05e474395fbb9fab226cf1f2.png" style="height: 18px;" /> changes by <img alt="{g}&amp;#x27;(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/fbb4d7279a750f6d80eebeff2e2c25765b304f16.png" style="height: 18px;" /> times that small amount.</p> <p>Similarly <img alt="{f}&amp;#x27;(a_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d04139c8f65536c3042f975a6966ed49f5f15832.png" style="height: 18px;" /> is the amount of change in the value of <em>f</em> for some very small change from <img alt="a_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/4a5997da73aadd118038761e69d01e24586bf958.png" style="height: 11px;" />. 
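Numerically, this "small shift" intuition reads: g(x0 + h) - g(x0) is approximately g'(x0) * h for small h. A tiny sketch, with an arbitrary example function and step size:

```python
import math

g, g_prime = math.sin, math.cos  # example function and its derivative
x0, h = 1.0, 1e-4                # a point and a small shift

actual_change = g(x0 + h) - g(x0)
predicted_change = g_prime(x0) * h
# The two agree up to a second-order (h^2) error term.
assert abs(actual_change - predicted_change) < 1e-7
```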
However, since in our case we compose <img alt="f \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/1247a6ac0bc07bfdbd790831aa70b0b000bad2e4.png" style="height: 16px;" />, we can say that <img alt="a_0=g(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e198d0bc24284bd638c564e0b46edf975d5831d4.png" style="height: 18px;" />, evaluating <img alt="f(g(x_0))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/04a8bf9b7bd565f95f2cb3e0fe6de123b247e3be.png" style="height: 18px;" />. Suppose we shift <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> by a small amount <em>h</em>. This causes <img alt="g(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/9d8c0deeca951fab05e474395fbb9fab226cf1f2.png" style="height: 18px;" /> to shift by <img alt="{g}&amp;#x27;(x_0)h" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0e1e11ca765684cf07722c40de2bd86b208ca7c1.png" style="height: 18px;" />. So the input <img alt="a_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/4a5997da73aadd118038761e69d01e24586bf958.png" style="height: 11px;" /> of <em>f</em> shifted by <img alt="{g}&amp;#x27;(x_0)h" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0e1e11ca765684cf07722c40de2bd86b208ca7c1.png" style="height: 18px;" /> - this is still a small amount! Therefore, the total change in the value of <em>f</em> should be <img alt="{f}&amp;#x27;(g(x_0)){g}&amp;#x27;(x_0)h" class="valign-m4" src="https://eli.thegreenplace.net/images/math/b761eb11c7502754575d0413e7ba040f4a106d0d.png" style="height: 18px;" /> <a class="footnote-reference" href="#id7" id="id2"></a>.</p> <p>Now, a couple of simple examples. 
Let's take the function <img alt="f(x)=(x+1)^2" class="valign-m4" src="https://eli.thegreenplace.net/images/math/8db433d3f263ad489e31931ef4a3ddccbd7bece0.png" style="height: 19px;" />. The idea is to think of this function as a composition of simpler functions. In this case, one option is: <img alt="g(x)=x+1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/8b2ec3a2221203b211c8a0ed975841cb508b193c.png" style="height: 18px;" /> and then <img alt="w(g(x))=g(x)^2" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2ed44118a1efadf34f5bf169d2ca450246519d1d.png" style="height: 19px;" />, so the original <em>f</em> is now the composition <img alt="w \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/4edc28332d30c68727a56fbd473126441850c4f0.png" style="height: 12px;" />.</p> <p>The derivative of this composition is <img alt="{w}&amp;#x27;(g(x)){g}&amp;#x27;(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/05261e48f79f6e8b129bb26dee7fa8a07bcbf876.png" style="height: 18px;" />, or <img alt="2(x+1)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/39b598cf32e125c7ae18b7623043d5f8133eba78.png" style="height: 18px;" /> since <img alt="{g}&amp;#x27;(x)=1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/36cc7eeced1b708dcf6166dcaae955f733f93ded.png" style="height: 18px;" />. Note that <em>w</em> is differentiable at any point, so this derivative always exists.</p> <p>Another example will use a longer chain of composition. Let's differentiate <img alt="f(x)=sin[(x+1)^2]" class="valign-m5" src="https://eli.thegreenplace.net/images/math/3e3a23e0dd5d4ee105bcca545bddb058917e2c9c.png" style="height: 20px;" />. 
This is a composition of three functions:</p> <img alt="\begin{align*} g(x)&amp;amp;=x+1\\ w(x)&amp;amp;=x^2\\ v(x)&amp;amp;=sin(x) \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/6981c04536025d8e43d07bf9b067252c2028feab.png" style="height: 73px;" /> <p>Function composition is associative, so <em>f</em> can be expressed as either <img alt="v \circ (w \circ g)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/1c2a8a63ec4fb6e489b0896b544e277823228906.png" style="height: 18px;" /> or <img alt="(v \circ w) \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/be47fface92aa5db8bade3049da31d065ef8244b.png" style="height: 18px;" />. Since we already know what the derivative of <img alt="w \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/4edc28332d30c68727a56fbd473126441850c4f0.png" style="height: 12px;" /> is, let's use the former:</p> <img alt="\begin{align*} \frac{df(x)}{dx}=\frac{d v(w(g(x)))}{dx}&amp;amp;={v}&amp;#x27;(w(g(x))){w(g(x))}&amp;#x27;(x)\\ &amp;amp;=cos(w(g(x)))2(x+1)\\ &amp;amp;=2cos[(x+1)^2](x+1) \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/f63f9a07295583911873238c3ee6e84e8c3722ca.png" style="height: 93px;" /> </div> <div class="section" id="the-chain-rule-as-a-computational-procedure"> <h2>The chain rule as a computational procedure</h2> <p>As the last example demonstrates, the chain rule can be applied multiple times in a single derivation. This makes the chain rule a powerful tool for computing derivatives of very complex functions, which can be broken up into compositions of simpler functions. I like to draw a parallel between this process and programming; a function in a programming language can be seen as a computational procedure - we have a set of input parameters and we produce outputs. On the way, several transformations happen that can be expressed mathematically. 
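Both closed-form derivatives computed in the previous section can be checked against central differences; the test point and tolerance below are arbitrary:

```python
import math

def numeric_deriv(fn, x, h=1e-6):
    # Central-difference approximation of the derivative at x.
    return (fn(x + h) - fn(x - h)) / (2 * h)

f1 = lambda x: (x + 1) ** 2             # derivative: 2(x + 1)
f2 = lambda x: math.sin((x + 1) ** 2)   # derivative: 2cos[(x + 1)^2](x + 1)

x0 = 0.7
assert abs(numeric_deriv(f1, x0) - 2 * (x0 + 1)) < 1e-6
assert abs(numeric_deriv(f2, x0) - 2 * math.cos((x0 + 1) ** 2) * (x0 + 1)) < 1e-6
```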
These transformations are composed, so their derivatives can be computed naturally with the chain rule.</p> <p>This may be somewhat abstract, so let's use another example. We'll compute the derivative of the Sigmoid function - a very important function in machine learning:</p> <img alt="$S(x)=\frac{1}{1+e^{-x}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/9a39d0495ce32da5840b76adaf508a0349394c49.png" style="height: 38px;" /> <p>To make the equivalence between functions and computational procedures clearer, let's think how we'd compute <em>S</em> in Python:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">sigmoid</span><span class="p">(</span><span class="n">x</span><span class="p">):</span> <span class="k">return</span> <span class="mi">1</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">+</span> <span class="n">math</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="o">-</span><span class="n">x</span><span class="p">))</span> </pre></div> <p>This doesn't look much different, but that's just because Python is a high level language with arbitrarily nested expressions. Its VM (or the CPU in general) would execute this computation step by step. 
Let's break it up to be clearer, assuming we can only apply a single operation at every step:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">sigmoid</span><span class="p">(</span><span class="n">x</span><span class="p">):</span> <span class="n">f</span> <span class="o">=</span> <span class="o">-</span><span class="n">x</span> <span class="n">g</span> <span class="o">=</span> <span class="n">math</span><span class="o">.</span><span class="n">exp</span><span class="p">(</span><span class="n">f</span><span class="p">)</span> <span class="n">w</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">+</span> <span class="n">g</span> <span class="n">v</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">/</span> <span class="n">w</span> <span class="k">return</span> <span class="n">v</span> </pre></div> <p>I hope you're starting to see the resemblance to our chain rule examples at this point. Sacrificing some rigor in the notation for the sake of expressiveness, we can write:</p> <img alt="$S&amp;#x27;=v&amp;#x27;(w)w&amp;#x27;(g)g&amp;#x27;(f)f&amp;#x27;(x)$" class="align-center" src="https://eli.thegreenplace.net/images/math/b3029d842b915e7bf0ea1aa91372ab071dd8b80e.png" style="height: 20px;" /> <p>This is the chain rule applied to <img alt="v \circ (w \circ (g \circ f))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/caaf8ea9ee60bb84d61d422c6dee5d6cd173f0ab.png" style="height: 18px;" />. 
Solving this is easy because every single derivative in the chain above is trivial:</p> <img alt="\begin{align*} S&amp;#x27;&amp;amp;=v&amp;#x27;(w)w&amp;#x27;(g)g&amp;#x27;(f)(-1)\\ &amp;amp;=v&amp;#x27;(w)w&amp;#x27;(g)e^{-x}(-1)\\ &amp;amp;=v&amp;#x27;(w)(1)e^{-x}(-1)\\ &amp;amp;=\frac{-1}{(1+e^{-x})^2}e^{-x}(-1)\\ &amp;amp;=\frac{e^{-x}}{(1+e^{-x})^2} \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/b987461a2551ca622908f40f791519f3afe3b452.png" style="height: 171px;" /> <p>Now you may be thinking:</p> <ol class="arabic simple"> <li>Every function computable by a program can be broken down to trivial steps like our <tt class="docutils literal">sigmoid</tt> above.</li> <li>Using the chain rule, we can easily find the derivative of such a sequence of steps... therefore:</li> <li>We can easily find the derivative of any function computable by a program!!</li> </ol> <p>And you'll be right. This is precisely the basis for the technique known as <a class="reference external" href="https://en.wikipedia.org/wiki/Automatic_differentiation">automatic differentiation</a>, which is widely used in scientific computing. The most notable use of automatic differentiation in recent times is the backpropagation algorithm - an essential backbone of modern machine learning. I personally find automatic differentiation fascinating, and will write a dedicated article about it in the future.</p> </div> <div class="section" id="multivariate-chain-rule-general-formulation"> <h2>Multivariate chain rule - general formulation</h2> <p>So far this article has been looking at functions with a single input and output: <img alt="f:\mathbb{R} \to \mathbb{R}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2e28467c90f978580e43c376716981ec5906a01d.png" style="height: 16px;" />.
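As an aside, the sigmoid derivative computed in the previous section is easy to verify numerically; it also equals the form S(x)(1 - S(x)) that is common in machine-learning texts (a fact not shown above, but a one-line algebraic rearrangement):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def dsigmoid(x):
    # e^{-x} / (1 + e^{-x})^2, as derived via the chain rule.
    return math.exp(-x) / (1 + math.exp(-x)) ** 2

x0, h = 0.5, 1e-6
numeric = (sigmoid(x0 + h) - sigmoid(x0 - h)) / (2 * h)
assert abs(dsigmoid(x0) - numeric) < 1e-8
# Equivalent, more common form of the same derivative:
assert abs(dsigmoid(x0) - sigmoid(x0) * (1 - sigmoid(x0))) < 1e-12
```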
In the most general case of multi-variate calculus, we're dealing with functions that map from <em>n</em> dimensions to <em>m</em> dimensions: <img alt="f:\mathbb{R}^{n} \to \mathbb{R}^{m}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/13f219789047343729036279bb11630db317d98d.png" style="height: 16px;" />. Because every one of the <em>m</em> outputs of <em>f</em> can be considered a separate function dependent on <em>n</em> variables, it's very natural to deal with such math using vectors and matrices.</p> <p>First let's define some notation. We'll consider the outputs of <em>f</em> to be numbered from 1 to <em>m</em> as <img alt="f_1,f_2 \dots f_m" class="valign-m4" src="https://eli.thegreenplace.net/images/math/93b446c5209263534d09d617bbede21101d6536e.png" style="height: 16px;" />. For each such <img alt="f_i" class="valign-m4" src="https://eli.thegreenplace.net/images/math/68bd0dc647944d362ec8df628a22967b91d82c80.png" style="height: 16px;" /> we can compute its partial derivative by any of the <em>n</em> inputs as:</p> <img alt="$D_j f_i(a)=\frac{\partial f_i}{\partial a_j}(a)$" class="align-center" src="https://eli.thegreenplace.net/images/math/30881b5a92e45259714ba01c7a12fbf8f6c56109.png" style="height: 42px;" /> <p>Where <em>j</em> goes from 1 to <em>n</em> and <em>a</em> is a vector with <em>n</em> components. 
If <em>f</em> is differentiable at <em>a</em> <a class="footnote-reference" href="#id8" id="id3"></a> then the derivative of <em>f</em> at <em>a</em> is the <em>Jacobian matrix</em>:</p> <img alt="$Df(a)=\begin{bmatrix} D_1 f_1(a) &amp;amp; \cdots &amp;amp; D_n f_1(a) \\ \vdots &amp;amp; &amp;amp; \vdots \\ D_1 f_m(a) &amp;amp; \cdots &amp;amp; D_n f_m(a) \\ \end{bmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/ab09367d48e9ef4d8bc2314a60313dec700193af.png" style="height: 76px;" /> <p>The multivariate chain rule states: given <img alt="g:\mathbb{R}^n \to \mathbb{R}^m" class="valign-m4" src="https://eli.thegreenplace.net/images/math/b4b7d25491897b053abf7e48688fada4a85368bd.png" style="height: 16px;" /> and <img alt="f:\mathbb{R}^m \to \mathbb{R}^p" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ac8a6cea4e02e885538fc3ef969c5733e84712f9.png" style="height: 16px;" /> and a point <img alt="a \in \mathbb{R}^n" class="valign-m1" src="https://eli.thegreenplace.net/images/math/43a85f2c59f396fe5c4e2c403a0453c463fcfb0d.png" style="height: 13px;" />, if <em>g</em> is differentiable at <em>a</em> and <em>f</em> is differentiable at <img alt="g(a)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e7373233d49e18a0882e0dce41d9d6aa26964d6b.png" style="height: 18px;" /> then the composition <img alt="f \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/1247a6ac0bc07bfdbd790831aa70b0b000bad2e4.png" style="height: 16px;" /> is differentiable at <em>a</em> and its derivative is:</p> <img alt="$D(f \circ g)(a)=Df(g(a)) \cdot Dg(a)$" class="align-center" src="https://eli.thegreenplace.net/images/math/00bdefa904bd34df2dfb50cc385e6497c4e5096e.png" style="height: 18px;" /> <p>Which is the matrix multiplication of <img alt="Df(g(a))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e567730c48bb2f95c258b630b4d6e997043e09ab.png" style="height: 18px;" /> and <img alt="Dg(a)" class="valign-m4" 
src="https://eli.thegreenplace.net/images/math/2575fc98e794a733a7aa6237fe67246a41e6c8c5.png" style="height: 18px;" /> <a class="footnote-reference" href="#id9" id="id4"></a>. Intuitively, the multivariate chain rule mirrors the single-variable one (and as we'll soon see, the latter is just a special case of the former) with derivatives replaced by derivative matrices. From linear algebra, we represent linear transformations by matrices, and the composition of two linear transformations is the product of their matrices. Therefore, since derivative matrices - like derivatives in one dimension - are a linear approximation to the function, the chain rule makes sense. This is a really nice connection between linear algebra and calculus, though a full proof of the multivariate rule is very technical and outside the scope of this article.</p> </div> <div class="section" id="multivariate-chain-rule-examples"> <h2>Multivariate chain rule - examples</h2> <p>Since the chain rule deals with compositions of functions, it's natural to present examples from the world of parametric curves and surfaces. For example, suppose we define <img alt="f(x,y,z)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/c5d72ae6186c76bde08c693d4bfdb85e3201125d.png" style="height: 18px;" /> as a scalar function <img alt="\mathbb{R}^3 \to \mathbb{R}" class="valign-m1" src="https://eli.thegreenplace.net/images/math/1862a20e93e78e42aafd20106ceabe142def19f1.png" style="height: 16px;" /> giving the temperature at some point in 3D. 
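</p>

<p>Before diving into the example, the Jacobian is easy to make concrete in code. The following sketch (mine, not from the article; the helper name <em>numerical_jacobian</em> is an assumption) approximates the Jacobian of an arbitrary function mapping <em>n</em> inputs to <em>m</em> outputs using central finite differences with NumPy:</p>

```python
import numpy as np

def numerical_jacobian(f, a, eps=1e-6):
    """Approximate the Jacobian of f at point a by central differences.

    f: maps a length-n vector to a length-m vector.
    a: the length-n point at which to evaluate the Jacobian.
    Returns an (m, n) matrix whose (i, j) entry approximates D_j f_i(a).
    """
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    m = np.atleast_1d(f(a)).shape[0]
    J = np.zeros((m, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        # Central difference in the j-th coordinate direction.
        J[:, j] = (np.atleast_1d(f(a + step)) -
                   np.atleast_1d(f(a - step))) / (2 * eps)
    return J
```

<p>For instance, for f(v) = (v[0]v[1], v[0]+v[1]) the analytic Jacobian at (2, 3) is [[3, 2], [1, 1]], and the approximation recovers it to within the step size.</p>

<p>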
Now imagine that we're moving through this 3D space on a curve defined by a function <img alt="g:\mathbb{R} \to \mathbb{R}^3" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e97099dd54f45a2a71a33d305c517ec97565909d.png" style="height: 19px;" /> which takes time <em>t</em> and gives the coordinates <img alt="x(t),y(t),z(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/4e2bdd3060e49f3494f68f99cb6d2204b2a19e1c.png" style="height: 18px;" /> at that time. We want to compute how the temperature changes as a function of time <em>t</em> - how do we do that? Recall that the temperature is not a direct function of time, but rather is a function of location, while location <em>is</em> a function of time. Therefore, we'll want to compose <img alt="f \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/1247a6ac0bc07bfdbd790831aa70b0b000bad2e4.png" style="height: 16px;" />. Here's a concrete example:</p> <img alt="$g(t)=\begin{pmatrix} t\\ t^2\\ t^3 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/cdaff94ebfb318ec24f472be470497e28a091c42.png" style="height: 65px;" /> <p>And:</p> <img alt="$f\begin{pmatrix} x \\ y \\ z \end{pmatrix}=x^2+xyz+5y$" class="align-center" src="https://eli.thegreenplace.net/images/math/0a2fc40b06886d3b54628680192d71a3186d9fc7.png" style="height: 65px;" /> <p>If we reformulate <em>x</em>, <em>y</em> and <em>z</em> as functions of <em>t</em>:</p> <object class="align-center" data="https://eli.thegreenplace.net/images/math/36f726e2fe10b99ab5d216310d2fe91d61f24c46.svg" style="height: 21px;" type="image/svg+xml"> $f(x(t),y(t),z(t))=x(t)^2+x(t)y(t)z(t)+5y(t)$</object> <p>Composing <img alt="f \circ g" class="valign-m4" src="https://eli.thegreenplace.net/images/math/1247a6ac0bc07bfdbd790831aa70b0b000bad2e4.png" style="height: 16px;" />, we get:</p> <img alt="$(f \circ g)(t)=f(g(t))=f(t,t^2,t^3)=t^2+t^6+5t^2=6t^2+t^6$" class="align-center"
src="https://eli.thegreenplace.net/images/math/63ad25f62a0e93b1f8175a627aac0a29a88a3cca.png" style="height: 21px;" /> <p>Since this is a simple function, we can find its derivative directly:</p> <img alt="$(f \circ g)&amp;#x27;(t)=12t+6t^5$" class="align-center" src="https://eli.thegreenplace.net/images/math/d1025880b042d304efe08de37eeafde5a8d9231c.png" style="height: 21px;" /> <p>Now let's repeat this exercise using the multivariate chain rule. To compute <img alt="D(f \circ g)(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/c20cc5474ef67f0ec35bddccdc59b72742a864e1.png" style="height: 18px;" /> we need <img alt="Df(g(t))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ded52fd957c2b251c84052c335523b80a4e3c945.png" style="height: 18px;" /> and <img alt="Dg(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ec8c49e88582659c617e6563375355ede5fe1090.png" style="height: 18px;" />. Let's start with <img alt="Dg(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ec8c49e88582659c617e6563375355ede5fe1090.png" style="height: 18px;" />. 
<img alt="g(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/851fb8b00904a32dff1c79d40158c7ec9d3d5254.png" style="height: 18px;" /> maps <img alt="\mathbb{R} \to \mathbb{R}^3" class="valign-m1" src="https://eli.thegreenplace.net/images/math/0354b4368db3496b963c21b446ad726b65a0ab90.png" style="height: 16px;" />, so its Jacobian is a 3-by-1 matrix, or column vector:</p> <img alt="$Dg(t)=\begin{bmatrix} 1 \\ 2t \\ 3t^2 \end{bmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/492d3e9013352e0cd44e3c5721cd0535174fb318.png" style="height: 65px;" /> <p>To compute <img alt="Df(g(t))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ded52fd957c2b251c84052c335523b80a4e3c945.png" style="height: 18px;" /> let's first find <img alt="Df(x,y,z)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/dab2e6dc478f82ef76bff84080623a27fe214dec.png" style="height: 18px;" />. Since <img alt="f(x,y,z)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/c5d72ae6186c76bde08c693d4bfdb85e3201125d.png" style="height: 18px;" /> maps <img alt="\mathbb{R}^3 \to \mathbb{R}" class="valign-m1" src="https://eli.thegreenplace.net/images/math/1862a20e93e78e42aafd20106ceabe142def19f1.png" style="height: 16px;" />, its Jacobian is a 1-by-3 matrix, or row vector:</p> <img alt="$Df(x,y,z)=\begin{bmatrix} 2x+yz &amp;amp; xz+5 &amp;amp; xy \end{bmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/e8d650cac68d341d2c99c2641be3d238e516e51c.png" style="height: 22px;" /> <p>To apply the chain rule, we need <img alt="Df(g(t))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ded52fd957c2b251c84052c335523b80a4e3c945.png" style="height: 18px;" />:</p> <img alt="$Df(g(t))=\begin{bmatrix} 2t+t^5 &amp;amp; t^4+5 &amp;amp; t^3 \end{bmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/b061977c12dcc918a96473939f6dc01eb7ea7847.png" style="height: 22px;" /> <p>Finally, 
multiplying <img alt="Df(g(t))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ded52fd957c2b251c84052c335523b80a4e3c945.png" style="height: 18px;" /> by <img alt="Dg(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ec8c49e88582659c617e6563375355ede5fe1090.png" style="height: 18px;" />, we get:</p> <img alt="\begin{align*} D(f \circ g)(t)=Df(g(t)) \cdot Dg(t)&amp;amp;=\begin{bmatrix} 2t+t^5 &amp;amp; t^4+5 &amp;amp; t^3 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2t \\ 3t^2 \end{bmatrix}\\ &amp;amp;=2t+t^5+2t^6+10t+3t^5\\ &amp;amp;=12t+6t^5 \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/9c5a5fc3e8024f6d1f2364ad5d0433bb530d4987.png" style="height: 118px;" /> <p>Another interesting way to interpret this result for the case where <img alt="f:\mathbb{R}^3 \to \mathbb{R}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d307aff95a39ad62cc090e4d6e3bd73b1ffc2b14.png" style="height: 19px;" /> and <img alt="g:\mathbb{R} \to \mathbb{R}^3" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e97099dd54f45a2a71a33d305c517ec97565909d.png" style="height: 19px;" /> is to <a class="reference external" href="http://eli.thegreenplace.net/2016/understanding-gradient-descent">recall that</a> the directional derivative of <em>f</em> in the direction of some vector <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> is:</p> <img alt="$D_{\vec{v}}f=(\nabla f) \cdot \vec{v}$" class="align-center" src="https://eli.thegreenplace.net/images/math/49933775272512c4c8686d9f9692c8ea01e1c97d.png" style="height: 18px;" /> <p>In our case <img alt="(\nabla f)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/cf1f51ce22cf132c44f5cd65c1c6ada1cce0347f.png" style="height: 18px;" /> is the Jacobian of <em>f</em> (because of its dimensionality). 
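</p>

<p>The worked example is easy to check numerically. Here's a short sketch (mine, not part of the original article) that evaluates the right-hand side of the chain rule - the product of the two Jacobians computed above - and compares it with the directly-derived derivative 12t + 6t^5:</p>

```python
import numpy as np

# The article's example: g(t) = (t, t^2, t^3) and f(x, y, z) = x^2 + xyz + 5y.
def Dg(t):
    # Jacobian of g: a 3-by-1 column vector.
    return np.array([[1.0], [2 * t], [3 * t ** 2]])

def Df(x, y, z):
    # Jacobian of f: a 1-by-3 row vector of partial derivatives.
    return np.array([[2 * x + y * z, x * z + 5, x * y]])

def chain_rule_derivative(t):
    # D(f∘g)(t) = Df(g(t)) · Dg(t) is a 1x1 matrix; extract the scalar.
    x, y, z = t, t ** 2, t ** 3
    return (Df(x, y, z) @ Dg(t))[0, 0]

# The Jacobian product agrees with the direct derivative 12t + 6t^5.
for t in [0.0, 0.5, 1.0, 2.0]:
    assert np.isclose(chain_rule_derivative(t), 12 * t + 6 * t ** 5)
```

<p>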
So if we take <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> to be the vector <img alt="Dg(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ec8c49e88582659c617e6563375355ede5fe1090.png" style="height: 18px;" />, and evaluate the gradient at <img alt="g(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/851fb8b00904a32dff1c79d40158c7ec9d3d5254.png" style="height: 18px;" /> we get <a class="footnote-reference" href="#id10" id="id5"></a>:</p> <img alt="$D_{\vec{Dg(t)}}f(t)=(\nabla f(g(t))) \cdot Dg(t)$" class="align-center" src="https://eli.thegreenplace.net/images/math/dc8e045fe902682ada36e08fa0099f95632b7ced.png" style="height: 24px;" /> <p>This gives us some additional intuition for the temperature change question. The change in temperature as a function of time is the directional derivative of <em>f</em> in the direction of the change in location (<img alt="Dg(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ec8c49e88582659c617e6563375355ede5fe1090.png" style="height: 18px;" />).</p> <p>For additional examples of applying the chain rule, see <a class="reference external" href="http://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/">my post about softmax</a>.</p> </div> <div class="section" id="tricks-with-the-multivariate-chain-rule-derivative-of-products"> <h2>Tricks with the multivariate chain rule - derivative of products</h2> <p>Earlier in the article we've seen how the chain rule helps find derivatives of complicated functions by decomposing them into simpler functions. The multivariate chain rule allows even more of that, as the following example demonstrates. Suppose <img alt="h(x)=f(x)g(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/6b584f7e739b604fe2d90a983216090d25643ad1.png" style="height: 18px;" />. 
Then, the well-known <a class="reference external" href="https://en.wikipedia.org/wiki/Product_rule">product rule</a> of derivatives states that:</p> <img alt="$h&amp;#x27;(x)=f&amp;#x27;(x)g(x)+f(x)g&amp;#x27;(x)$" class="align-center" src="https://eli.thegreenplace.net/images/math/6c77a942dbee351e8229ce7771680b6a2f55c4aa.png" style="height: 20px;" /> <p>Proving this from first principles (the definition of the derivative as a limit) isn't hard, but I want to show how it stems very easily from the multivariate chain rule.</p> <p>Let's begin by re-formulating <img alt="h(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2a1862ca703d9d9a76538d74b8f4b71df93bafab.png" style="height: 18px;" /> as a composition of two functions. The first takes a vector <img alt="\vec{s}" class="valign-0" src="https://eli.thegreenplace.net/images/math/6a16290a6fe4bd5b30bf2cf959214e8fa4924959.png" style="height: 13px;" /> in <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" /> and maps it to <img alt="\mathbb{R}" class="valign-0" src="https://eli.thegreenplace.net/images/math/0ed839b111fe0e3ca2b2f618b940893eaea88a57.png" style="height: 12px;" /> by computing the product of its two components:</p> <img alt="$p(\vec{s})=s_1 s_2$" class="align-center" src="https://eli.thegreenplace.net/images/math/955d480267a38ec452bcdf2774dadc7652a757fa.png" style="height: 18px;" /> <p>The second is a vector-valued function that maps a number <img alt="x \in \mathbb{R}" class="valign-m1" src="https://eli.thegreenplace.net/images/math/ec7e4961c34351c48080f6190b6ec363af9adf25.png" style="height: 13px;" /> to <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" /> :</p> <img alt="$s(x)=\begin{pmatrix} f(x)\\ g(x) \end{pmatrix}$" class="align-center" 
src="https://eli.thegreenplace.net/images/math/f5c473fb1fb5ee47e59414a91dc484e182bc6210.png" style="height: 43px;" /> <p>We can compose <img alt="p \circ s" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3f1c954d3481a1a167ae311bc3c3980aaf1ee3a1.png" style="height: 12px;" />, producing a function that takes a scalar and returns a scalar: <img alt="(p \circ s) : \mathbb{R} \to \mathbb{R}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/c827ac5b7598c117c157d0377b8c30a0f9a72b81.png" style="height: 18px;" />. We get:</p> <img alt="$h(x)=(p \circ s)(x) = f(x)g(x)$" class="align-center" src="https://eli.thegreenplace.net/images/math/3cbae5f44d32653bd6bbc66e6ee8bb5e1a4dfe40.png" style="height: 18px;" /> <p>Since we're composing two multivariate functions, we can apply the multivariate chain rule here:</p> <img alt="\begin{align*} D(p \circ s) &amp;amp;= Dp(s(x)) \cdot Ds(x)\\ &amp;amp;=\begin{bmatrix} \frac{\partial p}{\partial s_1}(x) &amp;amp; \frac{\partial p}{\partial s_2}(x) \end{bmatrix}\cdot \begin{bmatrix} {s_1}&amp;#x27;(x)\\ {s_2}&amp;#x27;(x) \end{bmatrix}\\ &amp;amp;=\begin{bmatrix} s_2(x) &amp;amp; s_1(x) \end{bmatrix} \cdot \begin{bmatrix} {s_1}&amp;#x27;(x)\\ {s_2}&amp;#x27;(x) \end{bmatrix}\\ &amp;amp;={s_1}&amp;#x27;(x)s_2(x)+{s_2}&amp;#x27;(x)s_1(x) \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/ee8bd27a8257039f72c8751eb78626521f12a5fa.png" style="height: 147px;" /> <p>Since <img alt="s_1(x)=f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/dc67440057222e8222ae08269e4ba2a1e58acbb4.png" style="height: 18px;" /> and <img alt="s_2(x)=g(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/89d252d983d49126f2f4a34fcf01fd6c882e4792.png" style="height: 18px;" />, this is exactly the product rule.</p> </div> <div class="section" id="connecting-the-single-variable-and-multivariate-chain-rules"> <h2>Connecting the single-variable and multivariate chain
rules</h2> <p>Given function <img alt="f(x) : \mathbb{R} \to \mathbb{R}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/bd80387ffb4e5cd8702c12837a57f1806ea1d02b.png" style="height: 18px;" />, its Jacobian matrix has a single entry:</p> <img alt="$Df(a)=\begin{bmatrix}D_{x}f(a)\end{bmatrix}= \begin{bmatrix}\frac{df}{dx}(a)\end{bmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/cc95d53415b32e6610c1a45bededb4fb584f0c64.png" style="height: 24px;" /> <p>Therefore, given two functions mapping <img alt="\mathbb{R} \to \mathbb{R}" class="valign-m1" src="https://eli.thegreenplace.net/images/math/4aaeb3aafa05a9ad54c8d7da4e4aecad4dfac1cd.png" style="height: 13px;" />, the derivative of their composition using the multivariate chain rule is:</p> <img alt="$D(f \circ g)(a)=Df(g(a))\cdot Dg(a)=f&amp;#x27;(g(a))g&amp;#x27;(a)$" class="align-center" src="https://eli.thegreenplace.net/images/math/98e554584c9d2d967b9a6759a64126093ef704ce.png" style="height: 20px;" /> <p>Which is precisely the single-variable chain rule. This results from matrix multiplication between two 1x1 matrices, which ends up being just the product of their single entries.</p> </div> <div class="section" id="appendix-proving-the-single-variable-chain-rule"> <h2>Appendix: proving the single-variable chain rule</h2> <p>It turns out that many online resources (including Khan Academy) provide a flawed proof for the chain rule. It's flawed due to a careless division by a quantity that may be zero. This flaw can be corrected by making the proof somewhat more complicated; I won't take that road here - for details see Spivak's <em>Calculus</em>. 
Instead, I'll present a simpler proof inspired by the one I found at <a class="reference external" href="http://math.rice.edu/~cjd/">Casey Douglas's site</a>.</p> <p>We want to prove that:</p> <img alt="$(f \circ g)&amp;#x27;(x)={f}&amp;#x27;(g(x)){g}&amp;#x27;(x)$" class="align-center" src="https://eli.thegreenplace.net/images/math/29f4194c9af3777ae55a15dad972a145eb7797be.png" style="height: 20px;" /> <p>Note that previously we defined derivatives at some concrete point <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />. Here for the sake of brevity I'll just use <img alt="x" class="valign-0" src="https://eli.thegreenplace.net/images/math/11f6ad8ec52a2984abaafd7c3b516503785c2072.png" style="height: 8px;" /> as an arbitrary point, assuming the derivative exists.</p> <p>Let's start with the definition of <img alt="g&amp;#x27;(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/5fba18e1364e151399f95daac5ed63f09feba9b7.png" style="height: 18px;" />:</p> <img alt="${g}&amp;#x27;(x)=\lim_{h \to 0}\frac{g(x+h)-g(x)}{h}$" class="align-center" src="https://eli.thegreenplace.net/images/math/c19f7ddc43c3046489d7e012c3f213403edf7e8a.png" style="height: 39px;" /> <p>We can reorder it as follows:</p> <img alt="$\lim_{h \to 0}\left [ \frac{g(x+h)-g(x)}{h} - g&amp;#x27;(x) \right ] = 0$" class="align-center" src="https://eli.thegreenplace.net/images/math/74a651394036af8aeaba69650dba26ccb4f90ae7.png" style="height: 43px;" /> <p>Let's give the part in the brackets the name <img alt="\Delta g(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ba109afb4d6264ec2d39fe025fcc5a1dbc58637f.png" style="height: 18px;" />.</p> <p>Similarly, if the function <em>f</em> is differentiable at the point <img alt="a=g(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/01bcc664dda02e9d98a5f37104ff028cf8fd0d62.png" style="height: 18px;" />, we 
have:</p> <img alt="$f&amp;#x27;(a)=\lim_{k \to 0}\frac{f(a+k)-f(a)}{k}$" class="align-center" src="https://eli.thegreenplace.net/images/math/59daea2a46cd244229625131297a773820501571.png" style="height: 39px;" /> <p>We reorder:</p> <img alt="$\lim_{k \to 0}\left [ \frac{f(a+k)-f(a)}{k} - f&amp;#x27;(a) \right ] = 0$" class="align-center" src="https://eli.thegreenplace.net/images/math/4600064fad365f360bd73063324a935a8b73266f.png" style="height: 43px;" /> <p>And call the part in the brackets <img alt="\Delta f(a)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/a85d05996c54eec8a9bef9a60f8e7e4f3231aa51.png" style="height: 18px;" />. The choice of <em>k</em> instead of <em>h</em> as the variable going to zero is arbitrary, and helps simplify the discussion that follows.</p> <p>Let's reorder the definition of <img alt="\Delta g(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ba109afb4d6264ec2d39fe025fcc5a1dbc58637f.png" style="height: 18px;" /> a bit:</p> <img alt="$g(x+h)=g(x)+[g&amp;#x27;(x)+\Delta g(x)]h$" class="align-center" src="https://eli.thegreenplace.net/images/math/59e0263f8a2ebfc0fac9a2b51f42c651b359fe31.png" style="height: 21px;" /> <p>We can apply <em>f</em> to both sides:</p> <img alt="$\begin{equation} f(g(x+h))=f(g(x)+[g&amp;#x27;(x)+\Delta g(x)]h) \tag{1} \end{equation}$" class="align-center" src="https://eli.thegreenplace.net/images/math/3b82da9d6cad509490e687b9e86093791545ea81.png" style="height: 21px;" /> <p>By reordering the definition of <img alt="\Delta f(a)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/a85d05996c54eec8a9bef9a60f8e7e4f3231aa51.png" style="height: 18px;" /> we get:</p> <img alt="$\begin{equation} f(a+k)=f(a)+[f&amp;#x27;(a)+\Delta f(a)]k \tag{2} \end{equation}$" class="align-center" src="https://eli.thegreenplace.net/images/math/88c5b43f3ba89da3853be9342381aa8dd60e024f.png" style="height: 21px;" /> <p>Now taking the right-hand side of (1), we can look at it as <img
alt="f(a+k)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/831a544de22e2a6a8997413b576a67391ba31f53.png" style="height: 18px;" /> since <img alt="a=g(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/01bcc664dda02e9d98a5f37104ff028cf8fd0d62.png" style="height: 18px;" /> and we can define <img alt="k=[g&amp;#x27;(x)+\Delta g(x)]h" class="valign-m5" src="https://eli.thegreenplace.net/images/math/ed49f313283b8f266ffd1e9b4194c36f456a950d.png" style="height: 19px;" />. We still have <em>k</em> going to zero when <em>h</em> goes to zero. Substituting these <em>a</em> and <em>k</em> into (2), we get:</p> <img alt="$f(a+k)=f(g(x))+[f&amp;#x27;(g(x))+\Delta f(g(x))][g&amp;#x27;(x)+\Delta g(x)]h$" class="align-center" src="https://eli.thegreenplace.net/images/math/275b3323c68b711b2458e4c748a887a368e32a40.png" style="height: 21px;" /> <p>So, starting from (1) again, we have:</p> <img alt="\begin{align*} f(g(x+h))&amp;amp;=f(a+k) \\ &amp;amp;=f(g(x))+[f&amp;#x27;(g(x))+\Delta f(g(x))][g&amp;#x27;(x)+\Delta g(x)]h \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/82e67cf24d9eb3dad58e7d30cd89ba1c19e367fb.png" style="height: 45px;" /> <p>Subtracting <img alt="f(g(x))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/92cb7139e348ea05a69782b2cf7221bae86a2b03.png" style="height: 18px;" /> from both sides and dividing by <em>h</em> (which is legal since <em>h</em> is not zero - just very small) we get:</p> <img alt="$\frac{f(g(x+h))-f(g(x))}{h}=[f&amp;#x27;(g(x))+\Delta f(g(x))][g&amp;#x27;(x)+\Delta g(x)]$" class="align-center" src="https://eli.thegreenplace.net/images/math/bfdfef3d46b471aa5d6803c3c5a6b5e26ffe3b37.png" style="height: 39px;" /> <p>Apply a limit to both sides:</p> <img alt="$\lim_{h \to 0} \frac{f(g(x+h))-f(g(x))}{h}= \lim_{h \to 0} [f&amp;#x27;(g(x))+\Delta f(g(x))][g&amp;#x27;(x)+\Delta g(x)]$" class="align-center"
src="https://eli.thegreenplace.net/images/math/0f5c316fcc2877f78b8a739898a31120471dd401.png" style="height: 39px;" /> <p>Now recall that both <img alt="\Delta f(g(x))" class="valign-m4" src="https://eli.thegreenplace.net/images/math/6752886c71e1a95dc360dd4e5ea10dd0b6f76e84.png" style="height: 18px;" /> and <img alt="\Delta g(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ba109afb4d6264ec2d39fe025fcc5a1dbc58637f.png" style="height: 18px;" /> go to 0 when <em>h</em> goes to 0. Taking this into account, we get:</p> <img alt="$\lim_{h \to 0} \frac{f(g(x+h))-f(g(x))}{h}= f&amp;#x27;(g(x))g&amp;#x27;(x)$" class="align-center" src="https://eli.thegreenplace.net/images/math/3954d8d23c8fb53d4cd1732d19939d650ef830ae.png" style="height: 39px;" /> <p><em>Q.E.D.</em></p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>Here, as in the rest of the post, I'm being careless with the usage of <img alt="\lim" class="valign-0" src="https://eli.thegreenplace.net/images/math/6f5c7776306147fe3be3e4b8547a23c62eafddf4.png" style="height: 13px;" />, sometimes leaving its existence to be implicit. 
In general, wherever <em>h</em> appears in a formula we know there's a <img alt="\lim_{h \to 0}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0f10af054e5ddc3b9603098fec294e0247190efa.png" style="height: 17px;" /> there, whether explicitly or not.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id7" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>An alternative way to think about it is: suppose the functions <em>f</em> and <em>g</em> are linear: <img alt="f(x)=ax+b" class="valign-m4" src="https://eli.thegreenplace.net/images/math/a85393d5068f5c4bc36ff7efed535a8f1a686848.png" style="height: 18px;" /> and <img alt="g(x)=cx+d" class="valign-m4" src="https://eli.thegreenplace.net/images/math/6d712cb582caa0e48a2b029ea4ae29a3e5e40f27.png" style="height: 18px;" />. Then the chain rule is trivially true. But now recall what the derivative is. The derivative at some point <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> is the best linear approximation for the function at that point. 
Therefore the chain rule is true for any pair of differentiable functions - even when the functions are not linear, we approximate their rate of change in an infinitesimal area around <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> with a linear function.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>The condition for <em>f</em> being differentiable at <em>a</em> is stronger than simply saying that all partial derivatives exist at <em>a</em>, but I won't spend more time on this subtlety here.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id9" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>As an exercise, verify that the matrix dimensions of <img alt="Df" class="valign-m4" src="https://eli.thegreenplace.net/images/math/5c6bf530660cba6530e83a86f0ed49fe0821d179.png" style="height: 16px;" /> and <object class="valign-m4" data="https://eli.thegreenplace.net/images/math/38b655437da0880bd70168fcbadb50ebdbf46ca5.svg" style="height: 16px;" type="image/svg+xml">Dg</object> make this multiplication valid.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id5"></a></td><td>It shouldn't be surprising that we get here, since the definition of the directional derivative as the gradient <a class="reference external" href="http://eli.thegreenplace.net/2016/understanding-gradient-descent">was derived</a> using the multivariate chain rule.</td></tr> </tbody> </table> </div> Linear
regression2016-08-06T05:28:00-07:002016-08-06T05:28:00-07:00Eli Benderskytag:eli.thegreenplace.net,2016-08-06:/2016/linear-regression/<p>Linear regression is one of the most basic, and yet most useful approaches for predicting a single quantitative (real-valued) variable given any number of real-valued predictors. This article presents the basics of linear regression for the &quot;simple&quot; (single-variable) case, as well as for the more general multivariate case. <a class="reference external" href="https://github.com/eliben/deep-learning-samples/tree/master/linear-regression">Companion code …</a></p><p>Linear regression is one of the most basic, and yet most useful approaches for predicting a single quantitative (real-valued) variable given any number of real-valued predictors. This article presents the basics of linear regression for the &quot;simple&quot; (single-variable) case, as well as for the more general multivariate case. <a class="reference external" href="https://github.com/eliben/deep-learning-samples/tree/master/linear-regression">Companion code in Python</a> implements the techniques described in the article on simulated and realistic data sets. The code is self-contained, using only Numpy as a dependency.</p> <div class="section" id="simple-linear-regression"> <h2>Simple linear regression</h2> <p>The most basic kind of regression problem has a single <em>predictor</em> (the input) and a single outcome. 
Given a list of input values <img alt="x_i" class="valign-m3" src="https://eli.thegreenplace.net/images/math/34e03e6559b14df9fe5a97bbd2ed10109dfebbd3.png" style="height: 11px;" /> and corresponding output values <img alt="y_i" class="valign-m4" src="https://eli.thegreenplace.net/images/math/35c2ac2f82d0ff8f9011b596ed7e54bfcc55f471.png" style="height: 12px;" />, we have to find parameters <em>m</em> and <em>b</em> such that the linear function:</p> <img alt="$\hat{y}(x) = mx + b$" class="align-center" src="https://eli.thegreenplace.net/images/math/2dabbcda3b1953b08211f7e334698366d647d697.png" style="height: 18px;" /> <p>Is &quot;as close as possible&quot; to the observed outcome <em>y</em>. More concretely, suppose we get this data <a class="footnote-reference" href="#id6" id="id1"></a>:</p> <img alt="Linear regression input data" class="align-center" src="https://eli.thegreenplace.net/images/2016/linreg-data.png" /> <p>We have to find a slope <em>m</em> and intercept <em>b</em> for a line that approximates this data as well as possible. We evaluate how well some pair of <em>m</em> and <em>b</em> approximates the data by defining a &quot;cost function&quot;. 
For linear regression, a good cost function to use is the <a class="reference external" href="https://en.wikipedia.org/wiki/Mean_squared_error">Mean Square Error (MSE)</a> <a class="footnote-reference" href="#id7" id="id2"></a>:</p> <img alt="$\operatorname{MSE}(m, b)=\frac{1}{n}\sum_{i=1}^n(\hat{y_i} - y_i)^2$" class="align-center" src="https://eli.thegreenplace.net/images/math/e4b7b4ce3abd90f20144e6ab468b7870cedf3b07.png" style="height: 50px;" /> <p>Expanding <img alt="\hat{y_i}=m{x_i}+b" class="valign-m4" src="https://eli.thegreenplace.net/images/math/daecd48b7bb0a06ddd4326da5b87ee14fddaeb8e.png" style="height: 17px;" />, we get:</p> <img alt="$\operatorname{MSE}(m, b)=\frac{1}{n}\sum_{i=1}^n(m{x_i} + b - y_i)^2$" class="align-center" src="https://eli.thegreenplace.net/images/math/3de1df776434b29620488aef327a9204757bc493.png" style="height: 50px;" /> <p>Let's turn this into Python code (<a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/linear-regression/simple_linear_regression.py">link to the full code sample</a>):</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">compute_cost</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">m</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;Compute the MSE cost of a prediction based on m, b.</span>
<span class="sd">    x: inputs vector</span>
<span class="sd">    y: observed outputs vector</span>
<span class="sd">    m, b: regression parameters</span>
<span class="sd">    Returns: a scalar cost.</span>
<span class="sd">    &quot;&quot;&quot;</span>
    <span class="n">yhat</span> <span class="o">=</span> <span class="n">m</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="n">b</span>
    <span class="n">diff</span> <span class="o">=</span> <span class="n">yhat</span> <span class="o">-</span> <span class="n">y</span>
    <span class="c1"># Vectorized computation using a dot product to compute sum of squares.</span>
    <span class="n">cost</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">diff</span><span class="o">.</span><span class="n">T</span><span class="p">,</span> <span class="n">diff</span><span class="p">)</span> <span class="o">/</span> <span class="nb">float</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
    <span class="c1"># Cost is a 1x1 matrix, we need a scalar.</span>
    <span class="k">return</span> <span class="n">cost</span><span class="o">.</span><span class="n">flat</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
</pre></div> <p>Now we're faced with a classical optimization problem: we have some parameters (<em>m</em> and <em>b</em>) we can tweak, and some cost function <img alt="J(m, b)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d61807c64b6ab8087a11167224df4b5f818aeae3.png" style="height: 18px;" /> we want to minimize. The topic of mathematical optimization is vast, but what ends up working very well for machine learning is a fairly simple algorithm called <em>gradient descent</em>.</p> <p>Imagine plotting <img alt="J(m, b)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d61807c64b6ab8087a11167224df4b5f818aeae3.png" style="height: 18px;" /> as a 3-dimensional surface, and picking some random point on it. Our goal is to find the lowest point on the surface, but we have no idea where that is. A reasonable guess is to move a bit &quot;downwards&quot; from our current location, and then repeat.</p> <p>&quot;Downwards&quot; is exactly what &quot;gradient descent&quot; means.
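</p>

<p>To build some intuition for this surface, here is a quick sanity check (a sketch of mine, not from the article; it restates the cost for plain 1-D NumPy arrays, where <em>np.dot</em> already yields a scalar): for data generated exactly on a line, the cost shrinks to zero as <em>m</em> approaches the true slope:</p>

```python
import numpy as np

def compute_cost(x, y, m, b):
    # MSE cost as above, restated for 1-D arrays (np.dot then yields a scalar).
    diff = m * x + b - y
    return np.dot(diff, diff) / float(x.shape[0])

x = np.linspace(0, 10, 50)
y = 2.5 * x + 1.0  # points lying exactly on the line y = 2.5x + 1

# Walking m toward the true slope strictly decreases the cost...
costs = [compute_cost(x, y, m, 1.0) for m in (0.0, 1.0, 2.0, 2.5)]
assert costs[0] > costs[1] > costs[2] > costs[3]
# ...and at the true parameters the fit is perfect.
assert np.isclose(costs[-1], 0.0)
```

<p>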
We make a small change to our location (defined by <em>m</em> and <em>b</em>) in the direction in which <img alt="J(m, b)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d61807c64b6ab8087a11167224df4b5f818aeae3.png" style="height: 18px;" /> decreases most - the direction opposite to the gradient <a class="footnote-reference" href="#id8" id="id3"></a>. We then repeat this process until we reach a minimum, hopefully global. In fact, since the linear regression cost function is <em>convex</em>, we will find the global minimum this way. But in the general case this is not guaranteed, and many sophisticated extensions of gradient descent exist that try to avoid local minima and maximize the chance of finding a global one.</p> <p>Back to our function, <img alt="\operatorname{MSE}(m, b)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e42329899ba53adedf0a7884b1844dba4f01bdee.png" style="height: 18px;" />. The gradient is defined as the vector:</p> <img alt="$\nabla \operatorname{MSE}=\left \langle \frac{\partial \operatorname{MSE}}{\partial m}, \frac{\partial \operatorname{MSE}}{\partial b} \right \rangle$" class="align-center" src="https://eli.thegreenplace.net/images/math/50b0404ea5a8f76da73caae5b8109dd384dbd18e.png" style="height: 43px;" /> <p>To find it, we have to compute the partial derivatives of MSE w.r.t.
the learning parameters <em>m</em> and <em>b</em>:</p> <img alt="\begin{align*} \frac{\partial \operatorname{MSE}}{\partial m}&amp;amp;=\frac{2}{n}\sum_{i=i}^n(m{x_i}+b-y_i)x_i\\ \frac{\partial \operatorname{MSE}}{\partial b}&amp;amp;=\frac{2}{n}\sum_{i=i}^n(m{x_i}+b-y_i) \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/dbd383b0d7ee194a417b88ad117b451531758fe7.png" style="height: 108px;" /> <p>And then update <em>m</em> and <em>b</em> in each step of the learning with:</p> <img alt="\begin{align*} m &amp;amp;= m-\eta \frac{\partial \operatorname{MSE}}{\partial m} \\ b &amp;amp;= b-\eta \frac{\partial \operatorname{MSE}}{\partial b} \\ \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/b0c7ff699fc61836051968db56224e6470b56d3c.png" style="height: 81px;" /> <p>Where <img alt="\eta" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2899aeb886ad0fa72652bffd5511e452aaf084ab.png" style="height: 12px;" /> is a customizable &quot;learning rate&quot;, a hyperparameter. Here is the gradient descent loop in Python. 
Note that we examine the whole data set in every step; for much larger data sets, SGD (Stochastic Gradient Descent) with some reasonable mini-batch would make more sense, but for simple linear regression problems the data size is rarely very big.</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">gradient_descent</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">nsteps</span><span class="p">,</span> <span class="n">learning_rate</span><span class="o">=</span><span class="mf">0.1</span><span class="p">):</span> <span class="sd">&quot;&quot;&quot;Runs gradient descent optimization to fit a line y^ = x * m + b.</span> <span class="sd"> x, y: input data and observed outputs.</span> <span class="sd"> nsteps: how many steps to run the optimization for.</span> <span class="sd"> learning_rate: learning rate of gradient descent.</span> <span class="sd"> Yields &#39;nsteps + 1&#39; triplets of (m, b, cost) where m, b are the fit</span> <span class="sd"> parameters for the given step, and cost is their cost vs the real y.</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="n">n</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="c1"># Start with m and b initialized to 0s for the first try.</span> <span class="n">m</span><span class="p">,</span> <span class="n">b</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span> <span class="k">yield</span> <span class="n">m</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">compute_cost</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">m</span><span class="p">,</span> 
<span class="n">b</span><span class="p">)</span> <span class="k">for</span> <span class="n">step</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">nsteps</span><span class="p">):</span> <span class="n">yhat</span> <span class="o">=</span> <span class="n">m</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="n">b</span> <span class="n">diff</span> <span class="o">=</span> <span class="n">yhat</span> <span class="o">-</span> <span class="n">y</span> <span class="n">dm</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="p">(</span><span class="n">diff</span> <span class="o">*</span> <span class="n">x</span><span class="p">)</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">/</span> <span class="n">n</span> <span class="n">db</span> <span class="o">=</span> <span class="n">learning_rate</span> <span class="o">*</span> <span class="n">diff</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span> <span class="o">*</span> <span class="mi">2</span> <span class="o">/</span> <span class="n">n</span> <span class="n">m</span> <span class="o">-=</span> <span class="n">dm</span> <span class="n">b</span> <span class="o">-=</span> <span class="n">db</span> <span class="k">yield</span> <span class="n">m</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">compute_cost</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">m</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> </pre></div> <p>After running this for 30 steps, the gradient converges and the parameters barely change. 
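</p> <p>To tie the pieces together, here is a minimal end-to-end sketch (an addition, not from the original post) that generates fake data with the parameters mentioned in the footnote - slope 2.25, intercept 6, Gaussian noise with standard deviation 1.5 - and runs the same update rule in compact form. The data-generation details and the learning rate are my assumptions; the rate is smaller than 0.1 because this hypothetical <em>x</em> spans [0, 10]:</p>

```python
import numpy as np

# Hypothetical data matching the footnote: slope 2.25, intercept 6,
# Gaussian noise with standard deviation 1.5.
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 200)
y = 2.25 * x + 6 + rng.normal(0, 1.5, 200)

n = x.shape[0]
m, b = 0.0, 0.0
learning_rate = 0.01  # tuned for this x range; 0.1 would diverge here
for step in range(5000):
    diff = m * x + b - y
    # The partial derivatives of MSE w.r.t. m and b, as derived above.
    m -= learning_rate * 2 * (diff * x).sum() / n
    b -= learning_rate * 2 * diff.sum() / n
```

<p>Running this yields <em>m</em> and <em>b</em> close to the true 2.25 and 6, up to the noise in the data.</p> <p>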
Here's a 3D plot of the cost as a function of the regression parameters, along with a contour plot of the same function. It's easy to see this function is convex, as expected. This makes finding the global minimum simple, since no matter where we start, the gradient will lead us directly to it.</p> <p>To help visualize this, I marked the cost for each successive training step on the contour plot - you can see how the algorithm relentlessly converges to the minimum.</p> <img alt="Linear regression cost and contour" class="align-center" src="https://eli.thegreenplace.net/images/2016/linreg-cost-contour.png" /> <p>The final parameters learned by the regression are 2.2775 for <em>m</em> and 6.0028 for <em>b</em>, which are very close to the actual parameters I used to generate this fake data.</p> <p>Here's a visualization that shows how the regression line improves progressively during learning:</p> <img alt="Regression fit visualization" class="align-center" src="https://eli.thegreenplace.net/images/2016/regressionfit.gif" /> </div> <div class="section" id="evaluating-how-good-the-fit-is"> <h2>Evaluating how good the fit is</h2> <p>In statistics, there are many ways to evaluate how good a &quot;fit&quot; some model is on the given data. One of the most popular ones is the <em>r-squared</em> test (&quot;coefficient of determination&quot;).
It measures the proportion of the total variance in the output (<em>y</em>) that can be explained by the variation in <em>x</em>:</p> <img alt="$R^2 = 1 - \frac{\sum_{i=1}^n (y_i - (m{x_i} + b))^2}{n\cdot var(y)}$" class="align-center" src="https://eli.thegreenplace.net/images/math/2c989c7345d6901a0cf7c17f9b08762ef27c5148.png" style="height: 43px;" /> <p>This is trivial to translate to code:</p> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">compute_rsquared</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">m</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span> <span class="n">yhat</span> <span class="o">=</span> <span class="n">m</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="n">b</span> <span class="n">diff</span> <span class="o">=</span> <span class="n">yhat</span> <span class="o">-</span> <span class="n">y</span> <span class="n">SE_line</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">diff</span><span class="o">.</span><span class="n">T</span><span class="p">,</span> <span class="n">diff</span><span class="p">)</span> <span class="n">SE_y</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">y</span><span class="p">)</span> <span class="o">*</span> <span class="n">y</span><span class="o">.</span><span class="n">var</span><span class="p">()</span> <span class="k">return</span> <span class="mi">1</span> <span class="o">-</span> <span class="n">SE_line</span> <span class="o">/</span> <span class="n">SE_y</span> </pre></div> <p>For our regression results, I get <em>r-squared</em> of 0.76, which isn't too bad. Note that the data is very jittery, so it's natural the regression cannot explain all the variance. 
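</p> <p>A handy sanity check (an addition, not from the original post): for the least-squares fit of a simple linear regression, <em>r-squared</em> equals the squared Pearson correlation coefficient between <em>x</em> and <em>y</em>. A sketch assuming numpy and hypothetical data generated like the post's fake data set:</p>

```python
import numpy as np

def compute_rsquared(x, y, m, b):
    # Same computation as above: 1 - SE_line / SE_y, with y.var() using
    # the population (ddof=0) variance.
    diff = m * x + b - y
    return 1 - np.dot(diff, diff) / (len(y) * y.var())

# Hypothetical data: slope 2.25, intercept 6, noise std 1.5.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 2.25 * x + 6 + rng.normal(0, 1.5, 500)

m, b = np.polyfit(x, y, 1)            # least-squares fit
r2 = compute_rsquared(x, y, m, b)
corr2 = np.corrcoef(x, y)[0, 1] ** 2  # equals r2 for the least-squares fit
```

<p>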
As an interesting exercise, try to modify the code that generates the data with different standard deviations for the random noise and see the effect on <em>r-squared</em>.</p> </div> <div class="section" id="an-analytical-solution-to-simple-linear-regression"> <h2>An analytical solution to simple linear regression</h2> <p>Using the equations for the partial derivatives of MSE (shown above) it's possible to find the minimum analytically, without having to resort to a computational procedure (gradient descent). We compare the derivatives to zero:</p> <img alt="\begin{align*} \frac{\partial \operatorname{MSE}}{\partial m}&amp;amp;=\frac{2}{n}\sum_{i=i}^n(m{x_i}+b-y_i)x_i = 0\\ \frac{\partial \operatorname{MSE}}{\partial b}&amp;amp;=\frac{2}{n}\sum_{i=i}^n(m{x_i}+b-y_i) = 0 \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/aef02f077919896478d0456619f934dcc5809142.png" style="height: 108px;" /> <p>And solve for <em>m</em> and <em>b</em>. To make the equations easier to follow, let's introduce a bit of notation. <img alt="\bar{x}" class="valign-0" src="https://eli.thegreenplace.net/images/math/8eebe76c6f552df3f8b9480d5544fe47b1028322.png" style="height: 11px;" /> is the mean value of <em>x</em> across all samples. Similarly <img alt="\bar{y}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/1e3bffc7f71c01acbc2c12e015be3086a06f824d.png" style="height: 15px;" /> is the mean value of <em>y</em>. So the sum <img alt="\sum_{i=1}^n x_i" class="valign-m6" src="https://eli.thegreenplace.net/images/math/c42eb1b96dfa184fee1bc0f3a4b713b9c38b2a1a.png" style="height: 20px;" /> is actually <img alt="n\bar{x}" class="valign-0" src="https://eli.thegreenplace.net/images/math/ea6008aefff0c7d79044287c44e890b1fba97c22.png" style="height: 11px;" />. 
Now let's take the second equation from above and see how to simplify it:</p> <img alt="\begin{align*} \frac{\partial \operatorname{MSE}}{\partial b} &amp;amp;= \frac{2}{n}\sum_{i=i}^n(m{x_i}+b-y_i) \\ &amp;amp;= \frac{2}{n}(mn\bar{x}+nb-n\bar{y}) \\ &amp;amp;= 2m\bar{x} + 2b - 2\bar{y} = 0 \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/c97c0c9ca8a66d54974fc914fcf929085dc63879.png" style="height: 119px;" /> <p>Similarly, for the partial derivative by <em>m</em> we can reach:</p> <img alt="$\frac{\partial \operatorname{MSE}}{\partial m}= 2m\overline{x^2} + 2b\bar{x} - 2\overline{xy} = 0$" class="align-center" src="https://eli.thegreenplace.net/images/math/d9545273e11c9e179794f943e2c972bf62c38113.png" style="height: 38px;" /> <p>In these equations, all quantities except <em>m</em> and <em>b</em> are constant. Solving them for the unknowns <em>m</em> and <em>b</em>, we get <a class="footnote-reference" href="#id9" id="id4"></a>:</p> <img alt="$m = \frac{\bar{x}\bar{y} - \overline{xy}}{\bar{x}^2 - \overline{x^2}} \qquad b = \bar{y} - \bar{x}\frac{\bar{x}\bar{y} - \overline{xy}}{\bar{x}^2 - \overline{x^2}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/becd671e8c032d0568e33b986033c181ac5c133b.png" style="height: 38px;" /> <p>If we plug the data values we have for <em>x</em> and <em>y</em> in these equations, we get 2.2777 for <em>m</em> and 6.0103 for <em>b</em> - almost exactly the values we obtained with regression <a class="footnote-reference" href="#id10" id="id5"></a>.</p> <p>Remember that by comparing the partial derivatives to zero we find a <em>critical point</em>, which is not necessarily a minimum. 
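</p> <p>As a quick aside, the closed-form formulas for <em>m</em> and <em>b</em> above are easy to verify in code. A sketch assuming numpy and hypothetical data (the bars in the formulas are sample means):</p>

```python
import numpy as np

# Hypothetical data with known parameters (slope 2.25, intercept 6).
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 1000)
y = 2.25 * x + 6 + rng.normal(0, 1.5, 1000)

# Sample means corresponding to the barred quantities in the formulas.
xbar, ybar = x.mean(), y.mean()
xybar, x2bar = (x * y).mean(), (x * x).mean()

m = (xbar * ybar - xybar) / (xbar ** 2 - x2bar)
b = ybar - xbar * m
```

<p>The result agrees with numpy's own least-squares fit (<tt>np.polyfit(x, y, 1)</tt>) to within floating-point precision.</p> <p>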
We can use the <a class="reference external" href="https://en.wikipedia.org/wiki/Second_partial_derivative_test">second derivative test</a> to find what kind of critical point that is, by computing the Hessian of the cost:</p> <img alt="$H(m, b) = \begin{pmatrix} \operatorname{MSE}_{mm}(x, y) &amp;amp; \operatorname{MSE}_{mb}(x, y) \\ \operatorname{MSE}_{bm}(x, y) &amp;amp; \operatorname{MSE}_{bb}(x, y) \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/39c2e86ae1437d3b19bc8e77b66501486550d3bc.png" style="height: 43px;" /> <p>Plugging the numbers and running the test, we can indeed verify that the critical point is a minimum.</p> </div> <div class="section" id="multiple-linear-regression"> <h2>Multiple linear regression</h2> <p>The good thing about simple regression is that it's easy to visualize. The model is trained using just two parameters, and visualizing the cost as a function of these two parameters is possible since we get a 3D plot. Anything beyond that becomes increasingly more difficult to visualize.</p> <p>In simple linear regression, every <em>x</em> is just a number; so is every <em>y</em>. In multiple linear regression this is no longer so, and each data point <em>x</em> is a vector. The model parameters can also be represented by the vector <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />. To avoid confusion of indices and subscripts, let's agree that we use subscripts to denote components of vectors, while parenthesized superscripts are used to denote different samples. 
So <img alt="x_1^{(6)}" class="valign-m6" src="https://eli.thegreenplace.net/images/math/d01999f5014c6aea058368231c0d2b958fa8a89e.png" style="height: 26px;" /> is the second component of sample 6.</p> <p>Our goal is to find the vector <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> such that the linear function:</p> <img alt="$\hat{y}(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n$" class="align-center" src="https://eli.thegreenplace.net/images/math/ae682f9fda97c28c8e100c87aecad635c7c1d96c.png" style="height: 18px;" /> <p>Is as close as possible to the actual <em>y</em> across all samples. Since working with vectors is easier for this problem, we define <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> to always be equal to 1, so that the first term in the equation above denotes the intercept. Expressing the regression coefficients as a vector:</p> <img alt="$\begin{pmatrix} \theta_0\\ \theta_1\\ ...\\ \theta_n \end{pmatrix}\in\mathbb{R}^{n+1}$" class="align-center" src="https://eli.thegreenplace.net/images/math/b16fd3d2b3041f13cb70199837a7c02c756078c7.png" style="height: 86px;" /> <p>We can now rewrite <img alt="\hat{y}(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/11533fb1b0218620907f5859e6e22aeb65c12cd8.png" style="height: 18px;" /> as:</p> <img alt="$\hat{y}(x) = \theta^T x$" class="align-center" src="https://eli.thegreenplace.net/images/math/8156e2dc4e654f77a8664180c168829f6b4cdb0b.png" style="height: 21px;" /> <p>Where both <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> and <em>x</em> are column vectors with <em>n+1</em> elements, as shown above. 
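</p> <p>In numpy this convention is straightforward to express: stack the samples as rows of a matrix <em>X</em> whose first column is all 1s, and all predictions are computed at once. A small sketch with made-up numbers (the data and parameters here are hypothetical):</p>

```python
import numpy as np

# Hypothetical data: 3 samples with 2 features each.
samples = np.array([[0.5, 1.2],
                    [1.0, 0.3],
                    [2.0, 2.0]])
k = samples.shape[0]

# Prepend x_0 = 1 to every sample so that theta[0] acts as the intercept.
X = np.column_stack([np.ones(k), samples])

theta = np.array([6.0, 2.25, -1.0])  # made-up regression parameters
yhat = X @ theta                     # computes theta^T x for every sample
```

<p>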
The mean square error (over <em>k</em> samples) now becomes:</p> <img alt="$\operatorname{MSE}=\frac{1}{k}\sum_{i=1}^k(\hat{y}(x^{(i)}) - y^{(i)})^2$" class="align-center" src="https://eli.thegreenplace.net/images/math/1e0a7c0c85c1827b992671b88e89ba052d37a204.png" style="height: 54px;" /> <p>Now we have to find the partial derivative of this cost by each <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />. Using the chain rule, it's easy to see that:</p> <img alt="$\frac{\partial \operatorname{MSE}}{\partial \theta_j} = \frac{2}{k}\sum_{i=1}^k(\hat{y}(x^{(i)}) - y^{(i)})x_j^{(i)}$" class="align-center" src="https://eli.thegreenplace.net/images/math/4c2fcfed81c294ef7313198debe3801f50bea92a.png" style="height: 54px;" /> <p>And use this to update the parameters in every training step. The code is actually not much different from the simple regression case; here is a <a class="reference external" href="https://github.com/eliben/deep-learning-samples/blob/master/linear-regression/multiple_linear_regression.py">well documented, completely worked out example</a>. The code takes a realistic dataset from the <a class="reference external" href="http://archive.ics.uci.edu/ml/">UCI machine learning repository</a> with 4 predictors and a single outcome and builds a regression model. 4 predictors plus one intercept give us a 5-dimensional <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />, which is utterly impossible to visualize, so we have to stick to math in order to analyze it.</p> </div> <div class="section" id="an-analytical-solution-to-multiple-linear-regression"> <h2>An analytical solution to multiple linear regression</h2> <p>Multiple linear regression also has an analytical solution. 
If we compute the derivative of the cost by each <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />, we'll end up with <em>n+1</em> equations with the same number of variables, which we can solve analytically.</p> <p>An elegant matrix formula that computes <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> from <em>X</em> and <em>y</em> is called the Normal Equation:</p> <img alt="$\theta=(X^TX)^{-1}X^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/20baabd9d33dcd26003bc44c7d81ba39e1ad4caa.png" style="height: 21px;" /> <p>I've written about <a class="reference external" href="http://eli.thegreenplace.net/2014/derivation-of-the-normal-equation-for-linear-regression">deriving the normal equation</a> previously, so I won't spend more time on it. The accompanying code computes <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> using the normal equation and compares the result with the <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> obtained from gradient descent.</p> <p>As an exercise, you can double-check that the analytical solution for simple linear regression (formulae for <em>m</em> and <em>b</em>) is just a special case of applying the normal equation in two dimensions.</p> <p>You may wonder: when should we use the analytical solution, and when is gradient descent better? In general, whenever we can use the analytical solution - we should. But it's not always feasible, computationally.</p> <p>Consider a data set with <em>k</em> samples and <em>n</em> features.
Then <em>X</em> is a <em>k x n</em> matrix, and hence <img alt="X^TX" class="valign-0" src="https://eli.thegreenplace.net/images/math/5c817c84ec1f83b23494df6125edd091a7c413dd.png" style="height: 15px;" /> is an <em>n x n</em> matrix. Inverting a matrix is an <img alt="O(n^3)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/62a87bfd600dc05059675e34b881c78648f53401.png" style="height: 19px;" /> operation, so for large <em>n</em>, finding <img alt="(X^TX)^{-1}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/57f592cee6ceac659262d97e61c64f9ca405d7f1.png" style="height: 19px;" /> can take quite a bit of time. Moreover, keeping <img alt="X^TX" class="valign-0" src="https://eli.thegreenplace.net/images/math/5c817c84ec1f83b23494df6125edd091a7c413dd.png" style="height: 15px;" /> in memory can be computationally infeasible if <img alt="X" class="valign-0" src="https://eli.thegreenplace.net/images/math/c032adc1ff629c9b66f22749ad667e6beadf144b.png" style="height: 12px;" /> is huge and sparse, but <img alt="X^TX" class="valign-0" src="https://eli.thegreenplace.net/images/math/5c817c84ec1f83b23494df6125edd091a7c413dd.png" style="height: 15px;" /> is dense. In all these cases, iterative gradient descent is a more practical approach.</p> <p>In addition, the moment we deviate from the linear regression a bit, such as adding nonlinear terms, regularization, or some other model enhancement, the analytical solutions no longer apply.
Gradient descent keeps working just the same, however, as long as we know how to compute the gradient of the new cost function.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>This data was generated by using a slope of 2.25, intercept of 6 and added Gaussian noise with a standard deviation of 1.5</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id7" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>Some resources use SSE - the Squared Sum Error, which is just the MSE without the averaging. Yet others have <em>2n</em> in the denominator to make the gradient derivation cleaner. None of this really matters in practice. When finding the minimum analytically, we compare derivatives to zero so constant factors cancel out. 
When running gradient descent, all constant factors are subsumed into the learning rate, which is arbitrary.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>For a mathematical justification for <em>why</em> the gradient leads us in the direction of most change, see <a class="reference external" href="http://eli.thegreenplace.net/2016/understanding-gradient-descent">this post</a>.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id9" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>An alternative way I've seen this equation written is to express <em>m</em> as:</td></tr> </tbody> </table> <img alt="\begin{align*} m &amp;amp;= \frac{\sum_{i=1}^n(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^n(x_i-\bar{x})^2} \\ &amp;amp;= \frac{cov(x, y)}{var(x)} \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/53639f1f77080dbe8a6d3a8cd06e08a90de69a8e.png" style="height: 92px;" /> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id5"></a></td><td>Can you figure out why even the analytical solution is a little off from the actual parameters used to generate this data?</td></tr> </tbody> </table> </div> Understanding gradient descent2016-08-05T05:38:00-07:002016-08-05T05:38:00-07:00Eli Benderskytag:eli.thegreenplace.net,2016-08-05:/2016/understanding-gradient-descent/<p>Gradient descent is a standard tool for optimizing complex functions iteratively within a computer program. Its goal is: given some arbitrary function, find a minimum.
For some small subset of functions - those that are <em>convex</em> - there's just a single minimum, which also happens to be global. For most realistic functions …</p><p>Gradient descent is a standard tool for optimizing complex functions iteratively within a computer program. Its goal is: given some arbitrary function, find a minimum. For some small subset of functions - those that are <em>convex</em> - there's just a single minimum, which also happens to be global. For most realistic functions, there may be many minima, and most of them are local. Making sure the optimization finds the &quot;best&quot; minimum and doesn't get stuck in sub-optimal minima is outside the scope of this article. Here we'll just be dealing with the core gradient descent algorithm for finding <em>some</em> minimum from a given starting point.</p> <p>The main premise of gradient descent is: given some current location <em>x</em> in the search space (the domain of the optimized function) we ought to update <em>x</em> for the next step in the direction opposite to the gradient of the function computed at <em>x</em>. But <em>why</em> is this the case? The aim of this article is to explain why, mathematically.</p> <p>This is also the place for a disclaimer: the examples used throughout the article are trivial, low-dimensional, convex functions. We don't really <em>need</em> an algorithmic procedure to find their global minimum - a quick computation would do, or really just eyeballing the function's plot. In reality we will be dealing with non-linear, 1000-dimensional functions where it's utterly impossible to visualize anything, or solve anything analytically. The approach works just the same there, however.</p> <div class="section" id="building-intuition-with-single-variable-functions"> <h2>Building intuition with single-variable functions</h2> <p>The gradient is formally defined for <em>multivariate</em> functions.
However, to start building intuition, it's useful to begin with the two-dimensional case, a single-variable function <img alt="f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e03f4706048fbc6c5a252a85d066adf107fcc1f.png" style="height: 18px;" />.</p> <p>In single-variable functions, the simple derivative plays the role of a gradient. So &quot;gradient descent&quot; would really be &quot;derivative descent&quot;; let's see what that means.</p> <p>As an example, let's take the function <img alt="f(x)=(x-1)^2" class="valign-m4" src="https://eli.thegreenplace.net/images/math/b898d66867ea1e832ab5cda94453ab3a69bae865.png" style="height: 19px;" />. Here's its plot, in red:</p> <img alt="Plot of parabola with tangent lines" class="align-center" src="https://eli.thegreenplace.net/images/2016/plot-parabola-with-tangents.png" /> <p>I marked a couple of points on the plot, in blue, and drew the tangents to the function at these points. Remember, our goal is to find the minimum of the function. To do that, we'll start with a guess for an <em>x</em>, and continuously update it to improve our guess based on some computation. How do we know how to update <em>x</em>? The update has only two possible directions: increase <em>x</em> or decrease <em>x</em>. We have to decide which of the two directions to take.</p> <p>We do that based on the derivative of <img alt="f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e03f4706048fbc6c5a252a85d066adf107fcc1f.png" style="height: 18px;" />.
The derivative at some point <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> is defined as the limit <a class="footnote-reference" href="#id5" id="id1"></a>:</p> <img alt="$\frac{d}{dx}f(x_0)=\lim_{h \to 0}\frac{f(x_0+h)-f(x_0)}{h}$" class="align-center" src="https://eli.thegreenplace.net/images/math/bfd7f38f59e2ff0d548c19f8f780605b099ecaf7.png" style="height: 39px;" /> <p>Intuitively, this tells us what happens to <img alt="f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e03f4706048fbc6c5a252a85d066adf107fcc1f.png" style="height: 18px;" /> when we add a very small value to <em>x</em>. For example, in the plot above, at <img alt="x_0=3" class="valign-m3" src="https://eli.thegreenplace.net/images/math/5fa44ff4e2c914452bf56041b4ef99ceb61592f9.png" style="height: 15px;" /> we have:</p> <img alt="\begin{align*} \frac{d}{dx}f(3)&amp;amp;=\lim_{h \to 0}\frac{f(3+h)-f(3)}{h} \\ &amp;amp;=\lim_{h \to 0}\frac{(3+h-1)^2-(3-1)^2}{h} \\ &amp;amp;=\lim_{h \to 0}\frac{h^2+4h}{h} \\ &amp;amp;=\lim_{h \to 0}h+4=4 \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/e572beffc8415b4ba4c8c9419105863e3ce2082f.png" style="height: 168px;" /> <p>This means that the value of <img alt="\frac{df}{dx}" class="valign-m6" src="https://eli.thegreenplace.net/images/math/45e7d07281bf1883224069f5b8d98a4bd6b21693.png" style="height: 23px;" /> at <img alt="x_0=3" class="valign-m3" src="https://eli.thegreenplace.net/images/math/5fa44ff4e2c914452bf56041b4ef99ceb61592f9.png" style="height: 15px;" /> - the <em>slope</em> of the tangent to <em>f</em> there - is 4; for a very small positive change <em>h</em> to <em>x</em> at that point, the value of <img alt="f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e03f4706048fbc6c5a252a85d066adf107fcc1f.png" style="height: 18px;" /> will increase by <em>4h</em>.
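</p> <p>As a quick sanity check (an addition, not from the original post), the limit definition can be approximated numerically with a small finite difference; the step size <tt>h</tt> here is an arbitrary choice:</p>

```python
def f(x):
    return (x - 1) ** 2

def numeric_derivative(f, x0, h=1e-6):
    # Central-difference approximation of the limit definition of the
    # derivative, evaluated at x0.
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

d3 = numeric_derivative(f, 3.0)  # very close to 4, matching the limit above
```

<p>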
Therefore, to get closer to the minimum of <img alt="f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e03f4706048fbc6c5a252a85d066adf107fcc1f.png" style="height: 18px;" /> we should rather <em>decrease</em> <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> a bit.</p> <p>Let's take another example point, <img alt="x_0=-1" class="valign-m3" src="https://eli.thegreenplace.net/images/math/c84eef20ea61cf41b13fd1a157968eba20823c8e.png" style="height: 15px;" />. At that point, if we add a little bit to <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />, <img alt="f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e03f4706048fbc6c5a252a85d066adf107fcc1f.png" style="height: 18px;" /> will <em>decrease</em> by 4x that little bit. So that's exactly what we should do to get closer to the minimum.</p> <p>It turns out that in both cases, we should nudge <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> in the direction opposite to the derivative at <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />. That's the most basic idea behind gradient descent - the derivative shows us the way to the minimum; or rather, it shows us the way to the maximum and we then go in the opposite direction. 
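</p> <p>In code, this nudging loop is just a few lines (a sketch; the starting point, the number of iterations, and the size of each nudge are arbitrary choices), minimizing f(x) = (x - 1)^2 from the plot above:</p>

```python
def df(x):
    # Derivative of f(x) = (x - 1)^2.
    return 2 * (x - 1)

x0, eta = 3.0, 0.1  # arbitrary starting guess and step-size factor
for _ in range(100):
    x0 -= eta * df(x0)  # nudge x0 opposite the derivative
# x0 is now very close to the true minimum at x = 1
```

<p>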
Given some initial guess <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />, the next guess will be:</p> <img alt="$x_1=x_0-\eta\frac{d}{dx}f(x_0)$" class="align-center" src="https://eli.thegreenplace.net/images/math/d8666c1e2cf8740af45a228730f7c632fc00ed14.png" style="height: 37px;" /> <p>Where <img alt="\eta" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2899aeb886ad0fa72652bffd5511e452aaf084ab.png" style="height: 12px;" /> is what we call a &quot;learning rate&quot;, and is constant for each given update. It's the reason why we don't care much about the magnitude of the derivative at <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />, only its direction. In general, it makes sense to keep the learning rate fairly small so we only make a tiny step at a time. This makes sense mathematically, because the derivative at a point is defined as the rate of change of <img alt="f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e03f4706048fbc6c5a252a85d066adf107fcc1f.png" style="height: 18px;" /> assuming an infinitesimal change in <em>x</em>. For a large change in <em>x</em>, who knows where we'd end up. It's easy to imagine cases where we'll entirely overshoot the minimum by making too large a step <a class="footnote-reference" href="#id6" id="id2"></a>.</p> </div> <div class="section" id="multivariate-functions-and-directional-derivatives"> <h2>Multivariate functions and directional derivatives</h2> <p>With functions of multiple variables, derivatives become more interesting. We can't just say &quot;the derivative points to where the function is increasing&quot;, because... which derivative?</p> <p>Recall the formal definition of the derivative as the limit for a small step <em>h</em>. 
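As a brief detour before generalizing, the single-variable update rule above is easy to try in code. A minimal sketch of mine (the starting point, learning rate and iteration count are arbitrary choices) for f(x) = (x - 1)^2, whose minimum is at x = 1:

```python
# Repeatedly apply the update x <- x - eta * f'(x) to f(x) = (x - 1)**2.
def df(x):
    return 2 * (x - 1)  # derivative of (x - 1)**2

x = 3.0     # initial guess x0
eta = 0.1   # learning rate; an arbitrary small value
for _ in range(100):
    x = x - eta * df(x)

print(x)  # converges toward 1.0, the minimizer
```

Each step multiplies the distance to the minimum by (1 - 2η), so with η = 0.1 the guess closes in on 1 geometrically.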
When our function has many variables, which one should have the step added? One at a time? All at once? In multivariate calculus, we use partial derivatives as building blocks. Let's use a function of two variables - <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> as an example throughout this section, and define the partial derivatives w.r.t. <em>x</em> and <em>y</em> at some point <img alt="(x_0,y_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/f8b63792829adeff8314a72fa87be1a770dfca85.png" style="height: 18px;" />:</p> <img alt="\begin{align*} \frac{\partial }{\partial x}f(x_0,y_0)&amp;amp;=\lim_{h \to 0}\frac{f(x_0+h,y_0)-f(x_0,y_0)}{h} \\ \frac{\partial }{\partial y}f(x_0,y_0)&amp;amp;=\lim_{h \to 0}\frac{f(x_0,y_0+h)-f(x_0,y_0)}{h} \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/b58dd3cada7292828cf79f3ca8653a99fd94c1f9.png" style="height: 87px;" /> <p>When we have a single-variable function <img alt="f(x)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/3e03f4706048fbc6c5a252a85d066adf107fcc1f.png" style="height: 18px;" />, there are really only two directions in which we can move from a given point <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> - left (decrease <em>x</em>) or right (increase <em>x</em>). With two variables, the number of possible directions is <em>infinite</em>, because we pick a direction to move on a 2D plane. Hopefully this immediately brings &quot;vectors&quot; to mind, since vectors are the perfect tool to deal with such problems. 
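As an aside, the two partial-derivative limits defined above are easy to approximate numerically. A small sketch of mine (with an arbitrary step h), using f(x, y) = x^2 + y^2, the example function plotted later in the post:

```python
# Approximate the two partial derivatives of f(x, y) = x**2 + y**2
# by stepping along one axis at a time, as in the limit definitions.
def f(x, y):
    return x ** 2 + y ** 2

def partial_x(f, x0, y0, h=1e-6):
    return (f(x0 + h, y0) - f(x0, y0)) / h

def partial_y(f, x0, y0, h=1e-6):
    return (f(x0, y0 + h) - f(x0, y0)) / h

print(partial_x(f, 1.0, 2.0), partial_y(f, 1.0, 2.0))  # close to 2 and 4
```

The exact partials are 2x and 2y, so at (1, 2) the approximations land near 2 and 4.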
We can represent the change from the point <img alt="(x_0,y_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/f8b63792829adeff8314a72fa87be1a770dfca85.png" style="height: 18px;" /> as the vector <img alt="\vec{v}=\langle a,b \rangle" class="valign-m5" src="https://eli.thegreenplace.net/images/math/4ef7c8a835491ba5ec6dc5f2b94ebff879938a21.png" style="height: 19px;" /> <a class="footnote-reference" href="#id7" id="id3"></a>. The <em>directional derivative</em> of <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> along <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> at <img alt="(x_0,y_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/f8b63792829adeff8314a72fa87be1a770dfca85.png" style="height: 18px;" /> is defined as its rate of change in the direction of the vector at that point. Mathematically, it's defined as:</p> <img alt="$\begin{equation} D_{\vec{v}}f(x_0,y_0)=\lim_{h \to 0}\frac{f(x_0+ah,y_0+bh)-f(x_0,y_0)}{h} \tag{1} \end{equation}$" class="align-center" src="https://eli.thegreenplace.net/images/math/1af5afd7427f744daa0c75b05697b32f21b2f40c.png" style="height: 39px;" /> <p>The partial derivatives w.r.t. <em>x</em> and <em>y</em> can be seen as special cases of this definition. The partial derivative <img alt="\frac{\partial f}{\partial x}" class="valign-m7" src="https://eli.thegreenplace.net/images/math/75a2ab078215106a1084cf5e262e98f32c1cc3b9.png" style="height: 25px;" /> is just the directional derivative in the direction of the <em>x</em> axis. 
In vector-speak, this is the directional derivative for <img alt="\vec{v}=\langle a,b \rangle=\widehat{e_x}=\langle 1,0 \rangle" class="valign-m5" src="https://eli.thegreenplace.net/images/math/36b4fd6cf884fd12b36c605cb6ec7a7c9b4ee65f.png" style="height: 19px;" />, the standard basis vector for <em>x</em>. Just plug <img alt="a=1,b=0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/7feadfc4043894ed6a3de2cced949a91bea9e5b2.png" style="height: 17px;" /> into (1) to see why. Similarly, the partial derivative <img alt="\frac{\partial f}{\partial y}" class="valign-m9" src="https://eli.thegreenplace.net/images/math/5bc3d10d9714f8f7a95791fe29e497cf0ecbe3b0.png" style="height: 27px;" /> is the directional derivative in the direction of the standard basis vector <img alt="\widehat{e_y}=\langle 0,1 \rangle" class="valign-m6" src="https://eli.thegreenplace.net/images/math/3ce4793144c7bfd02d245b81f8bd44a595721196.png" style="height: 20px;" />.</p> </div> <div class="section" id="a-visual-interlude"> <h2>A visual interlude</h2> <p>Functions of two variables <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> are the last frontier for meaningful visualizations, for which we need 3D to plot the value of <img alt="f" class="valign-m4" src="https://eli.thegreenplace.net/images/math/4a0a19218e082a343a1b17e5333409af9d98f0f5.png" style="height: 16px;" /> for each given <em>x</em> and <em>y</em>. Even in 3D, visualizing gradients is significantly harder than in 2D, and yet we have to try since for anything above two variables all hopes of visualization are lost.</p> <p>Here's the function <img alt="f(x,y)=x^2+y^2" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d3eb0fc536d00e84cd63bb5af98b7e2bc01bde4f.png" style="height: 19px;" /> plotted in a small range around zero. 
I drew the standard basis vectors <img alt="\widehat{x}=\widehat{e_x}" class="valign-m3" src="https://eli.thegreenplace.net/images/math/0ea0752aa73540ee1e464a42d5d1b2b9741d3eab.png" style="height: 17px;" /> and <img alt="\widehat{y}=\widehat{e_y}" class="valign-m6" src="https://eli.thegreenplace.net/images/math/c0bf47cb98b1f01e6b47992929694ec9da20f8f7.png" style="height: 20px;" /> <a class="footnote-reference" href="#id8" id="id4"></a> and some combination of them <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" />.</p> <img alt="3D parabola with direction vector markers" class="align-center" src="https://eli.thegreenplace.net/images/2016/plot-3d-parabola.png" /> <p>I also marked the point on <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> where the vectors are based. The goal is to help us keep in mind how the independent variables <em>x</em> and <em>y</em> change, and how that affects <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" />. We change <em>x</em> and <em>y</em> by adding some small vector <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> to their current value. The result is &quot;nudging&quot; <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> in the direction of <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" />. 
Remember our goal for this article - find <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> such that this &quot;nudge&quot; gets us closer to a minimum.</p> </div> <div class="section" id="finding-directional-derivatives-using-the-gradient"> <h2>Finding directional derivatives using the gradient</h2> <p>As we've seen, the derivative of <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> in the direction of <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> is defined as:</p> <img alt="$D_{\vec{v}}f(x_0,y_0)=\lim_{h \to 0}\frac{f(x_0+ah,y_0+bh)-f(x_0,y_0)}{h}$" class="align-center" src="https://eli.thegreenplace.net/images/math/9f2c62d64f016bd77712873294a0f5e64537b1ab.png" style="height: 39px;" /> <p>Looking at the 3D plot above, this is how much the value of <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> changes when we add <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> to the vector <img alt="\langle x_0,y_0 \rangle" class="valign-m5" src="https://eli.thegreenplace.net/images/math/f74aa2c6fda35535931fad69ec339eaef3693913.png" style="height: 19px;" />. But how do we do that? This limit definition doesn't lend itself to easy analytical manipulation for arbitrary functions. 
Sure, we could plug <img alt="(x_0,y_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/f8b63792829adeff8314a72fa87be1a770dfca85.png" style="height: 18px;" /> and <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> in there and do the computation, but it would be nice to have an easier-to-use formula. Luckily, with the help of the gradient of <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> it becomes much easier.</p> <p>The gradient is a vector value we compute from a scalar function. It's defined as:</p> <img alt="$\nabla f=\left \langle \frac{\partial f}{\partial x},\frac{\partial f}{\partial y} \right \rangle$" class="align-center" src="https://eli.thegreenplace.net/images/math/03eab64984be412b6db132c2534bbecc006af47c.png" style="height: 43px;" /> <p>It turns out that given a vector <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" />, the directional derivative <img alt="D_{\vec{v}}f" class="valign-m4" src="https://eli.thegreenplace.net/images/math/03a3931c968b3b6f26e82958785539d74db94293.png" style="height: 16px;" /> can be expressed as the following dot product:</p> <img alt="$D_{\vec{v}}f=(\nabla f) \cdot \vec{v}$" class="align-center" src="https://eli.thegreenplace.net/images/math/49933775272512c4c8686d9f9692c8ea01e1c97d.png" style="height: 18px;" /> <p>If this looks like a mental leap too big to trust, please read the Appendix section at the bottom. Otherwise, feel free to verify that the two are equivalent with a couple of examples. 
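One way to run such a check is numerically. Here's a sketch of mine for f(x, y) = x^2 + y^2, with an arbitrarily chosen point and unit-length direction:

```python
# Compare the limit-definition directional derivative with the
# gradient dot product for f(x, y) = x**2 + y**2.
def f(x, y):
    return x ** 2 + y ** 2

x0, y0 = 1.0, 2.0
a, b = 0.6, 0.8  # a**2 + b**2 == 1, so v = <a, b> is normalized

# Straight from the limit definition (1), with a small finite h.
h = 1e-6
from_limit = (f(x0 + a * h, y0 + b * h) - f(x0, y0)) / h

# Via the gradient: grad f = <2x, 2y>, dotted with v.
from_gradient = 2 * x0 * a + 2 * y0 * b

print(from_limit, from_gradient)  # both close to 4.4
```

Both routes agree (up to the finite-step approximation error), which is exactly what the dot-product formula promises.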
For instance, try to find the derivative in the direction of <img alt="\vec{v}=\langle \frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}} \rangle" class="valign-m11" src="https://eli.thegreenplace.net/images/math/d614069c5beaf6fb858de40fa492a7b523a683d9.png" style="height: 27px;" /> at <img alt="(x_0,y_0)=(-1.5,0.25)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/61355565f13944faf85baec62c5fc1a682b0b5d5.png" style="height: 18px;" />. You should get <img alt="\frac{-2.5}{\sqrt{2}}" class="valign-m11" src="https://eli.thegreenplace.net/images/math/0c22fc563236a48f94882876c68f6edc0c3fb4da.png" style="height: 27px;" /> using both methods.</p> </div> <div class="section" id="direction-of-maximal-change"> <h2>Direction of maximal change</h2> <p>We're almost there! Now that we have a relatively simple way of computing any directional derivative from the partial derivatives of a function, we can figure out which direction to take to get the maximal change in the value of <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" />.</p> <p>We can rewrite:</p> <img alt="$D_{\vec{v}}f=(\nabla f) \cdot \vec{v}$" class="align-center" src="https://eli.thegreenplace.net/images/math/49933775272512c4c8686d9f9692c8ea01e1c97d.png" style="height: 18px;" /> <p>As:</p> <img alt="$D_{\vec{v}}f=\left \| \nabla f \right \| \left \| \vec{v} \right \| cos(\theta)$" class="align-center" src="https://eli.thegreenplace.net/images/math/8227de3117c60690ced3153cdc38d9bccd960fba.png" style="height: 19px;" /> <p>Where <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> is the angle between the two vectors. 
Now, recall that <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> is normalized so its magnitude is 1. Therefore, we only care about the <em>direction</em> of <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> w.r.t. the gradient. When is this equation maximized? When <img alt="\theta=0" class="valign-0" src="https://eli.thegreenplace.net/images/math/a1dffbe89f1ec5a919198de979fca459eb7fdf84.png" style="height: 12px;" />, because then <img alt="cos(\theta)=1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/66a6eb87ec7f340e2e24bd46cdf02ab050013aac.png" style="height: 18px;" />. Since a cosine can never be larger than 1, that's the best we can have.</p> <p>So <img alt="\theta=0" class="valign-0" src="https://eli.thegreenplace.net/images/math/a1dffbe89f1ec5a919198de979fca459eb7fdf84.png" style="height: 12px;" /> gives us the largest positive change in <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" />. To get <img alt="\theta=0" class="valign-0" src="https://eli.thegreenplace.net/images/math/a1dffbe89f1ec5a919198de979fca459eb7fdf84.png" style="height: 12px;" />, <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> has to point in the same direction as the gradient. 
Similarly, for <img alt="\theta=180^{\circ}" class="valign-m1" src="https://eli.thegreenplace.net/images/math/f35bd3cc416e154fddabe833458147c566028a8c.png" style="height: 13px;" /> we get <img alt="cos(\theta)=-1" class="valign-m4" src="https://eli.thegreenplace.net/images/math/65b96d5ab442e325098894e80d655263a24b14d6.png" style="height: 18px;" /> and therefore the largest <em>negative</em> change in <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" />. So if we want to decrease <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> the most, <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" /> has to point in the opposite direction of the gradient.</p> </div> <div class="section" id="gradient-descent-update-for-multivariate-functions"> <h2>Gradient descent update for multivariate functions</h2> <p>To sum up, given some starting point <img alt="(x_0,y_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/f8b63792829adeff8314a72fa87be1a770dfca85.png" style="height: 18px;" />, to nudge it in the direction of the minimum of <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" />, we first compute the gradient of <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> at <img alt="(x_0,y_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/f8b63792829adeff8314a72fa87be1a770dfca85.png" style="height: 18px;" />. 
Then, we update (using vector notation):</p> <img alt="$\langle x_1,y_1 \rangle=\langle x_0,y_0 \rangle-\eta \nabla{f(x_0,y_0)}$" class="align-center" src="https://eli.thegreenplace.net/images/math/66a0a92b6ff9a4c0d2162a41484ab17115f57bd7.png" style="height: 19px;" /> <p>Generalizing to multiple dimensions, let's say we have the function <img alt="f:\mathbb{R}^n\rightarrow \mathbb{R}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/5b4aba3ea35b9daec583b61ecb5a556ae28103e3.png" style="height: 16px;" /> taking the n-dimensional vector <img alt="\vec{x}=(x_1,x_2 \dots ,x_n)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e8ece11f27b7cf7726e6ea055cfb0718761733e0.png" style="height: 18px;" />. We define the gradient update at step <em>k</em> to be:</p> <img alt="$\vec{x}_{(k)}=\vec{x}_{(k-1)} - \eta \nabla{f(\vec{x}_{(k-1)})}$" class="align-center" src="https://eli.thegreenplace.net/images/math/265d53b7832258e30f00049a1772e9f213140628.png" style="height: 20px;" /> <p>Previously, for the single-variable case we said that the derivative points the way to the minimum. Now we can say that while there are many ways to get to the minimum (eventually), the gradient points us to the <em>fastest</em> way from any given point.</p> </div> <div class="section" id="appendix-directional-derivative-definition-and-gradient"> <h2>Appendix: directional derivative definition and gradient</h2> <p>This is an optional section for those who don't like taking mathematical statements for granted. Now it's time to prove the equation shown earlier in the article, and on which its main result is based:</p> <img alt="$D_{\vec{v}}f=(\nabla f) \cdot \vec{v}$" class="align-center" src="https://eli.thegreenplace.net/images/math/49933775272512c4c8686d9f9692c8ea01e1c97d.png" style="height: 18px;" /> <p>As usual with proofs, it really helps to start by working through an example or two to build up some intuition into why the equation works. 
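Before diving into the proof, the multivariate update rule from the previous section can be sketched in a few lines (my code; the function, starting point, learning rate and step count are arbitrary choices), here for f(x, y) = x^2 + y^2, whose minimum sits at the origin:

```python
# Multivariate gradient descent: x_k = x_{k-1} - eta * grad f(x_{k-1}),
# applied to f(x, y) = x**2 + y**2.
def grad_f(x, y):
    return (2 * x, 2 * y)  # gradient of x**2 + y**2

x, y = -1.5, 0.25  # arbitrary starting point
eta = 0.1          # learning rate
for _ in range(100):
    gx, gy = grad_f(x, y)
    x, y = x - eta * gx, y - eta * gy

print(x, y)  # both approach 0, the minimum of f
```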
Feel free to do that if you'd like, using any function, starting point and direction vector <img alt="\vec{v}" class="valign-0" src="https://eli.thegreenplace.net/images/math/39a3a59a8f524cf72620db07b9ba7cdce9fc9391.png" style="height: 13px;" />.</p> <p>Suppose we define a function <img alt="w(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0382ffc90ae7b4c24238f68a32bebd14bc53c8d7.png" style="height: 18px;" /> as follows:</p> <img alt="$w(t)=f(x,y)$" class="align-center" src="https://eli.thegreenplace.net/images/math/dc37eb3cf47966d7338e561faffeffbb291085c5.png" style="height: 18px;" /> <p>Where <img alt="x=x(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/97aeb925cf8f501cc8836794ee06fb357b9d9a83.png" style="height: 18px;" /> and <img alt="y=y(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ebacc26a97fccf1aa96e1b59f21fcb2ca66c8924.png" style="height: 18px;" /> defined as:</p> <img alt="\begin{align*} x(t)&amp;amp;=x_0+at \\ y(t)&amp;amp;=y_0+bt \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/27988a5772de0fe761873494e88f7cad887ede85.png" style="height: 45px;" /> <p>In these definitions, <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />, <img alt="y_0" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2bb5817d0f3bf8490a8c7b1343f84f9635e683a3.png" style="height: 12px;" />, <em>a</em> and <em>b</em> are constants, so both <img alt="x(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/62b10cd9e1396c7ea33fd211e67de2fb29019cfc.png" style="height: 18px;" /> and <img alt="y(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ed8576b7227103b62d3648e7d1bbdff4052b27ff.png" style="height: 18px;" /> are truly functions of a single variable. 
Using <a class="reference external" href="http://eli.thegreenplace.net/2016/the-chain-rule-of-calculus">the chain rule</a>, we know that:</p> <img alt="$\frac{dw}{dt}=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}$" class="align-center" src="https://eli.thegreenplace.net/images/math/d5f4f13aeba35328cd2bea9b247842acb7524724.png" style="height: 41px;" /> <p>Substituting the derivatives of <img alt="x(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/62b10cd9e1396c7ea33fd211e67de2fb29019cfc.png" style="height: 18px;" /> and <img alt="y(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/ed8576b7227103b62d3648e7d1bbdff4052b27ff.png" style="height: 18px;" />, we get:</p> <img alt="$\frac{dw}{dt}=a\frac{\partial f}{\partial x}+b\frac{\partial f}{\partial y}$" class="align-center" src="https://eli.thegreenplace.net/images/math/829069469d88717c9d95e3f788ed9e0c6cbeebc6.png" style="height: 41px;" /> <p>One more step, the significance of which will become clear shortly. 
Specifically, the derivative of <img alt="w(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0382ffc90ae7b4c24238f68a32bebd14bc53c8d7.png" style="height: 18px;" /> at <img alt="t=0" class="valign-0" src="https://eli.thegreenplace.net/images/math/31056375cdff6a052261f18ceb3afe466731302a.png" style="height: 12px;" /> is:</p> <img alt="$\begin{equation} \frac{d}{dt}w(0)=a\frac{\partial}{\partial x}f(x_0,y_0)+b\frac{\partial}{\partial y}f(x_0,y_0) \tag{2} \end{equation}$" class="align-center" src="https://eli.thegreenplace.net/images/math/ea579cad8f6c62a817f2253e1d596178ea673d37.png" style="height: 41px;" /> <p>Now let's see how to compute the derivative of <img alt="w(t)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0382ffc90ae7b4c24238f68a32bebd14bc53c8d7.png" style="height: 18px;" /> at <img alt="t=0" class="valign-0" src="https://eli.thegreenplace.net/images/math/31056375cdff6a052261f18ceb3afe466731302a.png" style="height: 12px;" /> using the formal limit definition:</p> <img alt="\begin{align*} \frac{d}{dt}w(0)&amp;amp;=\lim_{h \to 0}\frac{w(h)-w(0)}{h} \\ &amp;amp;=\lim_{h \to 0}\frac{f(x_0+ah,y_0+bh)-f(x_0,y_0)}{h} \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/10a224da7b7ab2424b9f88edcbfe17f273f3bd8b.png" style="height: 84px;" /> <p>But the latter is precisely the definition of the directional derivative in equation (1). 
Therefore, we can say that:</p> <img alt="$\frac{d}{dt}w(0)=D_{\vec{v}}f(x_0,y_0)$" class="align-center" src="https://eli.thegreenplace.net/images/math/4f377110022c468e46cbdb32bfb11a072d11b330.png" style="height: 37px;" /> <p>From this and (2), we get:</p> <img alt="$\frac{d}{dt}w(0)=D_{\vec{v}}f(x_0,y_0)=a\frac{\partial}{\partial x}f(x_0,y_0)+b\frac{\partial}{\partial y}f(x_0,y_0)$" class="align-center" src="https://eli.thegreenplace.net/images/math/d259ebf3697f480823a40247ce7191f9e954a584.png" style="height: 41px;" /> <p>This derivation is not special to the point <img alt="(x_0,y_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/f8b63792829adeff8314a72fa87be1a770dfca85.png" style="height: 18px;" /> - it works just as well for any point where <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> has partial derivatives w.r.t. <em>x</em> and <em>y</em>; therefore, for any point <img alt="(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d330d6e65470cb03e76e092ee47971f9e931f759.png" style="height: 18px;" /> where <img alt="f(x,y)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/720aabe593c880dc58881240e567ecda2b89bdf4.png" style="height: 18px;" /> is differentiable:</p> <img alt="\begin{align*} D_{\vec{v}}f(x,y)&amp;amp;=a\frac{\partial}{\partial x}f(x,y)+b\frac{\partial}{\partial y}f(x,y) \\ &amp;amp;=\left \langle \frac{\partial f}{\partial x},\frac{\partial f}{\partial y} \right \rangle \cdot \langle a,b \rangle \\ &amp;amp;=(\nabla f) \cdot \vec{v} \qedhere \end{align*}" class="align-center" src="https://eli.thegreenplace.net/images/math/7c306dfedd474d99a62894e6258cea186d8be428.png" style="height: 115px;" /> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id5" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a 
class="fn-backref" href="#id1"></a></td><td>The notation <img alt="\frac{d}{dx}f(x_0)" class="valign-m6" src="https://eli.thegreenplace.net/images/math/b0d6f765abf215972d5dbb982f77f1a83c233066.png" style="height: 22px;" /> means: the value of the derivative of <img alt="f" class="valign-m4" src="https://eli.thegreenplace.net/images/math/4a0a19218e082a343a1b17e5333409af9d98f0f5.png" style="height: 16px;" /> w.r.t. <em>x</em>, evaluated at <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" />. Another way to say the same would be <img alt="f{}&amp;#x27;(x_0)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e11c4ee90d42c3261aec6ef9c71893411b11cf34.png" style="height: 18px;" />.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>That said, in some advanced variations of gradient descent we actually want to probe different areas of the function early on in the process, so a larger step makes sense (remember, realistic functions have many local minima and we want to find the best one). Further along in the optimization process, when we've settled on a general area of the function we want the learning rate to be small so we actually get to the minimum. This approach is called <em>annealing</em> and I'll leave it for some future article.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id7" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>To avoid tracking vector magnitudes, from now on in the article we'll be dealing with <em>normalized</em> direction vectors. 
That is, we always assume that <img alt="\left \| \vec{v} \right \|=1" class="valign-m5" src="https://eli.thegreenplace.net/images/math/d68cb9ca8e7b5fd7fe4a7c4548ed5d98b63292eb.png" style="height: 19px;" />.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>Yes, <img alt="\widehat{y}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/8cf4f01720ca8008752c182a8d3443aa2b174442.png" style="height: 18px;" /> is actually going in the opposite direction so it's <img alt="-\widehat{e_y}" class="valign-m6" src="https://eli.thegreenplace.net/images/math/160a7a02c9645a3948812151b7a0cf38eb29c562.png" style="height: 20px;" />, but that really doesn't change anything. It was easier to draw :)</td></tr> </tbody> </table> </div> Broadcasting arrays in Numpy2015-12-22T06:00:00-08:002015-12-22T06:00:00-08:00Eli Benderskytag:eli.thegreenplace.net,2015-12-22:/2015/broadcasting-arrays-in-numpy/<p><em>Broadcasting</em> is Numpy's terminology for performing mathematical operations between arrays with different shapes. This article will explain why broadcasting is useful, how to use it and touch upon some of its performance implications.</p> <div class="section" id="motivating-example"> <h2>Motivating example</h2> <p>Say we have a large data set; each datum is a list of parameters. In …</p></div><p><em>Broadcasting</em> is Numpy's terminology for performing mathematical operations between arrays with different shapes. This article will explain why broadcasting is useful, how to use it and touch upon some of its performance implications.</p> <div class="section" id="motivating-example"> <h2>Motivating example</h2> <p>Say we have a large data set; each datum is a list of parameters. 
In Numpy terms, we have a 2-D array, where each row is a datum and the number of rows is the size of the data set. Suppose we want to apply some sort of scaling to all these data - every parameter gets its own scaling factor; in other words, every parameter is multiplied by some factor.</p> <p>Just to have something tangible to think about, let's count calories in foods using a macro-nutrient breakdown. Roughly put, the caloric parts of food are made of fats (9 calories per gram), protein (4 calories per gram) and carbs (4 calories per gram). So if we list some foods (our data), and for each food list its macro-nutrient breakdown (parameters), we can then multiply each nutrient by its caloric value (apply scaling) to compute the caloric breakdown of each food item <a class="footnote-reference" href="#id6" id="id1"></a>:</p> <img alt="Calories macros" class="align-center" src="https://eli.thegreenplace.net/images/2015/cal-data.png" /> <p>With this transformation, we can now compute all kinds of useful information. For example, what is the total number of calories in some food. Or, given a breakdown of my dinner - how many calories did I get from protein. 
And so on.</p> <p>Let's see a naive way of producing this computation with Numpy:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">65</span><span class="p">]:</span> <span class="n">macros</span> <span class="o">=</span> <span class="n">array</span><span class="p">([</span> <span class="p">[</span><span class="mf">0.3</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mf">3.5</span><span class="p">],</span> <span class="p">[</span><span class="mf">2.9</span><span class="p">,</span> <span class="mf">27.5</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="p">[</span><span class="mf">0.4</span><span class="p">,</span> <span class="mf">1.3</span><span class="p">,</span> <span class="mf">23.9</span><span class="p">],</span> <span class="p">[</span><span class="mf">14.4</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mf">2.3</span><span class="p">]])</span> <span class="c1"># Create a new array filled with zeros, of the same shape as macros.</span> <span class="n">In</span> <span class="p">[</span><span class="mi">67</span><span class="p">]:</span> <span class="n">result</span> <span class="o">=</span> <span class="n">zeros_like</span><span class="p">(</span><span class="n">macros</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">69</span><span class="p">]:</span> <span class="n">cal_per_macro</span> <span class="o">=</span> <span class="n">array</span><span class="p">([</span><span class="mi">9</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">])</span> <span class="c1"># Now multiply each row of macros by cal_per_macro. 
In Numpy, * is</span> <span class="c1"># element-wise multiplication between two arrays.</span> <span class="n">In</span> <span class="p">[</span><span class="mi">70</span><span class="p">]:</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">xrange</span><span class="p">(</span><span class="n">macros</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]):</span> <span class="o">....</span><span class="p">:</span> <span class="n">result</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="p">:]</span> <span class="o">=</span> <span class="n">macros</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="p">:]</span> <span class="o">*</span> <span class="n">cal_per_macro</span> <span class="o">....</span><span class="p">:</span> <span class="n">In</span> <span class="p">[</span><span class="mi">71</span><span class="p">]:</span> <span class="n">result</span> <span class="n">Out</span><span class="p">[</span><span class="mi">71</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mf">2.7</span><span class="p">,</span> <span class="mf">10.</span> <span class="p">,</span> <span class="mf">14.</span> <span class="p">],</span> <span class="p">[</span> <span class="mf">26.1</span><span class="p">,</span> <span class="mf">110.</span> <span class="p">,</span> <span class="mf">0.</span> <span class="p">],</span> <span class="p">[</span> <span class="mf">3.6</span><span class="p">,</span> <span class="mf">5.2</span><span class="p">,</span> <span class="mf">95.6</span><span class="p">],</span> <span class="p">[</span> <span class="mf">129.6</span><span class="p">,</span> <span class="mf">24.</span> <span class="p">,</span> <span class="mf">9.2</span><span class="p">]])</span> </pre></div> <p>This is a reasonable approach when coding in a 
low-level programming language: allocate the output, loop over input performing some operation, write result into output. In Numpy, however, this is fairly bad for performance because the looping is done in (slow) Python code instead of internally by Numpy in (fast) C code.</p> <p>Since element-wise operators like <tt class="docutils literal">*</tt> work on arbitrary shapes, a better way would be to delegate all the looping to Numpy, by &quot;stretching&quot; the <tt class="docutils literal">cal_per_macro</tt> array vertically and then performing element-wise multiplication with <tt class="docutils literal">macros</tt>; this moves the per-row loop from above into Numpy itself, where it can run much more efficiently:</p> <div class="highlight"><pre><span></span><span class="c1"># Use the &#39;tile&#39; function to replicate cal_per_macro over the number</span> <span class="c1"># of rows &#39;macros&#39; has (rows is the first element of the shape tuple for</span> <span class="c1"># a 2-D array).</span> <span class="n">In</span> <span class="p">[</span><span class="mi">72</span><span class="p">]:</span> <span class="n">cal_per_macro_stretch</span> <span class="o">=</span> <span class="n">tile</span><span class="p">(</span><span class="n">cal_per_macro</span><span class="p">,</span> <span class="p">(</span><span class="n">macros</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="mi">1</span><span class="p">))</span> <span class="n">In</span> <span class="p">[</span><span class="mi">73</span><span class="p">]:</span> <span class="n">cal_per_macro_stretch</span> <span class="n">Out</span><span class="p">[</span><span class="mi">73</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span><span class="mi">9</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span> <span 
class="p">[</span><span class="mi">9</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span> <span class="p">[</span><span class="mi">9</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span> <span class="p">[</span><span class="mi">9</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">]])</span> <span class="n">In</span> <span class="p">[</span><span class="mi">74</span><span class="p">]:</span> <span class="n">macros</span> <span class="o">*</span> <span class="n">cal_per_macro_stretch</span> <span class="n">Out</span><span class="p">[</span><span class="mi">74</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mf">2.7</span><span class="p">,</span> <span class="mf">10.</span> <span class="p">,</span> <span class="mf">14.</span> <span class="p">],</span> <span class="p">[</span> <span class="mf">26.1</span><span class="p">,</span> <span class="mf">110.</span> <span class="p">,</span> <span class="mf">0.</span> <span class="p">],</span> <span class="p">[</span> <span class="mf">3.6</span><span class="p">,</span> <span class="mf">5.2</span><span class="p">,</span> <span class="mf">95.6</span><span class="p">],</span> <span class="p">[</span> <span class="mf">129.6</span><span class="p">,</span> <span class="mf">24.</span> <span class="p">,</span> <span class="mf">9.2</span><span class="p">]])</span> </pre></div> <p>Nice, it's shorter too. And much, much faster! To measure the speed I created a large random data set, with 1 million rows of 10 parameters each. The loop-in-Python method takes ~2.3 seconds to churn through it. The stretching method takes 30 <em>milliseconds</em>, a ~75x speedup.</p> <p>And now, finally, comes the interesting part. 
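As an aside, the timing comparison above can be reproduced with a sketch along the following lines. The function names here are made up for the sketch, and the data set is smaller than the article's (100,000 rows instead of 1 million) so it runs quickly; exact numbers vary by machine and Numpy version:

```python
import timeit

import numpy as np

# Illustrative data set: 100,000 rows of 10 parameters each, plus a
# 10-element vector of scaling factors.
data = np.random.rand(100_000, 10)
factors = np.random.rand(10)

def scale_with_loop(data, factors):
    # Loop over the rows in Python - slow.
    result = np.zeros_like(data)
    for i in range(data.shape[0]):
        result[i, :] = data[i, :] * factors
    return result

def scale_with_stretch(data, factors):
    # Stretch factors vertically with tile, then multiply element-wise;
    # all the looping happens inside Numpy.
    return data * np.tile(factors, (data.shape[0], 1))

loop_t = timeit.timeit(lambda: scale_with_loop(data, factors), number=1)
stretch_t = timeit.timeit(lambda: scale_with_stretch(data, factors), number=1)
print('loop: {:.4f}s, stretch: {:.4f}s'.format(loop_t, stretch_t))
```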
You see, the operation we just did - stretching one array so that its shape matches that of another and then applying some element-wise operation between them - is actually pretty common. This often happens when we want to take a lower-dimensional array and use it to perform a computation along some axis of a higher-dimensional array. In fact, when taken to the extreme this is exactly what happens when we perform an operation between an array and a scalar - the scalar is <em>stretched</em> across the whole array so that the element-wise operation gets the same scalar value for each element it computes.</p> <p>Numpy generalizes this concept into <em>broadcasting</em> - a set of rules that permit element-wise computations between arrays of different shapes, as long as some constraints apply. We'll discuss the actual constraints later, but for the case at hand a simple example will suffice: our original <tt class="docutils literal">macros</tt> array is 4x3 (4 rows by 3 columns). <tt class="docutils literal">cal_per_macro</tt> is a 3-element array. Since its length matches the number of columns in <tt class="docutils literal">macros</tt>, it's pretty natural to apply some operation between <tt class="docutils literal">cal_per_macro</tt> and every row of <tt class="docutils literal">macros</tt> - each row of <tt class="docutils literal">macros</tt> has the exact same size as <tt class="docutils literal">cal_per_macro</tt>, so the element-wise operation makes perfect sense.</p> <p>Incidentally, this lets Numpy achieve two separate goals - usefulness as well as more consistent and general semantics. Binary operators like <tt class="docutils literal">*</tt> are element-wise, but what happens when we apply them between arrays of different shapes? Should it work or should it be rejected? If it works, how should it work? Broadcasting defines the semantics of these operations.</p> <p>Back to our example. 
Here's yet another way to compute the result data:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">75</span><span class="p">]:</span> <span class="n">macros</span> <span class="o">*</span> <span class="n">cal_per_macro</span> <span class="n">Out</span><span class="p">[</span><span class="mi">75</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mf">2.7</span><span class="p">,</span> <span class="mf">10.</span> <span class="p">,</span> <span class="mf">14.</span> <span class="p">],</span> <span class="p">[</span> <span class="mf">26.1</span><span class="p">,</span> <span class="mf">110.</span> <span class="p">,</span> <span class="mf">0.</span> <span class="p">],</span> <span class="p">[</span> <span class="mf">3.6</span><span class="p">,</span> <span class="mf">5.2</span><span class="p">,</span> <span class="mf">95.6</span><span class="p">],</span> <span class="p">[</span> <span class="mf">129.6</span><span class="p">,</span> <span class="mf">24.</span> <span class="p">,</span> <span class="mf">9.2</span><span class="p">]])</span> </pre></div> <p>Simple and elegant, and the fastest approach to boot <a class="footnote-reference" href="#id7" id="id2"></a>.</p> </div> <div class="section" id="defining-broadcasting"> <h2>Defining broadcasting</h2> <p>Broadcasting is often described as an operation between a &quot;smaller&quot; and a &quot;larger&quot; array. This doesn't necessarily have to be the case, as broadcasting applies also to arrays of the same size, though with different shapes. Therefore, I believe the following definition of broadcasting is the most useful one.</p> <p>Element-wise operations on arrays are only valid when the arrays' shapes are either equal or compatible. The equal shapes case is trivial - this is the stretched array from the example above. 
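As a minimal sketch of the equal-shapes case (the values here are arbitrary): two arrays with exactly the same shape combine element by element, with no stretching involved.

```python
import numpy as np

# Two arrays of identical shape: the element-wise operation simply
# pairs up corresponding elements.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[10.0, 20.0], [30.0, 40.0]])
c = a * b
print(c)  # [[ 10.  40.], [ 90. 160.]]
```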
What does &quot;compatible&quot; mean, though?</p> <p>To determine if two shapes are compatible, Numpy compares their dimensions, starting with the trailing ones and working its way backwards <a class="footnote-reference" href="#id8" id="id3"></a>. If two dimensions are equal, or if one of them equals 1, the comparison continues. Otherwise, you'll see a <tt class="docutils literal">ValueError</tt> raised (saying something like &quot;operands could not be broadcast together with shapes ...&quot;).</p> <p>When one of the shapes runs out of dimensions (because it has fewer dimensions than the other shape), Numpy will use 1 in the comparison process until the other shape's dimensions run out as well.</p> <p>Once Numpy determines that two shapes are compatible, the shape of the result is simply the maximum of the two shapes' sizes in each dimension.</p> <p>Put a little bit more formally, here's a pseudo-algorithm:</p> <div class="highlight"><pre><span></span>Inputs: array A with m dimensions; array B with n dimensions

p = max(m, n)
if m &lt; p:
    left-pad A&#39;s shape with 1s until it also has p dimensions
else if n &lt; p:
    left-pad B&#39;s shape with 1s until it also has p dimensions
result_dims = new list with p elements
for i in p-1 ... 0:
    A_dim_i = A.shape[i]
    B_dim_i = B.shape[i]
    if A_dim_i != 1 and B_dim_i != 1 and A_dim_i != B_dim_i:
        raise ValueError(&quot;could not broadcast&quot;)
    else:
        result_dims[i] = max(A_dim_i, B_dim_i)
</pre></div> </div> <div class="section" id="examples"> <h2>Examples</h2> <p>The definition above is precise and complete; to get a feel for it, we'll need a few examples.</p> <p>I'm using the Numpy convention of describing shapes as tuples. <tt class="docutils literal">macros</tt> is a 4-by-3 array, meaning that it has 4 rows with 3 columns each, or 4x3.
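The pseudo-algorithm above translates almost directly into Python. This is only an illustrative sketch of the shape computation (the function name is made up; this is not how Numpy implements broadcasting internally):

```python
def broadcast_result_shape(shape_a, shape_b):
    """Compute the broadcast result shape of two shape tuples, or raise
    ValueError if the shapes are incompatible."""
    p = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s until both have p dimensions.
    a = (1,) * (p - len(shape_a)) + tuple(shape_a)
    b = (1,) * (p - len(shape_b)) + tuple(shape_b)
    result_dims = []
    for dim_a, dim_b in zip(a, b):
        if dim_a != 1 and dim_b != 1 and dim_a != dim_b:
            raise ValueError('could not broadcast')
        # The result size in each dimension is the max of the two sizes.
        result_dims.append(max(dim_a, dim_b))
    return tuple(result_dims)

print(broadcast_result_shape((4, 3), (3,)))     # (4, 3)
print(broadcast_result_shape((5, 4, 1), (5, 1, 3)))  # (5, 4, 3)
```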
The Numpy way of describing the shape of <tt class="docutils literal">macros</tt> is <tt class="docutils literal">(4, 3)</tt>:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">80</span><span class="p">]:</span> <span class="n">macros</span><span class="o">.</span><span class="n">shape</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">80</span><span class="p">]:</span> <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
</pre></div> <p>When we computed the caloric table using broadcasting, what we did was an operation between <tt class="docutils literal">macros</tt> - a <tt class="docutils literal">(4, 3)</tt> array, and <tt class="docutils literal">cal_per_macro</tt>, a <tt class="docutils literal">(3,)</tt> array <a class="footnote-reference" href="#id9" id="id4"></a>. Therefore, following the broadcasting rules outlined above, the shape <tt class="docutils literal">(3,)</tt> is left-padded with 1 to make comparison with <tt class="docutils literal">(4, 3)</tt> possible. The shapes are then deemed compatible and the result shape is <tt class="docutils literal">(4, 3)</tt>, which is exactly what we observed.</p> <p>Schematically:</p> <div class="highlight"><pre><span></span>(4, 3)                 (4, 3)
        == padding ==&gt;         == result ==&gt; (4, 3)
(3,)                   (1, 3)
</pre></div> <p>Here's another example, broadcasting between a 3-D and a 1-D array:</p> <div class="highlight"><pre><span></span>(3,)                   (1, 1, 3)
          == padding ==&gt;           == result ==&gt; (5, 4, 3)
(5, 4, 3)              (5, 4, 3)
</pre></div> <p>Note, however, that only left-padding with 1s is allowed.
Therefore:</p> <div class="highlight"><pre><span></span>(5,)                   (1, 1, 5)
          == padding ==&gt;           ==&gt; error (5 != 3)
(5, 4, 3)              (5, 4, 3)
</pre></div> <p>Theoretically, had the broadcasting rules been less rigid, we could say that this broadcasting would be valid if we <em>right-padded</em> <tt class="docutils literal">(5,)</tt> with 1s. However, this is not how the rules are defined - therefore these shapes are incompatible.</p> <p>Broadcasting is valid between higher-dimensional arrays too:</p> <div class="highlight"><pre><span></span>(5, 4, 3)                 (1, 5, 4, 3)
             == padding ==&gt;              == result ==&gt; (6, 5, 4, 3)
(6, 5, 4, 3)              (6, 5, 4, 3)
</pre></div> <p>Also, in the beginning of the article I mentioned that broadcasting does not necessarily occur between arrays with different numbers of dimensions. It's perfectly valid to broadcast arrays with the same number of dimensions, as long as they are compatible:</p> <div class="highlight"><pre><span></span>(5, 4, 1)
          == no padding needed ==&gt; result ==&gt; (5, 4, 3)
(5, 1, 3)
</pre></div> <p>Finally, scalars are treated specially as 1-dimensional arrays with size 1:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">93</span><span class="p">]:</span> <span class="n">ones</span><span class="p">((</span><span class="mi">4</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span> <span class="o">+</span> <span class="mi">1</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">93</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">],</span> <span class="p">[</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">],</span> <span class="p">[</span> <span
class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">],</span> <span class="p">[</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">]])</span> <span class="c1"># Is the same as:</span> <span class="n">In</span> <span class="p">[</span><span class="mi">94</span><span class="p">]:</span> <span class="n">one</span> <span class="o">=</span> <span class="n">ones</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="n">In</span> <span class="p">[</span><span class="mi">95</span><span class="p">]:</span> <span class="n">one</span> <span class="n">Out</span><span class="p">[</span><span class="mi">95</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mf">1.</span><span class="p">]])</span> <span class="n">In</span> <span class="p">[</span><span class="mi">96</span><span class="p">]:</span> <span class="n">ones</span><span class="p">((</span><span class="mi">4</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span> <span class="o">+</span> <span class="n">one</span> <span class="n">Out</span><span class="p">[</span><span class="mi">96</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">],</span> <span class="p">[</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">],</span> <span class="p">[</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">],</span> <span 
class="p">[</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">2.</span><span class="p">]])</span> </pre></div> </div> <div class="section" id="explicit-broadcasting-with-numpy-broadcast"> <h2>Explicit broadcasting with numpy.broadcast</h2> <p>In the examples above, we've seen how Numpy employs broadcasting behind the scenes to match together arrays that have compatible, but not similar, shapes. We can also ask Numpy for a more explicit exposure of broadcasting, using the <tt class="docutils literal">numpy.broadcast</tt> class:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">103</span><span class="p">]:</span> <span class="n">macros</span><span class="o">.</span><span class="n">shape</span> <span class="n">Out</span><span class="p">[</span><span class="mi">103</span><span class="p">]:</span> <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">104</span><span class="p">]:</span> <span class="n">cal_per_macro</span><span class="o">.</span><span class="n">shape</span> <span class="n">Out</span><span class="p">[</span><span class="mi">104</span><span class="p">]:</span> <span class="p">(</span><span class="mi">3</span><span class="p">,)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">105</span><span class="p">]:</span> <span class="n">b</span> <span class="o">=</span> <span class="n">broadcast</span><span class="p">(</span><span class="n">macros</span><span class="p">,</span> <span class="n">cal_per_macro</span><span class="p">)</span> </pre></div> <p>Now <tt class="docutils literal">b</tt> is an object of type <tt class="docutils literal">numpy.broadcast</tt>, and we can query it for the result shape of broadcasting, as well as use it to iterate over pairs of elements 
from the input arrays in the order matched by broadcasting them:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">108</span><span class="p">]:</span> <span class="n">b</span><span class="o">.</span><span class="n">shape</span> <span class="n">Out</span><span class="p">[</span><span class="mi">108</span><span class="p">]:</span> <span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">120</span><span class="p">]:</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="ow">in</span> <span class="n">b</span><span class="p">:</span> <span class="k">print</span> <span class="s1">&#39;{0}: {1} {2}&#39;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">b</span><span class="o">.</span><span class="n">index</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span> <span class="o">.....</span><span class="p">:</span> <span class="mi">1</span><span class="p">:</span> <span class="mf">0.3</span> <span class="mi">9</span> <span class="mi">2</span><span class="p">:</span> <span class="mf">2.5</span> <span class="mi">4</span> <span class="mi">3</span><span class="p">:</span> <span class="mf">3.5</span> <span class="mi">4</span> <span class="mi">4</span><span class="p">:</span> <span class="mf">2.9</span> <span class="mi">9</span> <span class="mi">5</span><span class="p">:</span> <span class="mf">27.5</span> <span class="mi">4</span> <span class="mi">6</span><span class="p">:</span> <span class="mf">0.0</span> <span class="mi">4</span> <span class="mi">7</span><span class="p">:</span> <span class="mf">0.4</span> <span class="mi">9</span> <span class="mi">8</span><span class="p">:</span> <span 
class="mf">1.3</span> <span class="mi">4</span> <span class="mi">9</span><span class="p">:</span> <span class="mf">23.9</span> <span class="mi">4</span> <span class="mi">10</span><span class="p">:</span> <span class="mf">14.4</span> <span class="mi">9</span> <span class="mi">11</span><span class="p">:</span> <span class="mf">6.0</span> <span class="mi">4</span> <span class="mi">12</span><span class="p">:</span> <span class="mf">2.3</span> <span class="mi">4</span> </pre></div> <p>This lets us see very explicitly how the &quot;stretching&quot; of <tt class="docutils literal">cal_per_macro</tt> is done to match the shape of <tt class="docutils literal">macros</tt>. So if you ever want to perform some complex computation on two arrays whose shapes aren't similar but compatible, and you want to use broadcasting, <tt class="docutils literal">numpy.broadcast</tt> can help.</p> </div> <div class="section" id="computing-outer-products-with-broadcasting"> <h2>Computing outer products with broadcasting</h2> <p>As another cool example of broadcasting rules, consider the outer product of two vectors.</p> <p>In linear algebra, it is customary to deal with column vectors by default, using a transpose to denote a row vector. Therefore, given two vectors <img alt="x" class="valign-0" src="https://eli.thegreenplace.net/images/math/11f6ad8ec52a2984abaafd7c3b516503785c2072.png" style="height: 8px;" /> and <img alt="y" class="valign-m4" src="https://eli.thegreenplace.net/images/math/95cb0bfd2977c761298d9624e4b4d4c72a39974a.png" style="height: 12px;" />, their &quot;outer product&quot; is defined as <img alt="xy^T" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2903d8355c674ee7c2e06b7ba1940714f100243e.png" style="height: 19px;" />.
Treating <img alt="x" class="valign-0" src="https://eli.thegreenplace.net/images/math/11f6ad8ec52a2984abaafd7c3b516503785c2072.png" style="height: 8px;" /> and <img alt="y" class="valign-m4" src="https://eli.thegreenplace.net/images/math/95cb0bfd2977c761298d9624e4b4d4c72a39974a.png" style="height: 12px;" /> as Nx1 matrices this matrix multiplication results in:</p> <img alt="$xy^T=\begin{bmatrix} x_1 \\ x_2 \\ ... \\ x_N \end{bmatrix}[y_1, y_2, ..., y_N]= \begin{bmatrix} x_1y_1 &amp;amp; x_1y_2 &amp;amp; \cdots &amp;amp; x_1y_N \\ x_2y_1 &amp;amp; x_2y_2 &amp;amp; \cdots &amp;amp; x_2y_N \\ \vdots\\ x_Ny_1 &amp;amp; x_Ny_2 &amp;amp; \cdots &amp;amp; x_Ny_N \\ \end{bmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/d4c1d88533494c698a0628fe472bb099b2195a13.png" style="height: 97px;" /> <p>How can we implement this in Numpy? Note that the shape of the row vector is <tt class="docutils literal">(1, N)</tt> <a class="footnote-reference" href="#id10" id="id5"></a>. The shape of the column vector is <tt class="docutils literal">(N, 1)</tt>. Therefore, if we apply an element-wise operation between them, broadcasting will kick in, find that the shapes are compatible and the result shape is <tt class="docutils literal">(N, N)</tt>. The row vector is going to be &quot;stretched&quot; over N rows and the column vector over N columns - so we'll get the outer product! 
Here's an interactive session that demonstrates this:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">137</span><span class="p">]:</span> <span class="n">ten</span> <span class="o">=</span> <span class="n">arange</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">11</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">138</span><span class="p">]:</span> <span class="n">ten</span> <span class="n">Out</span><span class="p">[</span><span class="mi">138</span><span class="p">]:</span> <span class="n">array</span><span class="p">([</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">10</span><span class="p">])</span> <span class="n">In</span> <span class="p">[</span><span class="mi">139</span><span class="p">]:</span> <span class="n">ten</span><span class="o">.</span><span class="n">shape</span> <span class="n">Out</span><span class="p">[</span><span class="mi">139</span><span class="p">]:</span> <span class="p">(</span><span class="mi">10</span><span class="p">,)</span> <span class="c1"># Using Numpy&#39;s reshape method to convert the row vector into a</span> <span class="c1"># column vector.</span> <span class="n">In</span> <span class="p">[</span><span class="mi">140</span><span class="p">]:</span> <span class="n">ten</span><span class="o">.</span><span class="n">reshape</span><span class="p">((</span><span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span 
class="p">))</span> <span class="n">Out</span><span class="p">[</span><span class="mi">140</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mi">1</span><span class="p">],</span> <span class="p">[</span> <span class="mi">2</span><span class="p">],</span> <span class="p">[</span> <span class="mi">3</span><span class="p">],</span> <span class="p">[</span> <span class="mi">4</span><span class="p">],</span> <span class="p">[</span> <span class="mi">5</span><span class="p">],</span> <span class="p">[</span> <span class="mi">6</span><span class="p">],</span> <span class="p">[</span> <span class="mi">7</span><span class="p">],</span> <span class="p">[</span> <span class="mi">8</span><span class="p">],</span> <span class="p">[</span> <span class="mi">9</span><span class="p">],</span> <span class="p">[</span><span class="mi">10</span><span class="p">]])</span> <span class="n">In</span> <span class="p">[</span><span class="mi">141</span><span class="p">]:</span> <span class="n">ten</span><span class="o">.</span><span class="n">reshape</span><span class="p">((</span><span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span><span class="o">.</span><span class="n">shape</span> <span class="n">Out</span><span class="p">[</span><span class="mi">141</span><span class="p">]:</span> <span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="c1"># Let&#39;s see what the &#39;broadcast&#39; class tells us about the resulting</span> <span class="c1"># shape of broadcasting ten and its column-vector version</span> <span class="n">In</span> <span class="p">[</span><span class="mi">142</span><span class="p">]:</span> <span class="n">broadcast</span><span class="p">(</span><span class="n">ten</span><span class="p">,</span> <span class="n">ten</span><span class="o">.</span><span class="n">reshape</span><span 
class="p">((</span><span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span class="p">)))</span><span class="o">.</span><span class="n">shape</span> <span class="n">Out</span><span class="p">[</span><span class="mi">142</span><span class="p">]:</span> <span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">143</span><span class="p">]:</span> <span class="n">ten</span> <span class="o">*</span> <span class="n">ten</span><span class="o">.</span><span class="n">reshape</span><span class="p">((</span><span class="mi">10</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="n">Out</span><span class="p">[</span><span class="mi">143</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">10</span><span class="p">],</span> <span class="p">[</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">20</span><span class="p">],</span> <span class="p">[</span> <span 
class="mi">3</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">30</span><span class="p">],</span> <span class="p">[</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">40</span><span class="p">],</span> <span class="p">[</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">25</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">35</span><span class="p">,</span> <span class="mi">40</span><span class="p">,</span> <span class="mi">45</span><span class="p">,</span> <span class="mi">50</span><span class="p">],</span> <span class="p">[</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">42</span><span class="p">,</span> <span class="mi">48</span><span 
class="p">,</span> <span class="mi">54</span><span class="p">,</span> <span class="mi">60</span><span class="p">],</span> <span class="p">[</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span class="mi">35</span><span class="p">,</span> <span class="mi">42</span><span class="p">,</span> <span class="mi">49</span><span class="p">,</span> <span class="mi">56</span><span class="p">,</span> <span class="mi">63</span><span class="p">,</span> <span class="mi">70</span><span class="p">],</span> <span class="p">[</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">40</span><span class="p">,</span> <span class="mi">48</span><span class="p">,</span> <span class="mi">56</span><span class="p">,</span> <span class="mi">64</span><span class="p">,</span> <span class="mi">72</span><span class="p">,</span> <span class="mi">80</span><span class="p">],</span> <span class="p">[</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">18</span><span class="p">,</span> <span class="mi">27</span><span class="p">,</span> <span class="mi">36</span><span class="p">,</span> <span class="mi">45</span><span class="p">,</span> <span class="mi">54</span><span class="p">,</span> <span class="mi">63</span><span class="p">,</span> <span class="mi">72</span><span class="p">,</span> <span class="mi">81</span><span class="p">,</span> <span class="mi">90</span><span class="p">],</span> <span class="p">[</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">30</span><span class="p">,</span> <span class="mi">40</span><span class="p">,</span> <span class="mi">50</span><span 
class="p">,</span> <span class="mi">60</span><span class="p">,</span> <span class="mi">70</span><span class="p">,</span> <span class="mi">80</span><span class="p">,</span> <span class="mi">90</span><span class="p">,</span> <span class="mi">100</span><span class="p">]])</span> </pre></div> <p>The output should be familiar to anyone who's finished grade school, of course.</p> <p>Interestingly, even though Numpy has a function named <tt class="docutils literal">outer</tt> that computes the outer product between two vectors, my timings show that at least in this particular case broadcasting multiplication as shown above is more than twice as fast - so be sure to always measure.</p> </div> <div class="section" id="use-the-right-tool-for-the-job"> <h2>Use the right tool for the job</h2> <p>I'll end this article with another educational example that demonstrates a problem that can be solved in two different ways, one of which is much more efficient because it uses the right tool for the job.</p> <p>Back to the original example of counting calories in foods. Suppose I just want to know how many calories each serving of food has (total from fats, protein and carbs).</p> <p>Given the <tt class="docutils literal">macros</tt> data and a <tt class="docutils literal">cal_per_macro</tt> breakdown, we can use the broadcasting multiplication as seen before to compute a &quot;calories per macro&quot; table efficiently, for each food. 
All that's left is to add together the columns in each row into a sum - this will be the number of calories per serving in that food:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">160</span><span class="p">]:</span> <span class="n">macros</span> <span class="o">*</span> <span class="n">cal_per_macro</span> <span class="n">Out</span><span class="p">[</span><span class="mi">160</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mf">2.7</span><span class="p">,</span> <span class="mf">10.</span> <span class="p">,</span> <span class="mf">14.</span> <span class="p">],</span> <span class="p">[</span> <span class="mf">26.1</span><span class="p">,</span> <span class="mf">110.</span> <span class="p">,</span> <span class="mf">0.</span> <span class="p">],</span> <span class="p">[</span> <span class="mf">3.6</span><span class="p">,</span> <span class="mf">5.2</span><span class="p">,</span> <span class="mf">95.6</span><span class="p">],</span> <span class="p">[</span> <span class="mf">129.6</span><span class="p">,</span> <span class="mf">24.</span> <span class="p">,</span> <span class="mf">9.2</span><span class="p">]])</span> <span class="n">In</span> <span class="p">[</span><span class="mi">161</span><span class="p">]:</span> <span class="nb">sum</span><span class="p">(</span><span class="n">macros</span> <span class="o">*</span> <span class="n">cal_per_macro</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">161</span><span class="p">]:</span> <span class="n">array</span><span class="p">([</span> <span class="mf">26.7</span><span class="p">,</span> <span class="mf">136.1</span><span class="p">,</span> <span class="mf">104.4</span><span class="p">,</span> <span class="mf">162.8</span><span class="p">])</span> 
</pre></div> <p>Here I'm using the <tt class="docutils literal">axis</tt> parameter of the <tt class="docutils literal">sum</tt> function to tell Numpy to sum only over axis 1 (columns), rather than computing the sum of the whole multi-dimensional array.</p> <p>Looks easy. But is there a better way? Indeed, if you think for a moment about the operation we've just performed, a natural solution emerges. We've taken a vector (<tt class="docutils literal">cal_per_macro</tt>), element-wise multiplied it with each row of <tt class="docutils literal">macros</tt> and then added up the results. In other words, we've computed the dot-product of <tt class="docutils literal">cal_per_macro</tt> with each row of <tt class="docutils literal">macros</tt>. In linear algebra there's already an operation that will do this for the whole input table: matrix multiplication. You can work out the details on paper, but it's easy to see that multiplying the matrix <tt class="docutils literal">macros</tt> on the right by <tt class="docutils literal">cal_per_macro</tt> as a column vector, we get the same result. Let's check:</p> <div class="highlight"><pre><span></span><span class="c1"># Create a column vector out of cal_per_macro</span> <span class="n">In</span> <span class="p">[</span><span class="mi">168</span><span class="p">]:</span> <span class="n">cal_per_macro_col_vec</span> <span class="o">=</span> <span class="n">cal_per_macro</span><span class="o">.</span><span class="n">reshape</span><span class="p">((</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="c1"># Use the &#39;dot&#39; function for matrix multiplication. 
Starting with Python 3.5,</span> <span class="c1"># we&#39;ll be able to use an operator instead: macros @ cal_per_macro_col_vec</span> <span class="n">In</span> <span class="p">[</span><span class="mi">169</span><span class="p">]:</span> <span class="n">macros</span><span class="o">.</span><span class="n">dot</span><span class="p">(</span><span class="n">cal_per_macro_col_vec</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">169</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span> <span class="mf">26.7</span><span class="p">],</span> <span class="p">[</span> <span class="mf">136.1</span><span class="p">],</span> <span class="p">[</span> <span class="mf">104.4</span><span class="p">],</span> <span class="p">[</span> <span class="mf">162.8</span><span class="p">]])</span> </pre></div> <p>On my machine, using <tt class="docutils literal">dot</tt> is 4-5x faster than composing <tt class="docutils literal">sum</tt> with element-wise multiplication. Even though the latter is implemented in optimized C code in the guts of Numpy, it has the disadvantage of moving too much data around - computing the intermediate matrix representing the broadcasted multiplication is not really necessary for the end product. 
<tt class="docutils literal">dot</tt>, on the other hand, performs the operation in one step using a highly optimized <a class="reference external" href="https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms">BLAS routine</a>.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>For the pedantic: I'm taking these numbers from <a class="reference external" href="http://www.calorieking.com">http://www.calorieking.com</a>, and I subtract the fiber from total carbs because it doesn't count for the calories.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id7" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>About 30% faster than the &quot;stretching&quot; method. This is mostly due to the creation of the <tt class="docutils literal"><span class="pre">..._stretch</span></tt> array, which takes time. Once the stretched array is there, the broadcasting method is ~5% faster - this difference being due to a better use of memory (we don't <em>really</em> have to create the whole stretched array, do we? 
It's just repeating the same data so why waste so much memory?)</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>For the shape <tt class="docutils literal">(4, 3, 2)</tt> the trailing dimension is 2, and working from 2 &quot;backwards&quot; produces: 2, then 3, then 4.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id9" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>Following the usual Python convention, single-element tuples also have a comma, which helps us distinguish them from other entities.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id5"></a></td><td>More precisely, <tt class="docutils literal">(1, N)</tt> is the shape of a 1-by-N matrix (matrix with one row and N columns). An actual row vector is just a 1D array with the single-dimension shape <tt class="docutils literal">(10,)</tt>. For most purposes, the two are equivalent in Numpy.</td></tr> </tbody> </table> </div> Memory layout of multi-dimensional arrays2015-09-26T06:06:00-07:002015-09-26T06:06:00-07:00Eli Benderskytag:eli.thegreenplace.net,2015-09-26:/2015/memory-layout-of-multi-dimensional-arrays/<p>When working with multi-dimensional arrays, one important decision programmers have to make fairly early on in the project is what memory layout to use for storing the data, and how to access such data in the most efficient manner.
Since computer memory is inherently linear - a one-dimensional structure, mapping multi-dimensional …</p><p>When working with multi-dimensional arrays, one important decision programmers have to make fairly early on in the project is what memory layout to use for storing the data, and how to access such data in the most efficient manner. Since computer memory is inherently linear - a one-dimensional structure, mapping multi-dimensional data on it can be done in several ways. In this article I want to examine this topic in detail, talking about the various memory layouts available and their effect on the performance of the code.</p> <div class="section" id="row-major-vs-column-major"> <h2>Row-major vs. column-major</h2> <p>By far the two most common memory layouts for multi-dimensional array data are <em>row-major</em> and <em>column-major</em>.</p> <p>When working with 2D arrays (matrices), row-major vs. column-major are easy to describe. The row-major layout of a matrix puts the first row in contiguous memory, then the second row right after it, then the third, and so on. Column-major layout puts the first column in contiguous memory, then the second, etc.</p> <p>Higher dimensions are a bit more difficult to visualize, so let's start with some diagrams showing how 2D layouts work.</p> </div> <div class="section" id="d-row-major"> <h2>2D row-major</h2> <p>First, some notes on the nomenclature of this article. Computer memory will be represented as a linear array with low addresses on the left and high addresses on the right. Also, we're going to use programmer notation for matrices: rows and columns start with zero, at the top-left corner of the matrix. 
Row indices go over rows from top to bottom; column indices go over columns from left to right.</p> <p>As mentioned above, in row-major layout, the first row of the matrix is placed in contiguous memory, then the second, and so on:</p> <img alt="Row major 2D" class="align-center" src="https://eli.thegreenplace.net/images/2015/row-major-2D.png" /> <p>Another way to describe row-major layout is that <em>column indices change the fastest</em>. This should be obvious by looking at the linear layout at the bottom of the diagram. If you read the element index pairs from left to right, you'll notice that the column index changes all the time, and the row index only changes once per row.</p> <p>For programmers, another important observation is that given a row index <img alt="i_{row}" class="valign-m3" src="https://eli.thegreenplace.net/images/math/256e11c46808f68dec43d4a7b0e271f05d697785.png" style="height: 15px;" /> and a column index <img alt="i_{col}" class="valign-m3" src="https://eli.thegreenplace.net/images/math/e0ebbfb8bc0af1c2247c6c3f9119be855fed933d.png" style="height: 15px;" />, the offset of the element they denote in the linear representation is:</p> <img alt="$offset=i_{row}*NCOLS+i_{col}$" class="align-center" src="https://eli.thegreenplace.net/images/math/9161443cbcdff4891bbda9b82127634630ad8952.png" style="height: 16px;" /> <p>Where NCOLS is the number of columns per row in the matrix. It's easy to see this equation fits the linear layout in the diagram shown above.</p> </div> <div class="section" id="d-column-major"> <h2>2D column-major</h2> <p>Describing column-major 2D layout is just taking the description of row-major and replacing every appearance of &quot;row&quot; by &quot;column&quot; and vice versa. 
The first column of the matrix is placed in contiguous memory, then the second, and so on:</p> <img alt="Column major 2D" class="align-center" src="https://eli.thegreenplace.net/images/2015/column-major-2D.png" /> <p>In column-major layout, <em>row indices change the fastest</em>. The offset of an element in column-major layout can be found using this equation:</p> <img alt="$offset=i_{col}*NROWS+i_{row}$" class="align-center" src="https://eli.thegreenplace.net/images/math/ab533f15375dcdb69e7affdd1a4c835e146b7751.png" style="height: 16px;" /> <p>Where NROWS is the number of rows per column in the matrix.</p> </div> <div class="section" id="beyond-2d-indexing-and-layout-of-n-dimensional-arrays"> <h2>Beyond 2D - indexing and layout of N-dimensional arrays</h2> <p>Even though matrices are the most common multi-dimensional arrays programmers deal with, they are by no means the only ones. The notation of multi-dimensional arrays is fully generalizable to more than 2 dimensions. These entities are commonly called &quot;N-D arrays&quot; or &quot;tensors&quot;.</p> <p>When we move to 3D and beyond, it's best to leave the row/column notation of matrices behind. This is because this notation doesn't easily translate to 3 dimensions due to a <a class="reference external" href="https://eli.thegreenplace.net/2014/meshgrids-and-disambiguating-rows-and-columns-from-cartesian-coordinates/">common confusion</a> between rows, columns and the Cartesian coordinate system. In 4 dimensions and above, we lose any purely-visual intuition to describe multi-dimensional entities anyway, so it's best to stick to a consistent mathematical notation instead.</p> <p>So let's talk about some arbitrary number of dimensions <em>d</em>, numbered from 1 to <em>d</em>. 
For each dimension <img alt="1\leq i\leq d" class="valign-m3" src="https://eli.thegreenplace.net/images/math/05446b54bd23d571898ab5f1ad448f7ca767f19a.png" style="height: 16px;" />, <img alt="N_i" class="valign-m3" src="https://eli.thegreenplace.net/images/math/855336587fa59262965cdb9a2a6114933586800b.png" style="height: 15px;" /> is the size of the dimension. Also, the index of an element in dimension <img alt="i" class="valign-0" src="https://eli.thegreenplace.net/images/math/042dc4512fa3d391c5170cf3aa61e6a638f84342.png" style="height: 12px;" /> is <object class="valign-m3" data="https://eli.thegreenplace.net/images/math/5b05dd3722f57cd7ac250228f9a1aaf3af86311d.svg" style="height: 11px;" type="image/svg+xml">n_i</object>. For example, in the latest matrix diagram above (where column-layout is shown), we have <img alt="d=2" class="valign-0" src="https://eli.thegreenplace.net/images/math/8587fbaabf40db5bd2eb87f7ec6112beb7200253.png" style="height: 13px;" />. If we choose dimension 1 to be the row and dimension 2 to be the column, then <img alt="N_1=N_2=3" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2f780e2fb7cabe0948456a71d435b0a136de60f9.png" style="height: 16px;" />, and the element in the bottom-left corner of the matrix has <img alt="n_1=2" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0fb7ec5c44ec7de842ea61803aa4c9aec6412770.png" style="height: 16px;" /> and <img alt="n_2=0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/7d5ac2db0b27193cce99b1fe091ef5ce84eee9a9.png" style="height: 15px;" />.</p> <p>In row-major layout of multi-dimensional arrays, the <em>last</em> index is the fastest changing. 
In case of matrices the last index is columns, so this is equivalent to the previous definition.</p> <p>Given a <img alt="d" class="valign-0" src="https://eli.thegreenplace.net/images/math/3c363836cf4e16666669a25da280a1865c2d2874.png" style="height: 13px;" />-dimensional array, with the notation shown above, we compute the memory location of an element from its indices as:</p> <img alt="$offset=n_d + N_d \cdot (n_{d-1} + N_{d-1} \cdot (n_{d-2} + N_{d-2} \cdot (\cdots + N_2 n_1)\cdots))) = \sum_{i=1}^d \left( \prod_{j=i+1}^d N_j \right) n_i$" class="align-center" src="https://eli.thegreenplace.net/images/math/032b2eb5714fa4457fd349eec8f775c3e75584cd.png" style="height: 65px;" /> <p>For a matrix, <img alt="d=2" class="valign-0" src="https://eli.thegreenplace.net/images/math/8587fbaabf40db5bd2eb87f7ec6112beb7200253.png" style="height: 13px;" />, this reduces to:</p> <img alt="$offset=n_2 + N_2 \cdot n_1$" class="align-center" src="https://eli.thegreenplace.net/images/math/c80b9d7877819556e5059016a2426e9727e0f949.png" style="height: 16px;" /> <p>Which is exactly the formula we've seen above for row-major layout, just using a slightly more formal notation.</p> <p>Similarly, in column-major layout of multi-dimensional arrays, the <em>first</em> index is the fastest changing. 
Given a <img alt="d" class="valign-0" src="https://eli.thegreenplace.net/images/math/3c363836cf4e16666669a25da280a1865c2d2874.png" style="height: 13px;" />-dimensional array, we compute the memory location of an element from its indices as:</p> <img alt="$offset=n_1 + N_1 \cdot (n_2 + N_2 \cdot (n_3 + N_3 \cdot (\cdots + N_{d-1} n_d)\cdots))) = \sum_{i=1}^d \left( \prod_{j=1}^{i-1} N_j \right) n_i$" class="align-center" src="https://eli.thegreenplace.net/images/math/35ece5a7b18c317a71e6914ff62c7dd7840952cf.png" style="height: 65px;" /> <p>And again, for a matrix with <img alt="d=2" class="valign-0" src="https://eli.thegreenplace.net/images/math/8587fbaabf40db5bd2eb87f7ec6112beb7200253.png" style="height: 13px;" /> this reduces to the familiar:</p> <img alt="$offset=n_1+N_1\cdot n_2$" class="align-center" src="https://eli.thegreenplace.net/images/math/c084b1f35e57a402567ddc8058ea346d574cd207.png" style="height: 16px;" /> </div> <div class="section" id="example-in-3d"> <h2>Example in 3D</h2> <p>Let's see how this works out in 3D, which we can still visualize. Assuming 3 dimensions: rows, columns and depth. The following diagram shows the memory layout of a 3D array with <img alt="N_1=N_2=N_3=3" class="valign-m4" src="https://eli.thegreenplace.net/images/math/6275e16f8c4fe91d4591dedfec44bb859159bd4c.png" style="height: 16px;" />, in <em>row-major</em>:</p> <img alt="Row major 3D" class="align-center" src="https://eli.thegreenplace.net/images/2015/row-major-3D.png" /> <p>Note how the last dimension (depth, in this case) changes the fastest and the first (row) changes the slowest. The offset for a given element is:</p> <img alt="$offset=n_3+N_3*(n_2+N_2*n_1)$" class="align-center" src="https://eli.thegreenplace.net/images/math/3952a22345f3e71ecbf5b74899d875ca2b9035f2.png" style="height: 18px;" /> <p>For example, the offset of the element with indices 2,1,1 is 22.</p> <p>As an exercise, try to figure out how this array would be laid out in <em>column-major</em> order. 
But beware - there's a caveat! The term <em>column-major</em> may lead you to believe that columns are the slowest-changing index, but this is wrong. The <em>last</em> index is the slowest changing in column-major, and the last index here is depth, not columns. In fact, columns would be right in the middle in terms of change speed. This is exactly why in the discussion above I suggested dropping the row/column notation when going above 2D. In higher dimensions it becomes confusing, so it's best to refer to the relative change rate of the indices, since these are unambiguous.</p> <p>In fact, one could conceive of a hybrid (or &quot;mixed&quot;) layout where the second dimension changes faster than the first or the third. This would be neither row-major nor column-major, but in itself it's a consistent and perfectly valid layout that may benefit some applications. More details on why we would choose one layout over another come later in the article.</p> </div> <div class="section" id="history-fortran-vs-c"> <h2>History: Fortran vs. C</h2> <p>While knowing which layout a particular data set is using is critical for good performance, there's no single answer to the question of which layout &quot;is better&quot; in general. It's not much different from the big-endian vs. little-endian debate; what's important is to pick a consistent standard and stick to it. Unfortunately, as almost always happens in the world of computing, different programming languages and environments picked different standards.</p> <p>Among the programming languages still popular today, Fortran was definitely one of the pioneers. And Fortran (which is still very important for scientific computing) uses column-major layout. I read somewhere that the reason for this is that column vectors are more commonly used and considered &quot;canonical&quot; in linear algebra computations.
Personally I don't buy this, but you can make your own judgement.</p> <p>A slew of modern languages follow Fortran's lead - Matlab, R, Julia, to name a few. One of the strongest reasons for this is that they want to use LAPACK - a fast Fortran library for linear algebra, so using Fortran's layout makes sense.</p> <p>On the other hand, C and C++ use row-major layout. Following their example are a few other popular languages such as Python, Pascal and Mathematica. Since multi-dimensional arrays are a first-class type in the C language, the standard defines the layout very explicitly in section 6.5.2.1 <a class="footnote-reference" href="#id4" id="id1"></a>.</p> <p>In fact, having the first index change the slowest and the last index change the fastest makes sense if you think about how multi-dimensional arrays in C are indexed.</p> <p>Given the declaration:</p> <div class="highlight"><pre><span></span><span class="kt">int</span> <span class="n">x</span><span class="p">[</span><span class="mi">3</span><span class="p">][</span><span class="mi">5</span><span class="p">];</span> </pre></div> <p>Then <tt class="docutils literal">x</tt> is an array of 3 elements, each of which is an array of 5 integers. <tt class="docutils literal"><span class="pre">x[1]</span></tt> is the address of the second array of 5 integers contained in <tt class="docutils literal">x</tt>, and <tt class="docutils literal"><span class="pre">x[1][4]</span></tt> is the fifth integer of the second 5-integer array in <tt class="docutils literal">x</tt>. These indexing rules imply row-major layout.</p> <p>None of this is to say that C could not have chosen column-major layout. It could, but then its multi-dimensional array indexing rules would have to be different as well. The result could be just as consistent as what we have now.</p> <p>Moreover, since C lets you manipulate pointers, you can decide on the layout of data in your program by computing offsets into multi-dimensional arrays on your own.
In fact, this is how most C programs are written.</p> </div> <div class="section" id="memory-layout-example-numpy"> <h2>Memory layout example - numpy</h2> <p>So far we've discussed memory layout purely conceptually - using diagrams and mathematical formulae for index computations. It's worthwhile to see a &quot;real&quot; example of how multi-dimensional arrays are stored in memory. For this purpose, the Numpy library of Python is a great tool since it supports both layout kinds and is easy to play with from an interactive shell.</p> <p>The <a class="reference external" href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html">numpy.array constructor</a> can be used to create multi-dimensional arrays. One of the parameters it accepts is <tt class="docutils literal">order</tt>, which is either &quot;C&quot; for C-style layout (row-major) or &quot;F&quot; for Fortran-style layout (column-major). &quot;C&quot; is the default. Let's see how this looks:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">42</span><span class="p">]:</span> <span class="n">ar2d</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="p">[</span><span class="mi">11</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">13</span><span class="p">],</span> <span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">40</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="s1">&#39;uint8&#39;</span><span class="p">,</span> <span class="n">order</span><span class="o">=</span><span 
class="s1">&#39;C&#39;</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">43</span><span class="p">]:</span> <span class="s1">&#39; &#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="nb">ord</span><span class="p">(</span><span class="n">x</span><span class="p">))</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">ar2d</span><span class="o">.</span><span class="n">data</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">43</span><span class="p">]:</span> <span class="s1">&#39;1 2 3 11 12 13 10 20 40&#39;</span> </pre></div> <p>In &quot;C&quot; order, elements of rows are contiguous, as expected. Let's try Fortran layout now:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">44</span><span class="p">]:</span> <span class="n">ar2df</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">array</span><span class="p">([[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="p">[</span><span class="mi">11</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">13</span><span class="p">],</span> <span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">40</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="s1">&#39;uint8&#39;</span><span class="p">,</span> <span class="n">order</span><span class="o">=</span><span class="s1">&#39;F&#39;</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span 
class="mi">45</span><span class="p">]:</span> <span class="s1">&#39; &#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="nb">ord</span><span class="p">(</span><span class="n">x</span><span class="p">))</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">ar2df</span><span class="o">.</span><span class="n">data</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">45</span><span class="p">]:</span> <span class="s1">&#39;1 11 10 2 12 20 3 13 40&#39;</span> </pre></div> <p>For a more complex example, let's encode the following 3D array as a <tt class="docutils literal">numpy.array</tt> and see how it's laid out:</p> <img alt="Numeric 3D array" class="align-center" src="https://eli.thegreenplace.net/images/2015/numeric-3D-mat.png" /> <p>This array has two rows (first dimension), 4 columns (second dimension) and depth 2 (third dimension). 
As a nested Python list, this is its representation:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">47</span><span class="p">]:</span> <span class="n">lst3d</span> <span class="o">=</span> <span class="p">[[[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">11</span><span class="p">],</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">12</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">13</span><span class="p">],</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">14</span><span class="p">]],</span> <span class="p">[[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">15</span><span class="p">],</span> <span class="p">[</span><span class="mi">6</span><span class="p">,</span> <span class="mi">16</span><span class="p">],</span> <span class="p">[</span><span class="mi">7</span><span class="p">,</span> <span class="mi">17</span><span class="p">],</span> <span class="p">[</span><span class="mi">8</span><span class="p">,</span> <span class="mi">18</span><span class="p">]]]</span> </pre></div> <p>And the memory layout, in both C and Fortran orders:</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">50</span><span class="p">]:</span> <span class="n">ar3d</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">lst3d</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="s1">&#39;uint8&#39;</span><span class="p">,</span> <span class="n">order</span><span class="o">=</span><span class="s1">&#39;C&#39;</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span 
class="mi">51</span><span class="p">]:</span> <span class="s1">&#39; &#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="nb">ord</span><span class="p">(</span><span class="n">x</span><span class="p">))</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">ar3d</span><span class="o">.</span><span class="n">data</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">51</span><span class="p">]:</span> <span class="s1">&#39;1 11 2 12 3 13 4 14 5 15 6 16 7 17 8 18&#39;</span> <span class="n">In</span> <span class="p">[</span><span class="mi">52</span><span class="p">]:</span> <span class="n">ar3df</span> <span class="o">=</span> <span class="n">numpy</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">lst3d</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="s1">&#39;uint8&#39;</span><span class="p">,</span> <span class="n">order</span><span class="o">=</span><span class="s1">&#39;F&#39;</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">53</span><span class="p">]:</span> <span class="s1">&#39; &#39;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="nb">ord</span><span class="p">(</span><span class="n">x</span><span class="p">))</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">ar3df</span><span class="o">.</span><span class="n">data</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">53</span><span class="p">]:</span> <span class="s1">&#39;1 5 2 6 3 7 4 8 11 15 12 16 13 17 14 18&#39;</span> </pre></div> <p>Note that in C layout (row-major), the first 
dimension (rows) changes the slowest while the third dimension (depth) changes the fastest. In Fortran layout (column-major) the first dimension changes the fastest while the third dimension changes the slowest.</p> </div> <div class="section" id="performance-why-it-s-worth-caring-which-layout-your-data-is-in"> <h2>Performance: why it's worth caring which layout your data is in</h2> <p>After reading the article thus far, one may wonder why any of this matters. Isn't it just another divergence of standards, à la endianness? As long as we all agree on the layout, isn't this just a boring implementation detail? Why would we care about this?</p> <p>The answer is: performance. We're talking about numerical computing here (number crunching on large data sets) where performance is almost always critical. It turns out that matching the way your algorithm works with the data layout can make or break the performance of an application.</p> <p>The short takeaway is: <strong>always traverse the data in the order it was laid out</strong>. If your data sits in memory in row-major layout, iterate over each row before going to the next one, etc. The rest of the section will explain why this is so and will also present a benchmark with some measurements to get a feel for the consequences of this decision.</p> <p>There are two aspects of modern computer architecture that have a large impact on code performance and are relevant to our discussion: caching and vector units. When we iterate over each row of a row-major array, we access the array sequentially. This pattern has <a class="reference external" href="https://en.wikipedia.org/wiki/Locality_of_reference">spatial locality</a>, which makes the code perfect for cache optimization. Moreover, depending on the operations we do with the data, the CPU's vector unit can kick in since it also requires consecutive access.</p> <p>Graphically, it looks something like the following diagram.
Let's say we have the array: <tt class="docutils literal">int <span class="pre">array</span></tt>, and we iterate over each row, jumping to the next one when all the columns in the current one have been visited. The number within each gray cell is the memory address - it grows by 4 since this is an array of integers. The blue numbered arrow enumerates accesses in the order they are made:</p> <img alt="Row access pattern" class="align-center" src="https://eli.thegreenplace.net/images/2015/row-access-pattern.png" /> <p>Here, the optimal usage of caching and vector instructions should be obvious. Since we always access elements sequentially, this is the perfect scenario for the CPU's caches to kick in - we will <em>always hit the cache</em>. In fact, we always hit the fastest cache - L1, because the CPU correctly pre-fetches all data ahead.</p> <p>Moreover, since we always read one 32-bit word <a class="footnote-reference" href="#id5" id="id2"></a> after another, we can leverage the CPU's vector units to load the data (and perhaps process it later). The purple arrows show how this can be done with SSE vector loads that grab 128-bit chunks (four 32-bit words) at a time. In actual code, this can either be done with intrinsics or by relying on the compiler's auto-vectorizer (as we will soon see in an actual code sample).</p> <p>Contrast this with accessing this row-major data one <em>column</em> at a time, iterating over each column before moving to the next one:</p> <img alt="Column access pattern" class="align-center" src="https://eli.thegreenplace.net/images/2015/column-access-pattern.png" /> <p>We lose spatial locality here, unless the array is very narrow. If there are few columns, consecutive rows <em>may</em> be found in the cache. However, in more typical applications the arrays are large and when access #2 happens it's likely that the memory it accesses is nowhere to be found in the cache.
Unsurprisingly, we also lose the vector units since the accesses are not made to consecutive memory.</p> <p>But what should you do if your algorithm <em>needs</em> to access data column-by-column rather than row-by-row? Very simple! This is precisely what column-major layout is for. With column-major data, this access pattern will hit all the same architectural sweet spots we've seen with consecutive access on row-major data.</p> <p>The diagrams above should be convincing enough, but let's do some actual measurements to see just how dramatic these effects are.</p> <p>The full code for the benchmark is <a class="reference external" href="https://github.com/eliben/code-for-blog/tree/master/2015/benchmark-row-col-major">available here</a>, so I'll just show a few selected snippets. We'll start with a basic matrix type laid out in linear memory:</p> <div class="highlight"><pre><span></span><span class="c1">// A simple Matrix of unsigned integers laid out row-major in a 1D array. M is</span> <span class="c1">// number of rows, N is number of columns.</span> <span class="k">struct</span> <span class="n">Matrix</span> <span class="p">{</span> <span class="kt">unsigned</span><span class="o">*</span> <span class="n">data</span> <span class="o">=</span> <span class="k">nullptr</span><span class="p">;</span> <span class="kt">size_t</span> <span class="n">M</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">N</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="p">};</span> </pre></div> <p>The matrix uses row-major layout: its elements are accessed using this C expression:</p> <div class="highlight"><pre><span></span><span class="n">x</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="n">row</span> <span class="o">*</span> <span class="n">x</span><span class="p">.</span><span class="n">N</span> <span class="o">+</span> <span
class="n">col</span><span class="p">]</span> </pre></div> <p>Here's a function that adds two such matrices together, using a &quot;bad&quot; access pattern - iterating over the rows in each column before going to the next column. The access pattern is easy to spot by looking at the C code - the inner loop iterates over the fastest-changing index, and in this case it's the rows:</p> <div class="highlight"><pre><span></span><span class="kt">void</span> <span class="nf">AddMatrixByCol</span><span class="p">(</span><span class="n">Matrix</span><span class="o">&amp;</span> <span class="n">y</span><span class="p">,</span> <span class="k">const</span> <span class="n">Matrix</span><span class="o">&amp;</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="n">assert</span><span class="p">(</span><span class="n">y</span><span class="p">.</span><span class="n">M</span> <span class="o">==</span> <span class="n">x</span><span class="p">.</span><span class="n">M</span><span class="p">);</span> <span class="n">assert</span><span class="p">(</span><span class="n">y</span><span class="p">.</span><span class="n">N</span> <span class="o">==</span> <span class="n">x</span><span class="p">.</span><span class="n">N</span><span class="p">);</span> <span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">col</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">col</span> <span class="o">&lt;</span> <span class="n">y</span><span class="p">.</span><span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">col</span><span class="p">)</span> <span class="p">{</span> <span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">row</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">row</span> <span class="o">&lt;</span> <span class="n">y</span><span class="p">.</span><span
class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">row</span><span class="p">)</span> <span class="p">{</span> <span class="n">y</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="n">row</span> <span class="o">*</span> <span class="n">y</span><span class="p">.</span><span class="n">N</span> <span class="o">+</span> <span class="n">col</span><span class="p">]</span> <span class="o">+=</span> <span class="n">x</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="n">row</span> <span class="o">*</span> <span class="n">x</span><span class="p">.</span><span class="n">N</span> <span class="o">+</span> <span class="n">col</span><span class="p">];</span> <span class="p">}</span> <span class="p">}</span> <span class="p">}</span> </pre></div> <p>And here's a version that uses a better pattern, iterating over the columns in each row before going to the next row:</p> <div class="highlight"><pre><span></span><span class="kt">void</span> <span class="nf">AddMatrixByRow</span><span class="p">(</span><span class="n">Matrix</span><span class="o">&amp;</span> <span class="n">y</span><span class="p">,</span> <span class="k">const</span> <span class="n">Matrix</span><span class="o">&amp;</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="n">assert</span><span class="p">(</span><span class="n">y</span><span class="p">.</span><span class="n">M</span> <span class="o">==</span> <span class="n">x</span><span class="p">.</span><span class="n">M</span><span class="p">);</span> <span class="n">assert</span><span class="p">(</span><span class="n">y</span><span class="p">.</span><span class="n">N</span> <span class="o">==</span> <span class="n">x</span><span class="p">.</span><span class="n">N</span><span class="p">);</span> <span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">row</span> <span 
class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">row</span> <span class="o">&lt;</span> <span class="n">y</span><span class="p">.</span><span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">row</span><span class="p">)</span> <span class="p">{</span> <span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">col</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">col</span> <span class="o">&lt;</span> <span class="n">y</span><span class="p">.</span><span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">col</span><span class="p">)</span> <span class="p">{</span> <span class="n">y</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="n">row</span> <span class="o">*</span> <span class="n">y</span><span class="p">.</span><span class="n">N</span> <span class="o">+</span> <span class="n">col</span><span class="p">]</span> <span class="o">+=</span> <span class="n">x</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="n">row</span> <span class="o">*</span> <span class="n">x</span><span class="p">.</span><span class="n">N</span> <span class="o">+</span> <span class="n">col</span><span class="p">];</span> <span class="p">}</span> <span class="p">}</span> <span class="p">}</span> </pre></div> <p>How do the two access patterns compare? Based on the discussion in this article, we'd expect the by-row access pattern to be faster. But how much faster? And what role does vectorization play vs. efficient usage of cache?</p> <p>To try this, I ran the access patterns on matrices of various sizes, and added a variation of the by-row pattern where vectorization is disabled <a class="footnote-reference" href="#id6" id="id3"></a>. 
Here are the results; the vertical bars represent the bandwidth - how many billions of items (32-bit words) were processed (added) by the given function.</p> <img alt="Benchmark results" class="align-center" src="https://eli.thegreenplace.net/images/2015/rowcol-benchmark1.png" /> <p>Some observations:</p> <ul class="simple"> <li>For matrix sizes above 64x64, by-row access is significantly faster than by-column (6-8x, depending on size). In the case of 64x64, what I believe happens is that both matrices fit into the 32-KB L1 cache of my machine, so the by-column pattern actually manages to find the next row in cache. For larger sizes the matrices no longer fit in L1, so the by-column version has to go to L2 frequently.</li> <li>The vectorized version beats the non-vectorized one by 2-3x in all cases. On large matrices the speedup is a bit smaller; I think this is because at 256x256 and beyond the matrices no longer fit in L2 (my machine has 256KB of it) and need slower memory accesses. So the CPU spends a bit more time waiting for memory on average.</li> <li>The overall speedup of the vectorized by-row access over the by-column access is enormous - up to 25x for large matrices.</li> </ul> <p>I'll have to admit that, while I expected the by-row access to be faster, I didn't expect it to be <em>this much</em> faster. Clearly, choosing the proper access pattern for the memory layout of the data is absolutely crucial for the performance of an application.</p> </div> <div class="section" id="summary"> <h2>Summary</h2> <p>This article examined the issue of multi-dimensional array layout from multiple angles. The main takeaway is: know how your data is laid out and access it accordingly. In C-based programming languages, even though the default layout for 2D-arrays is row-major, when we use pointers to dynamically allocated data, we are free to choose whatever layout we like.
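</p> <p>The same freedom exists in NumPy, where the layout is a per-array property rather than a language rule. A small sketch of picking a layout and inspecting the resulting raw memory:</p>

```python
import numpy as np

# The same logical 2D array in both layouts; only the memory order differs.
a = np.arange(12, dtype='uint8').reshape(3, 4)  # row-major (C order) by default
f = np.asfortranarray(a)                        # column-major (Fortran order) copy

print(a.flags['C_CONTIGUOUS'], f.flags['F_CONTIGUOUS'])  # True True

# Logically identical - indexing is unaffected by the underlying layout.
print(a[1, 2] == f[1, 2])  # True

# But the raw memory differs (order='A' dumps bytes in the array's own order):
print(list(a.tobytes(order='A')))  # rows consecutive: 0, 1, 2, 3, 4, ...
print(list(f.tobytes(order='A')))  # columns consecutive: 0, 4, 8, 1, 5, ...
```

<p>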
After all, multi-dimensional arrays are just a logical abstraction above a linear storage system.</p> <p>Due to the wonders of modern CPU architectures, choosing the &quot;right&quot; way to access multi-dimensional data may result in colossal speedups; therefore, this is something that should always be on the programmer's mind when working on large multi-dimensional data sets.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>Taken from draft n1570 of the C11 standard.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id5" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>The term &quot;word&quot; used to be clearly associated with a 16-bit entity at some point in the past (with &quot;double word&quot; meaning 32 bits and so on), but these days it's too overloaded. In various references online you'll find &quot;word&quot; to be anything from 16 to 64 bits, depending on the CPU architecture. So I'm going to deliberately side-step the confusion by explicitly mentioning the bit size of words.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>See the <a class="reference external" href="https://github.com/eliben/code-for-blog/tree/master/2015/benchmark-row-col-major">benchmark repository</a> for the full details, including function attributes and compiler flags. 
A special thanks goes to <a class="reference external" href="https://twitter.com/nadavrot">Nadav Rotem</a> for helping me think through an issue I was initially having due to g++ ignoring my <tt class="docutils literal"><span class="pre">no-tree-vectorize</span></tt> attribute when inlining the function into the benchmark. I turned off inlining to fix this.</td></tr> </tbody> </table> </div> Change of basis in Linear Algebra2015-07-23T05:35:00-07:002015-07-23T05:35:00-07:00Eli Benderskytag:eli.thegreenplace.net,2015-07-23:/2015/change-of-basis-in-linear-algebra/<p>Knowing how to convert a vector to a different basis has many practical applications. Gilbert Strang has a nice quote about the importance of basis changes in his book <a class="footnote-reference" href="#id6" id="id1"></a> (emphasis mine):</p> <blockquote> The standard basis vectors for <img alt="\mathbb{R}^n" class="valign-0" src="https://eli.thegreenplace.net/images/math/98165cf6e8d5d442e040d1fa47aa6845f09294c5.png" style="height: 12px;" /> and <img alt="\mathbb{R}^m" class="valign-0" src="https://eli.thegreenplace.net/images/math/91d9290b46ace1360a8a715bd7a1fa701277697b.png" style="height: 12px;" /> are the columns of <em>I</em>. That choice leads to a standard matrix …</blockquote><p>Knowing how to convert a vector to a different basis has many practical applications. Gilbert Strang has a nice quote about the importance of basis changes in his book <a class="footnote-reference" href="#id6" id="id1"></a> (emphasis mine):</p> <blockquote> The standard basis vectors for <img alt="\mathbb{R}^n" class="valign-0" src="https://eli.thegreenplace.net/images/math/98165cf6e8d5d442e040d1fa47aa6845f09294c5.png" style="height: 12px;" /> and <img alt="\mathbb{R}^m" class="valign-0" src="https://eli.thegreenplace.net/images/math/91d9290b46ace1360a8a715bd7a1fa701277697b.png" style="height: 12px;" /> are the columns of <em>I</em>. 
That choice leads to a standard matrix, and <img alt="T(v)=Av" class="valign-m4" src="https://eli.thegreenplace.net/images/math/bb2fe0bcb727e67170597176144917877c871201.png" style="height: 18px;" /> in the normal way. But these spaces also have other bases, so the same <em>T</em> is represented by other matrices. <strong>A main theme of linear algebra is to choose the bases that give the best matrix for T</strong>.</blockquote> <p>This should serve as a good motivation, but I'll leave the applications for future posts; in this one, I will focus on the mechanics of basis change, starting from first principles.</p> <div class="section" id="the-basis-and-vector-components"> <h2>The basis and vector components</h2> <p>A <em>basis</em> of a vector space <img alt="V" class="valign-0" src="https://eli.thegreenplace.net/images/math/c9ee5681d3c59f7541c27a38b67edf46259e187b.png" style="height: 12px;" /> is a set of vectors in <img alt="V" class="valign-0" src="https://eli.thegreenplace.net/images/math/c9ee5681d3c59f7541c27a38b67edf46259e187b.png" style="height: 12px;" /> that is linearly independent and spans <img alt="V" class="valign-0" src="https://eli.thegreenplace.net/images/math/c9ee5681d3c59f7541c27a38b67edf46259e187b.png" style="height: 12px;" />. An <em>ordered basis</em> is a list, rather than a set, meaning that the order of the vectors in an ordered basis matters. This is important with respect to the topics discussed in this post.</p> <p>Let's now define <em>components</em>. 
If <img alt="U = u_1,u_2,...,u_n" class="valign-m4" src="https://eli.thegreenplace.net/images/math/7c4259c4e451f25663d0e2b0a5171ec904eacf1e.png" style="height: 16px;" /> is an ordered basis for <img alt="V" class="valign-0" src="https://eli.thegreenplace.net/images/math/c9ee5681d3c59f7541c27a38b67edf46259e187b.png" style="height: 12px;" /> and <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" /> is a vector in <img alt="V" class="valign-0" src="https://eli.thegreenplace.net/images/math/c9ee5681d3c59f7541c27a38b67edf46259e187b.png" style="height: 12px;" />, then there's a unique <a class="footnote-reference" href="#id7" id="id2"></a> list of scalars <img alt="c_1,c_2,...,c_n" class="valign-m4" src="https://eli.thegreenplace.net/images/math/aa078c610a3018c0b8a60fb3f7625854c7ee0667.png" style="height: 12px;" /> such that:</p> <img alt="$v = c_1u_1+c_2u_2+...+c_nu_n$" class="align-center" src="https://eli.thegreenplace.net/images/math/400f6b84c3ee13d328880d7b29bb7c467c868a33.png" style="height: 14px;" /> <p>These are called the <em>components</em> of <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" /> relative to the ordered basis <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" />. 
We'll introduce a useful piece of notation here: collect the components <img alt="c_1,c_2,...,c_n" class="valign-m4" src="https://eli.thegreenplace.net/images/math/aa078c610a3018c0b8a60fb3f7625854c7ee0667.png" style="height: 12px;" /> into a column vector and call it <img alt="[v]_{\text{\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/dc587d1ab07e4744144f02d47abbf148b6c339d4.png" style="height: 18px;" />: this is the <em>component vector</em> of <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" /> relative to the basis <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" />.</p> </div> <div class="section" id="example-finding-a-component-vector"> <h2>Example: finding a component vector</h2> <p>Let's use <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" /> as an example. <img alt="U=(2,3), (4,5)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/8bbbb50c23562c7c7dfe92d51af940199d7b366e.png" style="height: 18px;" /> is an ordered basis for <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" /> (since the two vectors in it are independent). Say we have <img alt="v=(2,4)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2c140a090873ddce6a3a86023428c2c72250791e.png" style="height: 18px;" />. What is <img alt="[v]_{\text{\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/dc587d1ab07e4744144f02d47abbf148b6c339d4.png" style="height: 18px;" />? 
We'll need to solve the system of equations:</p> <img alt="$\begin{pmatrix} 2 \\ 4 \end{pmatrix}=c_1\begin{pmatrix} 2 \\ 3\end{pmatrix}+c_2\begin{pmatrix} 4 \\ 5 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/1793547a360ef6b1c94da80aebb1698bcd69e20e.png" style="height: 43px;" /> <p>In the 2-D case this is trivial - the solution is <img alt="c_1=3" class="valign-m4" src="https://eli.thegreenplace.net/images/math/a3a4efbce7552fed25d7566f2aa7bb187d035471.png" style="height: 16px;" /> and <img alt="c_2=-1" class="valign-m3" src="https://eli.thegreenplace.net/images/math/fae1728a9257249837a269fc81efb617a999f2a7.png" style="height: 15px;" />. Therefore:</p> <img alt="$[v]_{\text {\tiny U}}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/548b7f57bfc01932593b5cdaa597b77f531bd03b.png" style="height: 43px;" /> <p>In the more general case of <img alt="\mathbb{R}^n" class="valign-0" src="https://eli.thegreenplace.net/images/math/98165cf6e8d5d442e040d1fa47aa6845f09294c5.png" style="height: 12px;" />, this is akin to solving a linear system of n equations with n variables. Since the basis vectors are, by definition, linearly independent, solving the system is simply inverting a matrix <a class="footnote-reference" href="#id8" id="id3"></a>.</p> </div> <div class="section" id="change-of-basis-matrix"> <h2>Change of basis matrix</h2> <p>Now comes the key part of the post. Say we have two different ordered bases for the same vector space: <img alt="U = u_1,u_2,...,u_n" class="valign-m4" src="https://eli.thegreenplace.net/images/math/7c4259c4e451f25663d0e2b0a5171ec904eacf1e.png" style="height: 16px;" /> and <img alt="W= w_1,w_2,...,w_n" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2eb203def2a32478b504f9250479c3f56defe9c9.png" style="height: 16px;" />. 
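</p> <p>(Before moving on, note that the component-vector computation from the previous section is just a linear solve, so it can be checked numerically. A sketch with NumPy, where the basis vectors form the <em>columns</em> of the matrix:)</p>

```python
import numpy as np

# Basis U = (2,3), (4,5) as the columns of a matrix, and v = (2,4).
U = np.array([[2.0, 4.0],
              [3.0, 5.0]])
v = np.array([2.0, 4.0])

# Solve U @ c = v for the components of v relative to U.
c = np.linalg.solve(U, v)
print(c)  # [ 3. -1.], matching c1 = 3, c2 = -1 above
```

<p>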
For some <img alt="v\in V" class="valign-m1" src="https://eli.thegreenplace.net/images/math/081239435d752122bef07934bbfe0662cc5228e6.png" style="height: 13px;" />, we can find <img alt="[v]_{\text{\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/9c044f231102324a9c84edf98b7f5f37bcdc2e2e.png" style="height: 18px;" /> and <img alt="[v]_{\text{\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/f8ae14559a972991d3c72d1014db829284f86f6a.png" style="height: 18px;" />. How are these two related?</p> <p>Surely, given <img alt="[v]_{\text{\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/dc587d1ab07e4744144f02d47abbf148b6c339d4.png" style="height: 18px;" /> we can find its coefficients in basis <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> the same way as we did in the example above <a class="footnote-reference" href="#id9" id="id4"></a>. It involves solving a linear system of <img alt="n" class="valign-0" src="https://eli.thegreenplace.net/images/math/d1854cae891ec7b29161ccaf79a24b00c274bdaa.png" style="height: 8px;" /> equations. We'll have to redo this operation for every vector <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" /> we want to convert. Is there a simpler way?</p> <p>Luckily for science, yes. The key here is to find how the basis vectors of <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> look in basis <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />. 
In other words, we have to find <img alt="[u_1]_{\text{\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/f53ee2bf40857eff2038d6543b07f0cbcf02a651.png" style="height: 18px;" />, <img alt="[u_2]_{\text{\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/f8f94de8a4ccd3f2d7594afd45caae1403310968.png" style="height: 18px;" /> and so on to <img alt="[u_n]_{\text{\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/3f2515b5f7e26704e96f99b9947392836689cf34.png" style="height: 18px;" />.</p> <p>Let's say we do that and find the coefficients to be <img alt="a_{ij}" class="valign-m6" src="https://eli.thegreenplace.net/images/math/f50d06328d8d076870d59691bb4b30fcf23c8f08.png" style="height: 14px;" /> such that:</p> <img alt="$\begin{matrix} u_1=a_{11}w_1+a_{21}w_2+...+a_{n1}w_n \\ u_2=a_{12}w_1+a_{22}w_2+...+a_{n2}w_n \\ ... \\ u_n=a_{1n}w_1+a_{2n}w_2+...+a_{nn}w_n \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/4c5dd3dc9c5d0acedafd6cd20c09a4498802577a.png" style="height: 80px;" /> <p>Now, given some vector <img alt="v \in V" class="valign-m1" src="https://eli.thegreenplace.net/images/math/bba1fc2f81d8879b4f45b8874c136db6be494079.png" style="height: 13px;" />, suppose its components in basis <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> are:</p> <img alt="$[v]_{\text{\tiny U}}=\begin{pmatrix} c_1 \\ c_2 \\ ... \\ c_n \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/83437d273512b461cab0f132626d1d64df3b32ae.png" style="height: 86px;" /> <p>Let's try to figure out how it looks in basis <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />. 
The above equation (by definition of components) is equivalent to:</p> <img alt="$v=c_1u_1+c_2u_2+...+c_nu_n$" class="align-center" src="https://eli.thegreenplace.net/images/math/fe3fcc27f581a058afb05460629e332bc2fae909.png" style="height: 14px;" /> <p>Substituting the expansion of the <img alt="u" class="valign-0" src="https://eli.thegreenplace.net/images/math/51e69892ab49df85c6230ccc57f8e1d1606caccc.png" style="height: 8px;" />s in basis <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />, we get:</p> <img alt="$v=\begin{matrix} c_1(a_{11}w_1+a_{21}w_2+...+a_{n1}w_n)+ \\ c_2(a_{12}w_1+a_{22}w_2+...+a_{n2}w_n)+ \\ ... \\ c_n(a_{1n}w_1+a_{2n}w_2+...+a_{nn}w_n) \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/450faf4b2cc27042f6b5fd90cdf39b6588f89e67.png" style="height: 84px;" /> <p>Reordering a bit to find the multipliers of each <img alt="w" class="valign-0" src="https://eli.thegreenplace.net/images/math/aff024fe4ab0fece4091de044c58c9ae4233383a.png" style="height: 8px;" />:</p> <img alt="$v=\begin{matrix} (c_1a_{11}+c_2a_{12}+...+c_na_{1n})w_1+ \\ (c_1a_{21}+c_2a_{22}+...+c_na_{2n})w_2+ \\ ... \\ (c_1a_{n1}+c_2a_{n2}+...+c_na_{nn})w_n \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/2504c20a0377cc5defb98814b0eed08043a2bfc3.png" style="height: 84px;" /> <p>By our definition of vector components, this equation is equivalent to:</p> <img alt="$[v]_{\text{\tiny W}}=\begin{pmatrix} c_1a_{11}+c_2a_{12}+...+c_na_{1n} \\ c_1a_{21}+c_2a_{22}+...+c_na_{2n} \\ ... 
\\ c_1a_{n1}+c_2a_{n2}+...+c_na_{nn} \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/0c385a0913584dc008dc0f088d429cf7fc432002.png" style="height: 86px;" /> <p>Now we're in vector notation again, so we can decompose the column vector on the right-hand side into:</p> <img alt="$[v]_{\text{\tiny W}}=\begin{pmatrix} a_{11} &amp;amp; a_{12} &amp;amp; ... &amp;amp; a_{1n} \\ a_{21} &amp;amp; a_{22} &amp;amp; ... &amp;amp; a_{2n} \\ ... &amp;amp; ... &amp;amp; ... \\ a_{n1} &amp;amp; a_{n2} &amp;amp; ... &amp;amp; a_{nn} \end{pmatrix}\begin{pmatrix}c_1 \\ c_2 \\ ... \\ c_n \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/b55ca14e97d3c8c1be864c00d4995f02f0406845.png" style="height: 86px;" /> <p>This is a matrix times a vector. The vector on the right is <img alt="[v]_{\text{\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/dc587d1ab07e4744144f02d47abbf148b6c339d4.png" style="height: 18px;" />. The matrix should look familiar too because it consists of those <img alt="a_{ij}" class="valign-m6" src="https://eli.thegreenplace.net/images/math/f50d06328d8d076870d59691bb4b30fcf23c8f08.png" style="height: 14px;" /> coefficients we've defined above. In fact, this matrix just represents the basis vectors of <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> expressed in basis <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />.
Let's call this matrix <img alt="A_{\text{\tiny U}\rightarrow \text{\tiny W}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e98bd0329cf77376132b69670177abb0c09fd70a.png" style="height: 16px;" /> - the change of basis matrix from <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> to <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />. It has <img alt="[u_1]_{\text{\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/f53ee2bf40857eff2038d6543b07f0cbcf02a651.png" style="height: 18px;" /> to <img alt="[u_n]_{\text{\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/3f2515b5f7e26704e96f99b9947392836689cf34.png" style="height: 18px;" /> laid out in its columns:</p> <img alt="$A_{\text{\tiny U}\rightarrow \text{\tiny W}}=\begin{pmatrix}[u_1]_{\text{\tiny W}},[u_2]_{\text{\tiny W}},...,[u_n]_{\text{\tiny W}}]\end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/48c37598af067782c747688c6f2f1b037539c14a.png" style="height: 22px;" /> <p>So we have:</p> <img alt="$[v]_{\text{\tiny W}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/40ce253d22d14c257a969dca84539e9d06be237d.png" style="height: 18px;" /> <p>To recap, given two bases <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> and <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />, we can spend some effort to compute the &quot;change of basis&quot; matrix <img alt="A_{\text{\tiny U}\rightarrow \text{\tiny W}}" class="valign-m4" 
src="https://eli.thegreenplace.net/images/math/e98bd0329cf77376132b69670177abb0c09fd70a.png" style="height: 16px;" />, but then we can easily convert any vector in basis <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> to basis <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> if we simply left-multiply it by this matrix.</p> <p>A reasonable question to ask at this point is - what about converting from <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> to <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" />? Well, since the computations above are completely generic and don't special-case either base, we can just flip the roles of <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> and <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> and get another change of basis matrix, <img alt="A_{\text{\tiny W}\rightarrow \text{\tiny U}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/7ead67eb4f93b24b2121304e1fa7fe62116cd30d.png" style="height: 16px;" /> - it converts vectors in base <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> to vectors in base <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> as follows:</p> <img alt="$[v]_{\text{\tiny U}}=A_{\text{\tiny W}\rightarrow \text{\tiny 
U}}[v]_{\text{\tiny W}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/11b3d1909fe5b2e306590dbb6c5ab4b99c911e43.png" style="height: 18px;" /> <p>And this matrix is:</p> <img alt="$A_{\text{\tiny W}\rightarrow \text{\tiny U}}=\begin{pmatrix}[w_1]_{\text{\tiny U}},[w_2]_{\text{\tiny U}},...,[w_n]_{\text{\tiny U}}]\end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/a3d5cd46635be2f2936ce51aa9c484d5c491a5be.png" style="height: 22px;" /> <p>We will soon see that the two change of basis matrices are intimately related; but first, an example.</p> </div> <div class="section" id="example-changing-bases-with-matrices"> <h2>Example: changing bases with matrices</h2> <p>Let's work through another concrete example in <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" />. We've used the basis <img alt="U=(2,3), (4,5)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/8bbbb50c23562c7c7dfe92d51af940199d7b366e.png" style="height: 18px;" /> before; let's use it again, and also add the basis <img alt="W=(-1,1), (1,1)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/880182e231343144829cea71b4a367c8308bfff1.png" style="height: 18px;" />. 
We've already seen that for <img alt="v=(2,4)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2c140a090873ddce6a3a86023428c2c72250791e.png" style="height: 18px;" /> we have:</p> <img alt="$[v]_{\text {\tiny U}}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/548b7f57bfc01932593b5cdaa597b77f531bd03b.png" style="height: 43px;" /> <p>Similarly, we can solve a set of two equations to find <img alt="[v]_{\text {\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/81eaa18746b62d0b9942378509bc40309c799e6a.png" style="height: 18px;" />:</p> <img alt="$[v]_{\text {\tiny W}}=\begin{pmatrix} 1 \\ 3 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/cef00c5fb242059409d28cfd7a9de38cb87839a3.png" style="height: 43px;" /> <p>OK, let's see how a change of basis matrix can be used to easily compute one given the other. First, to find <img alt="A_{\text{\tiny U}\rightarrow \text{\tiny W}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/b731f16fca15beb2c2a898400d28313be0adbfe9.png" style="height: 16px;" /> we'll need <img alt="[u_1]_{\text {\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/3eeb78c68e01e053bad97be52f362f84bf4ba536.png" style="height: 18px;" /> and <img alt="[u_2]_{\text {\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/a9401eed9ea04bfd2528cc46d8cc664b0fb5b35c.png" style="height: 18px;" />. We know how to do that. 
The result is:</p> <img alt="$[u_1]_{\text {\tiny W}}=\begin{pmatrix} 0.5 \\ 2.5 \end{pmatrix}\qquad[u_2]_{\text {\tiny W}}=\begin{pmatrix} 0.5 \\ 4.5 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/5eca6d9d809f75e136f1619b08ac6677448406d6.png" style="height: 43px;" /> <p>Now we can verify that given <img alt="[v]_{\text {\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/344b247c2488e81b7d0dcb75a6f5addd8746d0d9.png" style="height: 18px;" /> and <img alt="A_{\text{\tiny U}\rightarrow \text{\tiny W}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/e599bac152aa7b275b4aa6f8b9ad2104a400cd34.png" style="height: 16px;" />, we can easily find <img alt="[v]_{\text {\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/ef43978234a008fca8fb67a5534bd6880fb1771f.png" style="height: 18px;" />:</p> <img alt="$[v]_{\text{\tiny W}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}= \\ \begin{pmatrix} 0.5 &amp;amp; 0.5 \\ 2.5 &amp;amp; 4.5 \end{pmatrix} \\ \begin{pmatrix} 3 \\ -1 \end{pmatrix}=\\ \begin{pmatrix} 1 \\ 3 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/95af5086112fead5258aabba524167e746969d37.png" style="height: 43px;" /> <p>Indeed, it checks out! Let's also verify the other direction. 
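</p>

<p>The same computations are easy to reproduce numerically. Here's a NumPy sketch; the arrays just restate this example's numbers, with basis vectors laid out as matrix columns:</p>

```python
import numpy as np

# Basis vectors of U and W as matrix columns.
U = np.array([[2.0, 4.0],
              [3.0, 5.0]])
W = np.array([[-1.0, 1.0],
              [1.0, 1.0]])

# Each column of A_{U->W} is a u vector expressed in basis W, i.e. the
# solution of W x = u. np.linalg.solve handles all the columns at once.
A_U_to_W = np.linalg.solve(W, U)
print(A_U_to_W)      # [[0.5 0.5]
                     #  [2.5 4.5]]

v_U = np.array([3.0, -1.0])
v_W = A_U_to_W @ v_U
print(v_W)           # [1. 3.]
```

<p>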
To find <img alt="A_{\text{\tiny W}\rightarrow \text{\tiny U}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/7ead67eb4f93b24b2121304e1fa7fe62116cd30d.png" style="height: 16px;" /> we'll need <img alt="[w_1]_{\text {\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/6312e3ac9abfb423dcc52556e5fe037845c20cb6.png" style="height: 18px;" /> and <img alt="[w_2]_{\text {\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/783adfcaeb03f35b08d163ff1180bd038d777d0d.png" style="height: 18px;" />:</p> <img alt="$[w_1]_{\text {\tiny U}}=\begin{pmatrix} 4.5 \\ -2.5 \end{pmatrix}\qquad[w_2]_{\text {\tiny U}}=\begin{pmatrix}- 0.5 \\ 0.5 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/028bc2e59c73e299a1cdbaf05df5ed605b737512.png" style="height: 43px;" /> <p>And now to find <img alt="[v]_{\text {\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/344b247c2488e81b7d0dcb75a6f5addd8746d0d9.png" style="height: 18px;" />:</p> <img alt="$[v]_{\text{\tiny U}}=A_{\text{\tiny W}\rightarrow \text{\tiny U}}[v]_{\text{\tiny W}}= \\ \begin{pmatrix} 4.5 &amp;amp; -0.5 \\ -2.5 &amp;amp; 0.5 \end{pmatrix} \\ \begin{pmatrix} 1 \\ 3 \end{pmatrix}=\\ \begin{pmatrix} 3 \\ -1 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/4e697d4673dcffaf65875566c470a2113defa3b0.png" style="height: 43px;" /> <p>Checks out again! If you have a keen eye, or have recently spent some time solving linear algebra problems, you'll notice something interesting about the two basis change matrices used in this example. One is the inverse of the other! Is this some sort of coincidence? 
No - in fact, it's always true, and we can prove it.</p> </div> <div class="section" id="the-inverse-of-a-change-of-basis-matrix"> <h2>The inverse of a change of basis matrix</h2> <p>We've derived the change of basis matrix from <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> to <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> to perform the conversion:</p> <img alt="$[v]_{\text{\tiny W}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/40ce253d22d14c257a969dca84539e9d06be237d.png" style="height: 18px;" /> <p>Left-multiplying this equation by <img alt="A_{\text{\tiny W}\rightarrow \text{\tiny U}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2bcd0ad887dc21b0aa83a4d6dd234afebd2ed56a.png" style="height: 16px;" />:</p> <img alt="$A_{\text{\tiny W}\rightarrow \text{\tiny U}}[v]_{\text{\tiny W}}=\\ A_{\text{\tiny W}\rightarrow \text{\tiny U}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/487b88a4e75475a46621a4cdbffb7fc37e30c920.png" style="height: 18px;" /> <p>But the left-hand side is now, by our earlier definition, equal to <img alt="[v]_{\text{\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/dc587d1ab07e4744144f02d47abbf148b6c339d4.png" style="height: 18px;" />, so we get:</p> <img alt="$[v]_{\text{\tiny U}}=\\ A_{\text{\tiny W}\rightarrow \text{\tiny U}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/8d6e70905656f42e21896dda00ce2590f6218766.png" style="height: 18px;" /> <p>Since this is true for every vector <img alt="[v]_{\text{\tiny U}}" class="valign-m5" 
src="https://eli.thegreenplace.net/images/math/dc587d1ab07e4744144f02d47abbf148b6c339d4.png" style="height: 18px;" />, it must be that:</p> <img alt="$A_{\text{\tiny W}\rightarrow \text{\tiny U}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}=I$" class="align-center" src="https://eli.thegreenplace.net/images/math/ff93ca3481c29a874a9ab5e903321b1d6c4e38f0.png" style="height: 15px;" /> <p>From this, we can infer that <img alt="A_{\text{\tiny W}\rightarrow \text{\tiny U}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}^{-1}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/0055abbae1609478df5d1ff82e895db6c11a01d5.png" style="height: 20px;" /> and vice versa <a class="footnote-reference" href="#id10" id="id5"></a>.</p> </div> <div class="section" id="changing-to-and-from-the-standard-basis"> <h2>Changing to and from the standard basis</h2> <p>You may have noticed that in the examples above, we short-circuited a little bit of rigor by making up a vector (such as <img alt="v=(2,4)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2c140a090873ddce6a3a86023428c2c72250791e.png" style="height: 18px;" />) without explicitly specifying the basis its components are relative to. This is because we're so used to working with the &quot;standard basis&quot; we often forget it's there.</p> <p>The standard basis (let's call it <img alt="E" class="valign-0" src="https://eli.thegreenplace.net/images/math/e0184adedf913b076626646d3f52c3b49c39ad6d.png" style="height: 12px;" />) consists of unit vectors pointing in the directions of the axes of a Cartesian coordinate system. 
For <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" /> we have the basis vectors:</p> <img alt="$e_1=\begin{pmatrix} 1 \\ 0 \end{pmatrix}\qquad e_2=\begin{pmatrix} 0 \\ 1 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/cae83340f574d5d86529a8fc7e8bb578337b027b.png" style="height: 43px;" /> <p>And more generally in <img alt="\mathbb{R}^n" class="valign-0" src="https://eli.thegreenplace.net/images/math/98165cf6e8d5d442e040d1fa47aa6845f09294c5.png" style="height: 12px;" /> we have an ordered list of <img alt="n" class="valign-0" src="https://eli.thegreenplace.net/images/math/d1854cae891ec7b29161ccaf79a24b00c274bdaa.png" style="height: 8px;" /> vectors <img alt="\left\{ e_i:1\leq i \leq n \right\}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/465709e067cd5b2878edc2a90ccc3a6074bb1a25.png" style="height: 18px;" /> where <img alt="e_i" class="valign-m3" src="https://eli.thegreenplace.net/images/math/067d6602e65a6d628c3a60782ace6c359848f4bc.png" style="height: 11px;" /> has 1 in the <img alt="i" class="valign-0" src="https://eli.thegreenplace.net/images/math/042dc4512fa3d391c5170cf3aa61e6a638f84342.png" style="height: 12px;" />th position and zeros elsewhere.</p> <p>So when we say <img alt="v=(2,4)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2c140a090873ddce6a3a86023428c2c72250791e.png" style="height: 18px;" />, what we actually mean is:</p> <img alt="$\begin{matrix} v=2e_1+4e_2 \\[1em] [v]_{\text {\tiny E}}=\begin{pmatrix} 2 \\ 4 \end{pmatrix} \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/6e3fdb31e8f3be4dab7cd31785a9b649dfd472bd.png" style="height: 80px;" /> <p>The standard basis is so ingrained in our intuition of vectors that we usually neglect to mention it. This is fine, as long as we're only dealing with the standard basis. 
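</p>

<p>In code, the standard basis is just the columns of the identity matrix, and a vector's components relative to it are the vector itself. A small NumPy illustration:</p>

```python
import numpy as np

E = np.eye(2)              # the columns of the identity are e_1 and e_2
e1, e2 = E[:, 0], E[:, 1]

# Saying v = (2, 4) really means v = 2*e1 + 4*e2.
v = 2.0 * e1 + 4.0 * e2
print(v)                   # [2. 4.]
```

<p>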
Once change of basis is required, it's worthwhile to stick to a more consistent notation to avoid confusion. Moreover, it's often useful to change a vector's basis to or from the standard one. Let's see how that works. Recall how we use the change of basis matrix:</p> <img alt="$[v]_{\text{\tiny W}}=A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/40ce253d22d14c257a969dca84539e9d06be237d.png" style="height: 18px;" /> <p>Replacing the arbitrary basis <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> by the standard basis <img alt="E" class="valign-0" src="https://eli.thegreenplace.net/images/math/e0184adedf913b076626646d3f52c3b49c39ad6d.png" style="height: 12px;" /> in this equation, we get:</p> <img alt="$[v]_{\text{\tiny E}}=A_{\text{\tiny U}\rightarrow \text{\tiny E}}[v]_{\text{\tiny U}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/db4f797209370d41a5069009b16269967a6ba3ea.png" style="height: 18px;" /> <p>And <img alt="A_{\text{\tiny U}\rightarrow \text{\tiny E}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/640b44ab89c8033a037e90b0e661937ade5327a4.png" style="height: 16px;" /> is the matrix with <img alt="[u_1]_{\text {\tiny E}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/d74589a95920181b845aff60ea0ece107b2bd337.png" style="height: 18px;" /> to <img alt="[u_n]_{\text {\tiny E}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/2dbd88f177d5bc8bdd5ab5aac11c4b465d9a7406.png" style="height: 18px;" /> in its columns. But wait, these are just the basis vectors of <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" />! 
So finding the matrix <img alt="A_{\text{\tiny U}\rightarrow \text{\tiny E}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/640b44ab89c8033a037e90b0e661937ade5327a4.png" style="height: 16px;" /> for any given basis <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> is trivial - simply line up <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" />'s basis vectors as columns in their order to get a matrix. This means that any square, invertible matrix can be seen as a change of basis matrix from the basis spelled out in its columns to the standard basis. This is a natural consequence of how multiplying a matrix by a vector works by <a class="reference external" href="http://eli.thegreenplace.net/2015/visualizing-matrix-multiplication-as-a-linear-combination">linearly combining the matrix's columns</a>.</p> <p>OK, so we know how to find <img alt="[v]_{\text {\tiny E}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/8ce451ac6ebbdcc88171aa67947c14a62f81a6d8.png" style="height: 18px;" /> given <img alt="[v]_{\text {\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/83026ce3725f05dfcac554c4d85a364a840e8958.png" style="height: 18px;" />. What about the other way around? 
We'll need <img alt="A_{\text{\tiny E}\rightarrow \text{\tiny U}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d1ff7b0595a95dd95bac5689d042db02fdbabdf3.png" style="height: 16px;" /> for that, and we know that:</p> <img alt="$A_{\text{\tiny E}\rightarrow \text{\tiny U}}=A_{\text{\tiny U}\rightarrow \text{\tiny E}}^{-1}$" class="align-center" src="https://eli.thegreenplace.net/images/math/60679b364e239e7764aecc891a5579a3fc204ea3.png" style="height: 22px;" /> <p>Therefore:</p> <img alt="$[v]_{\text{\tiny U}}=\\ A_{\text{\tiny E}\rightarrow \text{\tiny U}}[v]_{\text{\tiny E}}=\\ A_{\text{\tiny U}\rightarrow \text{\tiny E}}^{-1}[v]_{\text{\tiny E}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/c9c3e8722c55c44dc21aec2ba823cdb0c1f8a5a0.png" style="height: 22px;" /> </div> <div class="section" id="chaining-basis-changes"> <h2>Chaining basis changes</h2> <p>What happens if we change a vector from one basis to another, and then change the resulting vector to yet another basis? 
I mean, for bases <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" />, <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> and <img alt="T" class="valign-0" src="https://eli.thegreenplace.net/images/math/c2c53d66948214258a26ca9ca845d7ac0c17f8e7.png" style="height: 12px;" /> and some arbitrary vector <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" />, we'll do:</p> <img alt="$A_{\text{\tiny W}\rightarrow \text{\tiny T}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/69c8d309f1e7fb5d65bd72570cdadc1769d315f0.png" style="height: 18px;" /> <p>This is simply applying the change of basis by matrix multiplication equation, twice:</p> <img alt="$A_{\text{\tiny W}\rightarrow \text{\tiny T}}(A_{\text{\tiny U}\rightarrow \text{\tiny W}}[v]_{\text{\tiny U}})=\\ A_{\text{\tiny W}\rightarrow \text{\tiny T}}[v]_{\text{\tiny W}}\\ =[v]_{\text{\tiny T}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/f80dd3f39e7e736972e31ec553e8541628ab038c.png" style="height: 19px;" /> <p>What this means is that changes of basis can be chained, which isn't surprising given their linear nature. 
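</p>

<p>Chaining is also easy to check numerically. In the NumPy sketch below, the helper <tt>change_of_basis</tt> and the third basis <tt>T</tt> are made up for illustration; basis vectors are laid out as matrix columns:</p>

```python
import numpy as np

# Basis vectors as columns; T is an arbitrary third basis for this sketch.
U = np.array([[2.0, 4.0],
              [3.0, 5.0]])
W = np.array([[-1.0, 1.0],
              [1.0, 1.0]])
T = np.array([[1.0, 1.0],
              [0.0, 1.0]])

def change_of_basis(X, Y):
    # A_{X->Y}: each column solves Y a = x for the matching column x of X.
    return np.linalg.solve(Y, X)

v_U = np.array([3.0, -1.0])

# Converting U -> W and then W -> T...
chained = change_of_basis(W, T) @ (change_of_basis(U, W) @ v_U)
# ...gives the same result as converting U -> T in one step.
direct = change_of_basis(U, T) @ v_U
print(np.allclose(chained, direct))   # True
```

<p>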
It also means that we've just found <img alt="A_{\text{\tiny U}\rightarrow \text{\tiny T}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/48f2afad29bdd5da5db7740b67f78541aa502ac6.png" style="height: 16px;" />, since we found how to transform <img alt="[v]_{\text{\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/dc587d1ab07e4744144f02d47abbf148b6c339d4.png" style="height: 18px;" /> to <img alt="[v]_{\text{\tiny T}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/d3758ce460ab170a8949bab249de63bc9bb0e739.png" style="height: 18px;" /> (using an intermediary basis <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />).</p> <img alt="$A_{\text{\tiny U}\rightarrow \text{\tiny T}}=\\ A_{\text{\tiny W}\rightarrow \text{\tiny T}}A_{\text{\tiny U}\rightarrow \text{\tiny W}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/e1235761ce18868df7f936908c80f49c464550bc.png" style="height: 15px;" /> <p>Finally, let's say that the intermediary basis is not just some arbitrary <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />, but the standard basis <img alt="E" class="valign-0" src="https://eli.thegreenplace.net/images/math/e0184adedf913b076626646d3f52c3b49c39ad6d.png" style="height: 12px;" />. 
So we have:</p> <img alt="$A_{\text{\tiny U}\rightarrow \text{\tiny T}}=\\ A_{\text{\tiny E}\rightarrow \text{\tiny T}}A_{\text{\tiny U}\rightarrow \text{\tiny E}}=\\ A_{\text{\tiny T}\rightarrow \text{\tiny E}}^{-1}A_{\text{\tiny U}\rightarrow \text{\tiny E}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/5e2c2ecd7ad9e15cfa07d8d9f5eef1c26479c4cd.png" style="height: 22px;" /> <p>We prefer the last form, since finding <img alt="A_{\text{\tiny U}\rightarrow \text{\tiny E}}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/35e8996ab14387788bcd66d1a160d0458efdc05f.png" style="height: 16px;" /> for any basis <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> is, as we've seen above, trivial.</p> </div> <div class="section" id="example-standard-basis-and-chaining"> <h2>Example: standard basis and chaining</h2> <p>It's time to solidify the ideas of the last two sections with a concrete example. We'll use our familiar bases <img alt="U=(2,3), (4,5)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/8bbbb50c23562c7c7dfe92d51af940199d7b366e.png" style="height: 18px;" /> and <img alt="W=(-1,1), (1,1)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/880182e231343144829cea71b4a367c8308bfff1.png" style="height: 18px;" /> from the previous example, along with the standard basis for <img alt="\mathbb{R}^2" class="valign-0" src="https://eli.thegreenplace.net/images/math/2b688757b3d0949451e1fa97e71ac5f5f284a5e4.png" style="height: 15px;" />. 
Previously, we transformed a vector <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" /> from <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> to <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> and vice-versa using the change of basis matrices between these bases. This time, let's do it by chaining via the standard basis.</p> <p>We'll pick <img alt="v=(2,4)" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2c140a090873ddce6a3a86023428c2c72250791e.png" style="height: 18px;" />. Formally, the components of <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" /> relative to the standard basis are:</p> <img alt="$[v]_{\text{\tiny E}} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/69766b4a4bc3f500ac1f25d3367774958f163084.png" style="height: 43px;" /> <p>In the last example we've already computed the components of <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" /> relative to <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> and <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" />:</p> <img alt="$[v]_{\text {\tiny U}}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}\qquad [v]_{\text {\tiny W}}=\begin{pmatrix} 1 \\ 3 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/c7cac93827ffa171be55db031004356516fb98fa.png" 
style="height: 43px;" /> <p>Previously, one was computed from the other using the &quot;direct&quot; basis change matrices from <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> to <img alt="W" class="valign-0" src="https://eli.thegreenplace.net/images/math/e2415cb7f63df0c9de23362326ad3c37a9adfc96.png" style="height: 12px;" /> and vice versa. Now we can use chaining via the standard basis to achieve the same result. For example, we know that:</p> <img alt="$[v]_{\text{\tiny W}}=\\ A_{\text{\tiny E}\rightarrow \text{\tiny W}}A_{\text{\tiny U}\rightarrow \text{\tiny E}}[v]_{\text{\tiny U}}$" class="align-center" src="https://eli.thegreenplace.net/images/math/ec831bdd78639c2bf290e705c7efb0cb4908cd16.png" style="height: 18px;" /> <p>Finding the change of basis matrices from some basis to <img alt="E" class="valign-0" src="https://eli.thegreenplace.net/images/math/e0184adedf913b076626646d3f52c3b49c39ad6d.png" style="height: 12px;" /> is just laying out the basis vectors as columns, so we immediately know that:</p> <img alt="$A_{\text{\tiny U}\rightarrow \text{\tiny E}}=\begin{pmatrix} 2 &amp;amp; 4\\ 3 &amp;amp; 5 \end{pmatrix}\qquad \qquad \\ A_{\text{\tiny W}\rightarrow \text{\tiny E}}=\begin{pmatrix} -1 &amp;amp; 1\\ 1 &amp;amp; 1 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/56debe1dbb75f938c25fe7b4645b6602bb13e637.png" style="height: 43px;" /> <p>The change of basis matrix from <img alt="E" class="valign-0" src="https://eli.thegreenplace.net/images/math/e0184adedf913b076626646d3f52c3b49c39ad6d.png" style="height: 12px;" /> to some basis is the inverse, so by inverting the above matrices we find:</p> <img alt="$A_{\text{\tiny E}\rightarrow \text{\tiny U}}=A_{\text{\tiny U}\rightarrow \text{\tiny E}}^{-1}=\begin{pmatrix} -2.5 &amp;amp; 2 \\ 1.5 &amp;amp; -1 \end{pmatrix}\qquad \qquad \\ A_{\text{\tiny E}\rightarrow \text{\tiny 
W}}=A_{\text{\tiny W}\rightarrow \text{\tiny E}}^{-1}=\begin{pmatrix} -0.5 &amp;amp; 0.5 \\ 0.5 &amp;amp; 0.5 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/8810ea8701a2aba7df82e88302df93c890a26e26.png" style="height: 43px;" /> <p>Now we have all we need to find <img alt="[v]_{\text{\tiny W}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/f8ae14559a972991d3c72d1014db829284f86f6a.png" style="height: 18px;" /> from <img alt="[v]_{\text{\tiny U}}" class="valign-m5" src="https://eli.thegreenplace.net/images/math/dc587d1ab07e4744144f02d47abbf148b6c339d4.png" style="height: 18px;" />:</p> <img alt="$[v]_{\text{\tiny W}}=\\ A_{\text{\tiny E}\rightarrow \text{\tiny W}}A_{\text{\tiny U}\rightarrow \text{\tiny E}}[v]_{\text{\tiny U}}=\begin{pmatrix} -0.5 &amp;amp; 0.5 \\ 0.5 &amp;amp; 0.5 \end{pmatrix}\begin{pmatrix} 2 &amp;amp; 4\\ 3 &amp;amp; 5 \end{pmatrix}\begin{pmatrix} 3 \\ -1 \end{pmatrix}=\begin{pmatrix} 1 \\ 3 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/1f5875aba56faf3fda3c9f4c72b1421529958116.png" style="height: 43px;" /> <p>The other direction can be done similarly.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td><em>Introduction to Linear Algebra</em>, 4th edition, section 7.2</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id7" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>Why is this list unique? 
Because given a basis <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" /> for a vector space <img alt="V" class="valign-0" src="https://eli.thegreenplace.net/images/math/c9ee5681d3c59f7541c27a38b67edf46259e187b.png" style="height: 12px;" />, every <img alt="v\in V" class="valign-m1" src="https://eli.thegreenplace.net/images/math/081239435d752122bef07934bbfe0662cc5228e6.png" style="height: 13px;" /> can be expressed <em>uniquely</em> as a linear combination of the vectors in <img alt="U" class="valign-0" src="https://eli.thegreenplace.net/images/math/b2c7c0caa10a0cca5ea7d69e54018ae0c0389dd6.png" style="height: 12px;" />. The proof for this is very simple - just assume there are two different ways to express <img alt="v" class="valign-0" src="https://eli.thegreenplace.net/images/math/7a38d8cbd20d9932ba948efaa364bb62651d5ad4.png" style="height: 8px;" /> - two alternative sets of components. Subtract one from the other and use linear independence of the basis vectors to conclude that the two ways must be the same one.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>The matrix here has the basis vectors laid out in its columns. Since the basis vectors are independent, the matrix is invertible. 
In our small example, the matrix equation we're looking to solve is:</td></tr> </tbody> </table> <img alt="$\begin{pmatrix} 2 &amp;amp; 4 \\ 3 &amp;amp; 5 \end{pmatrix}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}=\begin{pmatrix} 2 \\ 4 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/6d840237e5940eaadf2002f888e8537e48e90158.png" style="height: 43px;" /> <table class="docutils footnote" frame="void" id="id9" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>The example converts from the standard basis to some other basis, but converting from a non-standard basis to another requires exactly the same steps: we try to find coefficients such that a combination of some set of basis vectors adds up to some components in another basis.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id5"></a></td><td>For square matrices <img alt="A" class="valign-0" src="https://eli.thegreenplace.net/images/math/6dcd4ce23d88e2ee9568ba546c007c63d9131c1b.png" style="height: 12px;" /> and <img alt="B" class="valign-0" src="https://eli.thegreenplace.net/images/math/ae4f281df5a5d0ff3cad6371f76d5c29b6d953ec.png" style="height: 12px;" />, if <img alt="AB=I" class="valign-0" src="https://eli.thegreenplace.net/images/math/845d8defe3847392f0e4b18b07786cd7f47ddf74.png" style="height: 12px;" /> then also <img alt="BA=I" class="valign-0" src="https://eli.thegreenplace.net/images/math/b1d87d31f656d8634f1e2d862810272a919c2806.png" style="height: 12px;" />.</td></tr> </tbody> </table> </div> The Normal Equation and matrix calculus2015-05-27T06:19:00-07:002015-05-27T06:19:00-07:00Eli Benderskytag:eli.thegreenplace.net,2015-05-27:/2015/the-normal-equation-and-matrix-calculus/<p>A few months ago I wrote <a 
class="reference external" href="http://eli.thegreenplace.net/2014/derivation-of-the-normal-equation-for-linear-regression">a post</a> on formulating the Normal Equation for linear regression. A crucial part in the formulation is using <a class="reference external" href="http://en.wikipedia.org/wiki/Matrix_calculus">matrix calculus</a> to compute a scalar-by-vector derivative. I didn't spend much time explaining how this step works, instead remarking:</p> <blockquote> Deriving by a vector may feel uncomfortable, but there's nothing to worry about. Recall that here we only use matrix notation to conveniently represent a system of linear formulae. So we derive by each component of the vector, and then combine the resulting derivatives into a vector again.</blockquote> <p>According to the comments received on the post, folks didn't find this convincing and asked for more details. One commenter even said that &quot;matrix calculus feels handwavy&quot;, something which I fully agree with. The reason matrix calculus feels handwavy is that it's not as commonly encountered as &quot;regular&quot; calculus, and hence its identities and intuitions are not as familiar. However, there's really not that much to it, as I want to show here.</p> <p>Let's get started with a simple example, which I'll use to demonstrate the principles. 
Say we have the function:</p> <img alt="$f(v)=a^Tv$" class="align-center" src="https://eli.thegreenplace.net/images/math/94f87149715376908db65a00f793836a4b2092a9.png" style="height: 21px;" /> <p>Where <strong>a</strong> and <strong>v</strong> are vectors with <em>n</em> components <a class="footnote-reference" href="#id4" id="id1"></a>. We want to compute its derivative by <strong>v</strong>. But wait, while a &quot;regular&quot; derivative by a scalar is clearly defined (using limits), what does deriving by a vector mean? It simply means that we derive by each component of the vector separately, and then combine the results into a new vector <a class="footnote-reference" href="#id5" id="id2"></a>. In other words:</p> <img alt="$\frac{\partial f}{\partial v}=\begin{pmatrix}\frac{\partial f}{\partial v_1}\\[1em] \frac{\partial f}{\partial v_2}\\ ...\\ \frac{\partial f}{\partial v_n}\\[1em] \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/13d227107c5323f47460ad077504fda60726d933.png" style="height: 131px;" /> <p>Let's see how this works out for our function <em>f</em>. 
It may be more convenient to rewrite it by using components rather than vector notation:</p> <img alt="$f(v)=a^Tv=a_1v_1+a_2v_2+...+a_nv_n$" class="align-center" src="https://eli.thegreenplace.net/images/math/e9e17e44bb85d825f304b09247a7f3cfbe11f64e.png" style="height: 21px;" /> <p>Computing the derivatives by each component, we'll get:</p> <img alt="$\begin{matrix}\frac{\partial f}{\partial v_1}=a_1\\[1em] \frac{\partial f}{\partial v_2}=a_2\\ ...\\ \frac{\partial f}{\partial v_n}=a_n\\[1em] \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/768563e5b7e2e8cddd00830e9b945419f598e4bb.png" style="height: 114px;" /> <p>So we have a sequence of partial derivatives, which we combine into a vector:</p> <img alt="$\frac{\partial f}{\partial v}=\begin{pmatrix}a_1\\ ...\\ a_n\\ \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/b13cc64568603d73240709c1fb49cfcc7f2a2b62.png" style="height: 65px;" /> <p>Or, in other words <img alt="\frac{\partial f}{\partial v}=a" class="valign-m7" src="https://eli.thegreenplace.net/images/math/1f3eaea99f7fab11ac1b70dc8b618635a9ed4c91.png" style="height: 25px;" />.</p> <p>This example demonstrates the algorithm for computing scalar-by-vector derivatives:</p> <ol class="arabic simple"> <li>Figure out what the dimensions of all vectors and matrices are.</li> <li>Expand the vector equations into their full form (a multiplication of two vectors is either a scalar or a matrix, depending on their orientation, etc.) Note that this will end up with a scalar.</li> <li>Compute the derivative of the scalar by each component of the variable vector separately.</li> <li>Combine the derivatives into a vector.</li> </ol> <p>Similarly to regular calculus, matrix and vector calculus rely on a set of identities to make computations more manageable. 
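Before reaching for identities, the result above can be verified mechanically: approximate each partial derivative with a central finite difference and compare against a. A minimal NumPy sketch (the vectors here are made-up example data):

```python
import numpy as np

# f(v) = a^T v for a fixed vector a; the claim is that df/dv = a.
a = np.array([2.0, -1.0, 3.0])
f = lambda v: a @ v

v0 = np.array([0.5, 1.5, -2.0])   # an arbitrary point
eps = 1e-6

# Central finite difference for each component of v separately.
grad = np.zeros_like(v0)
for i in range(len(v0)):
    step = np.zeros_like(v0)
    step[i] = eps
    grad[i] = (f(v0 + step) - f(v0 - step)) / (2 * eps)

assert np.allclose(grad, a)   # the componentwise derivatives collect into a
```

Since f is linear, the finite difference is exact up to floating point, and the gradient is a regardless of the point v0 chosen.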
We can either go the hard way (computing the derivative of each function from basic principles using limits), or the easy way - applying the plethora of convenient identities that were developed to make this task simpler. The identity for computing the derivative of <img alt="a^Tv" class="valign-0" src="https://eli.thegreenplace.net/images/math/ea7bffcd29c6bad40e358ad7313102670fb1a9cf.png" style="height: 15px;" /> shown above plays the role of <img alt="\frac{d}{dx}ax=a" class="valign-m6" src="https://eli.thegreenplace.net/images/math/999f262480b3690892d0af5651b96160d924997e.png" style="height: 22px;" /> in regular calculus.</p> <p>Now we have the tools to understand how the vector derivatives in the <a class="reference external" href="http://eli.thegreenplace.net/2014/derivation-of-the-normal-equation-for-linear-regression">normal equation article</a> were computed. As a reminder, this is the matrix form of the cost function <em>J</em>:</p> <img alt="$J(\theta)=\theta^TX^TX\theta-2(X\theta)^Ty+y^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/2864b88546c007a79dc92271f5e01487ba608e43.png" style="height: 21px;" /> <p>And we're interested in computing <img alt="\frac{\partial J}{\partial \theta}" class="valign-m7" src="https://eli.thegreenplace.net/images/math/27ffac3eede7fce0b342abf8fc10d29f24c68263.png" style="height: 24px;" />. The equation for <em>J</em> consists of three terms added together. The last one <img alt="y^Ty" class="valign-m4" src="https://eli.thegreenplace.net/images/math/81015d6225923cec985bef47ca151ef1cb654c92.png" style="height: 19px;" /> doesn't contribute to the derivative because it doesn't depend on the variable. 
Let's start looking at the second (since it's simpler than the first) - and give it a name, for convenience:</p> <img alt="$P(\theta)=2(X\theta)^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/35d3ddf05898e8bc2085030aa399ce98318674f9.png" style="height: 21px;" /> <p>We'll start by recalling what all the dimensions are. <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> is a vector of n components. <img alt="y" class="valign-m4" src="https://eli.thegreenplace.net/images/math/95cb0bfd2977c761298d9624e4b4d4c72a39974a.png" style="height: 12px;" /> is a vector of m components. <img alt="X" class="valign-0" src="https://eli.thegreenplace.net/images/math/c032adc1ff629c9b66f22749ad667e6beadf144b.png" style="height: 12px;" /> is an m-by-n matrix.</p> <p>Let's see what <em>P</em> expands to <a class="footnote-reference" href="#id6" id="id3"></a>:</p> <img alt="$P(\theta)=2\left [ \begin{pmatrix} x_1_1 &amp;amp; x_1_2 &amp;amp; ... &amp;amp; x_1_n\\ x_2_1 &amp;amp; ... &amp;amp; ... &amp;amp; x_2_n\\ ...\\ x_m_1 &amp;amp; ... &amp;amp; ... 
&amp;amp; x_m_n\\ \end{pmatrix}\begin{pmatrix} \theta_1\\ \theta_2\\ ...\\ \theta_n\\ \end{pmatrix} \right ]^T\begin{pmatrix} y_1\\ y_2\\ ...\\ y_m\\ \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/a7873ed04e274b30852e0f8d9450b5abc200ac17.png" style="height: 91px;" /> <p>Computing the matrix-by-vector multiplication inside the parens:</p> <img alt="$P(x)=2\left [ \begin{pmatrix} x_1_1\theta_1+...+x_1_n\theta_n\\ x_2_1\theta_1+...+x_2_n\theta_n\\ ...\\ x_m_1\theta_1+...+x_m_n\theta_n \end{pmatrix} \right ]^T\begin{pmatrix} y_1\\ y_2\\ ...\\ y_m\\ \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/6b9b8e2335579f352a19ef3da609be2e8b2d9925.png" style="height: 91px;" /> <p>And finally, multiplying the two vectors together:</p> <img alt="$P(x)=2(x_1_1\theta_1+...+x_1_n\theta_n)y_1+2(x_2_1\theta_1+...+x_2_n\theta_n)y_2+...+2(x_m_1\theta_1+...+x_m_n\theta_n)y_m$" class="align-center" src="https://eli.thegreenplace.net/images/math/3271758ac98b149969516dd809fd35b90aacf056.png" style="height: 18px;" /> <p>Working with such formulae makes you appreciate why mathematicians have long ago come up with shorthand notations like &quot;sigma&quot; summation:</p> <img alt="$P(x)=2\sum_{r=1}^{m}y_r(x_r_1\theta_1+...+x_r_n\theta_n)=2\sum_{r=1}^{m}y_r\sum_{c=1}^{n}x_r_c\theta_c$" class="align-center" src="https://eli.thegreenplace.net/images/math/6c71eb575ab3fafbc7b268be33d0d17a37bb1553.png" style="height: 50px;" /> <p>OK, so we've finally completed step 2 of the algorithm - we have the scalar equation for <em>P</em>. 
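As a sanity check that the expansion didn't lose anything, the sigma form can be compared against the original matrix form on small random data (sizes and values here are arbitrary):

```python
import numpy as np

# Small random instance just to exercise the algebra.
rng = np.random.default_rng(0)
m, n = 4, 3
X = rng.standard_normal((m, n))
theta = rng.standard_normal(n)
y = rng.standard_normal(m)

# Matrix form: P = 2 (X theta)^T y
P_matrix = 2 * (X @ theta) @ y

# Fully expanded sigma form: 2 * sum_r y_r * (sum_c x_rc * theta_c)
P_sigma = 2 * sum(y[r] * sum(X[r, c] * theta[c] for c in range(n))
                  for r in range(m))

assert np.isclose(P_matrix, P_sigma)
```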
Now it's time to compute its derivative by each <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" />:</p> <img alt="$\begin{matrix} \frac{\partial P}{\partial \theta_1}=2(x_1_1y_1+...+x_m_1y_m)\\[1em] \frac{\partial P}{\partial \theta_2}=2(x_1_2y_1+...+x_m_2y_m)\\ ...\\ \frac{\partial P}{\partial \theta_n}=2(x_1_ny_1+...+x_m_ny_m) \end{matrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/889eb3c4e50b4fbdf5380c4d4e31ac4c0c09dddd.png" style="height: 111px;" /> <p>Now comes the most interesting part. If we treat <img alt="\frac{\partial P}{\partial \theta}" class="valign-m7" src="https://eli.thegreenplace.net/images/math/3c653fa292156c8914f1463fcb6869633d37487c.png" style="height: 24px;" /> as a vector of n components, we can rewrite this set of equations using a matrix-by-vector multiplication:</p> <img alt="$\frac{\partial P}{\partial \theta}=2X^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/7f75aa0f038ca73c58e95ef604ffb54468a18ae2.png" style="height: 38px;" /> <p>Take a moment to convince yourself this is true. It's just collecting the individual components of <strong>X</strong> into a matrix and the individual components of <strong>y</strong> into a vector. Since <strong>X</strong> is an m-by-n matrix and <strong>y</strong> is an m-by-1 column vector, the dimensions work out and the result is an n-by-1 column vector.</p> <p>So we've just computed the second term of the vector derivative of <em>J</em>. 
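If you'd rather have the computer do the convincing, the same finite-difference trick from before confirms the result on random data (a sketch; sizes and values are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)

P = lambda theta: 2 * (X @ theta) @ y   # scalar-valued function of theta

theta0 = rng.standard_normal(n)
eps = 1e-6
I = np.eye(n)

# Finite-difference gradient, one theta component at a time.
num_grad = np.array([(P(theta0 + eps * I[i]) - P(theta0 - eps * I[i])) / (2 * eps)
                     for i in range(n)])

# Matches the derived identity dP/dtheta = 2 X^T y.
assert np.allclose(num_grad, 2 * X.T @ y, atol=1e-6)
```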
In the process, we've discovered a useful vector derivative identity for a matrix <strong>X</strong> and vectors <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> and <strong>y</strong>:</p> <img alt="$\frac{\partial (X\theta)^T y}{\partial \theta}=X^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/bf7325787bc464f067372a6d4ed612ea514d29b6.png" style="height: 41px;" /> <p>OK, now let's get back to the full definition of <em>J</em> and see how to compute the derivative of its first term. We'll give it the name <em>Q</em>:</p> <img alt="$Q(\theta)=\theta^TX^TX\theta$" class="align-center" src="https://eli.thegreenplace.net/images/math/0031acbab8dba6cef63f2605a15a0b7bc826766a.png" style="height: 21px;" /> <p>This derivation is somewhat more complex, since <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> appears twice in the equation. Here's the equation again with all the matrices and vectors fully laid out (note that I've already done the transposes):</p> <img alt="$Q(\theta)=(\theta_1...\theta_n)\begin{pmatrix}x_1_1 &amp;amp; x_2_1 &amp;amp; ... &amp;amp; x_m_1\\ x_1_2 &amp;amp; ... &amp;amp; ... &amp;amp; x_m_2\\ ...\\ x_1_n &amp;amp; ... &amp;amp; ... &amp;amp; x_m_n\\ \end{pmatrix}\begin{pmatrix}x_1_1 &amp;amp; x_1_2 &amp;amp; ... &amp;amp; x_1_n\\ x_2_1 &amp;amp; ... &amp;amp; ... &amp;amp; x_2_n\\ ...\\ x_m_1 &amp;amp; ... &amp;amp; ... &amp;amp; x_m_n\\ \end{pmatrix}\begin{pmatrix} \theta_1\\ \theta_2\\ ...\\ \theta_n\\ \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/b3f9b4ffe1853d6610f9814fc820d1a71825a06e.png" style="height: 87px;" /> <p>I'll just multiply the two matrices in the middle together. The result is an &quot;<strong>X</strong> squared&quot; matrix, which is n-by-n. 
The element in row <em>r</em> and column <em>c</em> of this square matrix is:</p> <img alt="$\sum_{i=1}^{m}x_i_rx_i_c$" class="align-center" src="https://eli.thegreenplace.net/images/math/f8628d68855e03195fb4fd01806c8655beaf7b30.png" style="height: 50px;" /> <p>Note that &quot;<strong>X</strong> squared&quot; is a symmetric matrix (this fact will be important later on). For simplicity of notation, we'll call its elements <img alt="X^{2}_{rc}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/c565201908a5c75f62849e7c1634b65e0930824c.png" style="height: 19px;" />. Multiplying by the <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> vector on the right we get:</p> <img alt="$Q(\theta)=(\theta_1...\theta_n)\begin{pmatrix}X^{2}_{11}\theta_1+...+X^{2}_{1n}\theta_n\\[1em] X^{2}_{21}\theta_1+...+X^{2}_{2n}\theta_n\\ ...\\ X^{2}_{n1}\theta_1+...+X^{2}_{nn}\theta_n\end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/5821cf256f6cf6debbdac48d6e9bbe698baa0a11.png" style="height: 107px;" /> <p>And left-multiplying by <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> to get the fully unwrapped formula for <em>Q</em>:</p> <img alt="$Q(\theta)=\theta_1(X^{2}_{11}\theta_1+...+X^{2}_{1n}\theta_n)+\theta_2(X^{2}_{21}\theta_1+...+X^{2}_{2n}\theta_n)+...+\theta_n(X^{2}_{n1}\theta_1+...+X^{2}_{nn}\theta_n)$" class="align-center" src="https://eli.thegreenplace.net/images/math/0451f9fa7c61ff3a61be8c1836c15667cd916330.png" style="height: 22px;" /> <p>Once again, it's now time to compute the derivatives. 
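Before doing so, both claims about &quot;X squared&quot; - its element formula and its symmetry - are easy to confirm on a small random matrix (a sketch with arbitrary sizes and data):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 3
X = rng.standard_normal((m, n))
X2 = X.T @ X   # the n-by-n "X squared" matrix

# Element (r, c) equals sum over i of x_ir * x_ic ...
for r in range(n):
    for c in range(n):
        assert np.isclose(X2[r, c], sum(X[i, r] * X[i, c] for i in range(m)))

# ... and the matrix is symmetric.
assert np.allclose(X2, X2.T)
```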
Let's focus on <img alt="\frac{\partial Q}{\partial \theta_1}" class="valign-m9" src="https://eli.thegreenplace.net/images/math/5161830b1f644a3c2d1a650ccd6405e0fe5940aa.png" style="height: 27px;" />, from which we can infer the rest:</p> <img alt="$\frac{\partial Q}{\partial \theta_1}=(2\theta_1X^{2}_{11}+\theta_2X^{2}_{12}+...+\theta_nX^{2}_{1n})+\theta_2X^{2}_{21}+...+\theta_nX^{2}_{n1}$" class="align-center" src="https://eli.thegreenplace.net/images/math/f99e5e7024b4d13b0a767b98653b6ccc22fa1abd.png" style="height: 41px;" /> <p>Using the fact that <strong>X</strong> squared is symmetric, we know that <img alt="X^{2}_{12}=X^{2}_{21}" class="valign-m6" src="https://eli.thegreenplace.net/images/math/c14595d1000ad9a8da5be7f37da801eadfdfb698.png" style="height: 21px;" /> and so on. Therefore:</p> <img alt="$\frac{\partial Q}{\partial \theta_1}=2\theta_1X^{2}_{11}+2\theta_2X^{2}_{12}+...+2\theta_nX^{2}_{1n}$" class="align-center" src="https://eli.thegreenplace.net/images/math/832b294f472a23e500616db08d9d6832770af6a3.png" style="height: 40px;" /> <p>The partial derivatives by other <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> components are similar. Collecting the sequence of partial derivatives back into a vector equation, we get:</p> <img alt="$\frac{\partial Q}{\partial \theta}=2X^2\theta=2X^TX\theta$" class="align-center" src="https://eli.thegreenplace.net/images/math/541124d49fa78dcf92a15b14643b2ebc4187eaaf.png" style="height: 38px;" /> <p>Now back to <em>J</em>. 
Recall that for convenience we broke <em>J</em> up into three parts: <em>P</em>, <em>Q</em> and <img alt="y^Ty" class="valign-m4" src="https://eli.thegreenplace.net/images/math/81015d6225923cec985bef47ca151ef1cb654c92.png" style="height: 19px;" />; the latter doesn't depend on <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> so it doesn't play a role in the derivative. Collecting our results from this post, we then get:</p> <img alt="$\frac{\partial J}{\partial \theta}=\frac{\partial Q}{\partial \theta}-\frac{\partial P}{\partial \theta}=2X^TX\theta-2X^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/9c3d0d108ada3bfc7290c2328c8e6171bc01d7de.png" style="height: 38px;" /> <p>Which is exactly the equation we were expecting to see.</p> <p>To conclude - if matrix calculus feels handwavy, it's because its identities are less familiar. In a sense, it's handwavy in the same way <img alt="\frac{dx^2}{dx}=2x" class="valign-m6" src="https://eli.thegreenplace.net/images/math/5fa725ae5b10a9249e9480d595770cf34accf533.png" style="height: 24px;" /> is handwavy. We remember the identity so we don't have to recalculate it every time from first principles. Once you get some experience with matrix calculus, parts of equations start looking familiar and you no longer need to engage in the long and tiresome computations demonstrated here. It's perfectly fine to just remember that the derivative of <img alt="\theta^TX\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/7616542d90e084c74423b2a9d93b7a3a6cadcd00.png" style="height: 15px;" /> with a symmetric <strong>X</strong> is <img alt="2X\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/7fa6bcc17eae56f6f3f4a6fdcadae3cb3ee2c5d7.png" style="height: 12px;" />. 
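The full gradient can also be checked end to end: compare a finite-difference gradient of J against the closed form, and verify that the gradient vanishes at the normal-equation solution. A NumPy sketch on random made-up data:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 3
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# J(theta) = theta^T X^T X theta - 2 (X theta)^T y + y^T y
J = lambda th: th @ X.T @ X @ th - 2 * (X @ th) @ y + y @ y

th0 = rng.standard_normal(n)
eps = 1e-6
I = np.eye(n)
num_grad = np.array([(J(th0 + eps * I[i]) - J(th0 - eps * I[i])) / (2 * eps)
                     for i in range(n)])
analytic = 2 * X.T @ X @ th0 - 2 * X.T @ y
assert np.allclose(num_grad, analytic, atol=1e-4)

# At the normal-equation solution theta* = (X^T X)^{-1} X^T y,
# the gradient is zero, as expected at the minimum.
th_star = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(2 * X.T @ X @ th_star - 2 * X.T @ y, 0, atol=1e-8)
```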
See the &quot;identities&quot; section of the <a class="reference external" href="http://en.wikipedia.org/wiki/Matrix_calculus">wikipedia article on matrix calculus</a> for many more examples.</p> <hr class="docutils" /> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>A few words on notation: by default, a vector <strong>v</strong> is a <em>column</em> vector. To get its row version, we transpose it. Moreover, in the vector derivative equations that follow I'm using <a class="reference external" href="http://en.wikipedia.org/wiki/Matrix_calculus#Layout_conventions">denominator layout notation</a>. This is not super-important though; as the Wikipedia article suggests, many mathematical papers and writings aren't consistent about this and it's perfectly possible to understand the derivations regardless.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id5" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>Yes, this is exactly like computing a gradient of a multivariate function.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>Take a minute to convince yourself that the dimensions of this equation work out and the result is a scalar.</td></tr> </tbody> </table> Visualizing matrix multiplication as a linear combination2015-03-19T06:03:00-07:002015-03-19T06:03:00-07:00Eli Benderskytag:eli.thegreenplace.net,2015-03-19:/2015/visualizing-matrix-multiplication-as-a-linear-combination/<p>When multiplying two matrices, there's a manual procedure we all know how to go through. 
Each result cell is computed separately as the dot-product of a row in the first matrix with a column in the second matrix. While it's the easiest way to compute the result manually, it may obscure a very interesting property of the operation: <em>multiplying A by B is the linear combination of A's columns using coefficients from B</em>. Another way to look at it is that it's a <em>linear combination of the rows of B using coefficients from A</em>.</p> <p>In this quick post I want to show a colorful visualization that will make this easier to grasp.</p> <div class="section" id="right-multiplication-combination-of-columns"> <h2>Right-multiplication: combination of columns</h2> <p>Let's begin by looking at the right-multiplication of matrix <tt class="docutils literal">X</tt> by a column vector:</p> <img alt="$\begin{pmatrix} x_1 &amp;amp; y_1 &amp;amp; z_1 \\ x_2 &amp;amp; y_2 &amp;amp; z_2 \\ x_3 &amp;amp; y_3 &amp;amp; z_3 \\ \end{pmatrix}* \begin{pmatrix} a \\ b \\ c \\ \end{pmatrix}= \begin{pmatrix} ax_1+by_1+cz_1 \\ ax_2+by_2+cz_2 \\ ax_3+by_3+cz_3 \\ \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/ba570f74b122c0a20e4488a052b25fbda160c138.png" style="height: 65px;" /> <p>Representing the columns of <tt class="docutils literal">X</tt> by colorful boxes will help visualize this:</p> <img alt="Matrix by vector" class="align-center" src="https://eli.thegreenplace.net/images/2015/veccomb.png" /> <p>Sticking the white box with <tt class="docutils literal">a</tt> in it to a vector just means: multiply this vector by the scalar <tt class="docutils literal">a</tt>. 
The result is another column vector - a linear combination of <tt class="docutils literal">X</tt>'s columns, with <tt class="docutils literal">a</tt>, <tt class="docutils literal">b</tt>, <tt class="docutils literal">c</tt> as the coefficients.</p> <p>Right-multiplying <tt class="docutils literal">X</tt> by a matrix is more of the same. Each resulting column is a different linear combination of <tt class="docutils literal">X</tt>'s columns:</p> <img alt="$\begin{pmatrix} {\color{Red} x_1} &amp;amp; y_1 &amp;amp; z_1 \\ x_2 &amp;amp; y_2 &amp;amp; z_2 \\ x_3 &amp;amp; y_3 &amp;amp; z_3 \\ \end{pmatrix}* \begin{pmatrix} a &amp;amp; d &amp;amp; g \\ b &amp;amp; e &amp;amp; h \\ c &amp;amp; f &amp;amp; i \\ \end{pmatrix}= \begin{pmatrix} ax_1+by_1+cz_1 &amp;amp; dx_1+ey_1+fz_1 &amp;amp; gx_1+hy_1+iz_1 \\ ax_2+by_2+cz_2 &amp;amp; dx_2+ey_2+fz_2 &amp;amp; gx_2+hy_2+iz_2 \\ ax_3+by_3+cz_3 &amp;amp; dx_3+ey_3+fz_3 &amp;amp; gx_3+hy_3+iz_3 \\ \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/d6065791babbc5b967c06b57322711424097c83c.png" style="height: 65px;" /> <p>Graphically:</p> <img alt="Matrix by matrix" class="align-center" src="https://eli.thegreenplace.net/images/2015/matcomb.png" /> <p>If you look hard at the equation above and squint a bit, you can recognize this column-combination property by examining each column of the result matrix.</p> </div> <div class="section" id="left-multiplication-combination-of-rows"> <h2>Left-multiplication: combination of rows</h2> <p>Now let's examine left-multiplication. 
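Both combination views are easy to confirm in NumPy on small made-up matrices - columns for right-multiplication, rows for left-multiplication:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.integers(-5, 6, size=(3, 3)).astype(float)
B = rng.integers(-5, 6, size=(3, 3)).astype(float)

# Right-multiplication: column j of X @ B is a linear combination of
# X's columns, with coefficients taken from column j of B.
right = X @ B
for j in range(3):
    combo = sum(B[k, j] * X[:, k] for k in range(3))
    assert np.allclose(right[:, j], combo)

# Left-multiplication: row i of B @ X is a linear combination of
# X's rows, with coefficients taken from row i of B.
left = B @ X
for i in range(3):
    combo = sum(B[i, k] * X[k, :] for k in range(3))
    assert np.allclose(left[i, :], combo)
```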
Left-multiplying a matrix <tt class="docutils literal">X</tt> by a row vector is a linear combination of <tt class="docutils literal">X</tt>'s <em>rows</em>:</p> <img alt="$\begin{pmatrix} a &amp;amp; b &amp;amp; c \end{pmatrix}* \begin{pmatrix} x_1 &amp;amp; y_1 &amp;amp; z_1 \\ x_2 &amp;amp; y_2 &amp;amp; z_2 \\ x_3 &amp;amp; y_3 &amp;amp; z_3 \\ \end{pmatrix}= \begin{pmatrix} ax_1+bx_2+cx_3 &amp;amp; ay_1+by_2+cy_3 &amp;amp; az_1+bz_2+cz_3 \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/a36b019264582a035d8df4dc13158854f3477efe.png" style="height: 65px;" /> <p>Is represented graphically thus:</p> <img alt="Vector by matrix" class="align-center" src="https://eli.thegreenplace.net/images/2015/vecrowcomb.png" /> <p>And left-multiplying by a matrix is the same thing repeated for every result row: it becomes the linear combination of the rows of <tt class="docutils literal">X</tt>, with the coefficients taken from the rows of the matrix on the left. Here's the equation form:</p> <img alt="$\begin{pmatrix} a &amp;amp; b &amp;amp; c \\ d &amp;amp; e &amp;amp; f \\ g &amp;amp; h &amp;amp; i \\ \end{pmatrix}* \begin{pmatrix} x_1 &amp;amp; y_1 &amp;amp; z_1 \\ x_2 &amp;amp; y_2 &amp;amp; z_2 \\ x_3 &amp;amp; y_3 &amp;amp; z_3 \\ \end{pmatrix}= \begin{pmatrix} ax_1+bx_2+cx_3 &amp;amp; ay_1+by_2+cy_3 &amp;amp; az_1+bz_2+cz_3 \\ dx_1+ex_2+fx_3 &amp;amp; dy_1+ey_2+fy_3 &amp;amp; dz_1+ez_2+fz_3 \\ gx_1+hx_2+ix_3 &amp;amp; gy_1+hy_2+iy_3 &amp;amp; gz_1+hz_2+iz_3 \\ \end{pmatrix}$" class="align-center" src="https://eli.thegreenplace.net/images/math/35d9e54624bf17576372da3bf144dd4659b225e1.png" style="height: 65px;" /> <p>And the graphical form:</p> <img alt="Matrix by matrix from the left" class="align-center" src="https://eli.thegreenplace.net/images/2015/matrowcomb.png" /> </div> Meshgrids and disambiguating rows and columns from Cartesian coordinates2014-12-28T07:23:00-08:002014-12-28T07:23:00-08:00Eli 
Benderskytag:eli.thegreenplace.net,2014-12-28:/2014/meshgrids-and-disambiguating-rows-and-columns-from-cartesian-coordinates/<p>When plotting 3D graphs, a common source of confusion in Numpy and Matplotlib (and, by extension, I'd assume in Matlab as well) is how to reconcile between matrices that are indexed with rows and columns, and Cartesian coordinates.</p> <p>Let's use the function <img alt="z = f(x,y) = 4x^2+y^2" class="valign-m4" src="https://eli.thegreenplace.net/images/math/2b7a30bd2a6116249aaa03a1546780b31973ac68.png" style="height: 19px;" /> as an example. 
Here's its 3D plot, courtesy <a class="reference external" href="https://www.google.com/search?client=ubuntu&amp;channel=fs&amp;q=plot+2x^2+%2B+y^2&amp;ie=utf-8">of Google</a>:</p> <img alt="3D plot" class="align-center" src="https://eli.thegreenplace.net/images/2014/funcplot-google.png" /> <p>Now let's use Numpy and Matplotlib to make a contour plot of this function.</p> <div class="highlight"><pre><span></span><span class="n">xx</span> <span class="o">=</span> <span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">20</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span> <span class="n">yy</span> <span class="o">=</span> <span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">20</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span> <span class="n">Z</span> <span class="o">=</span> <span class="n">zeros</span><span class="p">((</span><span class="nb">len</span><span class="p">(</span><span class="n">xx</span><span class="p">),</span> <span class="nb">len</span><span class="p">(</span><span class="n">yy</span><span class="p">)))</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">xx</span><span class="p">)):</span> <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">yy</span><span class="p">)):</span> <span class="n">Z</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="mi">4</span><span class="o">*</span><span 
class="n">xx</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">yy</span><span class="p">[</span><span class="n">j</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span> </pre></div> <p>If the creation of <tt class="docutils literal">Z</tt> in the above looks fishy to you, you're right, and we'll get to it shortly. However, note that this is a vastly simplified demonstration - often <tt class="docutils literal">Z</tt> is created behind the scenes by a more complex computation.</p> <p>Finally, plotting the contour:</p> <div class="highlight"><pre><span></span><span class="n">contour</span><span class="p">(</span><span class="n">xx</span><span class="p">,</span> <span class="n">yy</span><span class="p">,</span> <span class="n">Z</span><span class="p">)</span> <span class="n">xlabel</span><span class="p">(</span><span class="s1">&#39;x&#39;</span><span class="p">);</span> <span class="n">ylabel</span><span class="p">(</span><span class="s1">&#39;y&#39;</span><span class="p">)</span> </pre></div> <p>We get:</p> <img alt="Contour plot" class="align-center" src="https://eli.thegreenplace.net/images/2014/contour-rowcol.png" /> <p>This plot doesn't look right. In the function we're plotting, the contour lines should be stretched in the <tt class="docutils literal">y</tt> direction, not the <tt class="docutils literal">x</tt> direction (this is obvious both from the formula for <tt class="docutils literal">z</tt> and from the 3D plot shown above). What's going on?</p> <p>This is a simple demonstration of a very common problem many people run into when plotting a matrix as a 3D scalar field (a scalar value for each <tt class="docutils literal">x, y</tt> coordinate). 
While we're used to thinking about <tt class="docutils literal">x</tt> as the &quot;first&quot; coordinate and <tt class="docutils literal">y</tt> as the &quot;second&quot;, in the way Numpy represents matrices this is exactly the opposite. Here's a simple matrix:</p> <div class="highlight"><pre><span></span>array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) </pre></div> <p>Imagine we'd like to plot it. Indexing into a matrix goes by <tt class="docutils literal">[row, col]</tt>, where <tt class="docutils literal">row</tt> counts from top-to-bottom, and <tt class="docutils literal">col</tt> counts from left-to-right. Now, if you just look at the matrix and visually interpose the Cartesian coordinate system on top, top-to-bottom is <tt class="docutils literal">y</tt> and left-to-right is <tt class="docutils literal">x</tt>. In other words, the indexing order is reversed.</p> <p>Here's a visualization that should make it clear:</p> <img alt="XY vs. row, col" class="align-center" src="https://eli.thegreenplace.net/images/2014/xy-rowcol.png" /> <p>There's a very simple solution to this problem - use a transpose. Plotting:</p> <div class="highlight"><pre><span></span><span class="n">contour</span><span class="p">(</span><span class="n">xx</span><span class="p">,</span> <span class="n">yy</span><span class="p">,</span> <span class="n">Z</span><span class="o">.</span><span class="n">T</span><span class="p">)</span> <span class="n">xlabel</span><span class="p">(</span><span class="s1">&#39;x&#39;</span><span class="p">);</span> <span class="n">ylabel</span><span class="p">(</span><span class="s1">&#39;y&#39;</span><span class="p">)</span> </pre></div> <p>Gives us the expected:</p> <img alt="Contour plot" class="align-center" src="https://eli.thegreenplace.net/images/2014/contour-xy.png" /> <p>A matrix transpose exchanges between rows and columns. 
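The flip is easy to see concretely on the 3-by-3 array above: the entry at Cartesian position x=2, y=0 (top row, third from the left) is indexed with y first and x second, and transposing swaps the two roles:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# "x" runs left-to-right (columns) and "y" top-to-bottom (rows), so the
# Cartesian point (x=2, y=0) lives at arr[row=0, col=2].
assert arr[0, 2] == 3

# A transpose swaps the indices, so arr.T can be indexed as [x, y]:
assert arr.T[2, 0] == arr[0, 2]
```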
It makes the original matrix's rows count from left-to-right and columns from top-to-bottom, matching the Cartesian coordinate system.</p> <p>Is a transpose always required, then? Not at all. As I mentioned above, the computation of <tt class="docutils literal">Z</tt> wasn't entirely correct, because matrix indices were conflated with Cartesian coordinates. In the double loop shown above it would be more correct to assign <tt class="docutils literal">Z[j, i]</tt>, and in general it's usually recommended to be explicit about <tt class="docutils literal">row, col</tt> or <tt class="docutils literal">x, y</tt> - as the <tt class="docutils literal">i, j</tt> pair is somewhat ambiguous. That said, we don't always easily control the creation of <tt class="docutils literal">Z</tt>, so the transpose is occasionally useful when the data we got is in the wrong order.</p> <div class="section" id="meshgrids"> <h2>Meshgrids</h2> <p>IMHO, by trying to be helpful, the <tt class="docutils literal">contour</tt> API helps spread the confusion. It does so by not enforcing <tt class="docutils literal">x</tt> and <tt class="docutils literal">y</tt> to be 2D data arrays, like all the 3D plotting routines do. It's better to be explicit and require a meshgrid.</p> <p>So what is a meshgrid? <tt class="docutils literal">meshgrid</tt> is a Numpy function that turns vectors such as <tt class="docutils literal">xx</tt> and <tt class="docutils literal">yy</tt> above into coordinate matrices. The idea is simple: when doing multi-dimensional operations (like 3D plotting), it's better to be very explicit about what maps to what. What we really want is three matrices: <tt class="docutils literal">X</tt>, <tt class="docutils literal">Y</tt> and <tt class="docutils literal">Z</tt>, where <tt class="docutils literal">Z[i, j]</tt> is the value of the function in question for <tt class="docutils literal">X[i, j]</tt> and <tt class="docutils literal">Y[i, j]</tt>. 
But more often than not, we don't have <tt class="docutils literal">X</tt> and <tt class="docutils literal">Y</tt> in this form. Rather, we just have vectors with their values running along the axes. This is what <tt class="docutils literal">meshgrid</tt> is for. Here's a simple demonstration (taken from an IPython terminal):</p> <div class="highlight"><pre><span></span><span class="n">In</span> <span class="p">[</span><span class="mi">218</span><span class="p">]:</span> <span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">])</span> <span class="n">In</span> <span class="p">[</span><span class="mi">219</span><span class="p">]:</span> <span class="n">y</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">])</span> <span class="n">In</span> <span class="p">[</span><span class="mi">220</span><span class="p">]:</span> <span class="n">X</span><span class="p">,</span> <span class="n">Y</span> <span class="o">=</span> <span class="n">meshgrid</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="n">In</span> <span class="p">[</span><span class="mi">221</span><span class="p">]:</span> <span class="n">X</span> <span class="n">Out</span><span class="p">[</span><span class="mi">221</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span 
class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]])</span> <span class="n">In</span> <span class="p">[</span><span class="mi">222</span><span class="p">]:</span> <span class="n">Y</span> <span class="n">Out</span><span class="p">[</span><span class="mi">222</span><span class="p">]:</span> <span class="n">array</span><span class="p">([[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span> <span class="p">[</span><span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">5</span><span class="p">],</span> <span class="p">[</span><span class="mi">6</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">6</span><span class="p">]])</span> </pre></div> <p>The <tt class="docutils literal">X</tt> and <tt class="docutils literal">Y</tt> matrices may appear strange at first sight, but looking more closely reveals that they're exactly the coordinate matrices we need; in tandem, they run over all the 9 pairs needed to map from the original <tt class="docutils literal">x</tt> and <tt class="docutils literal">y</tt> vectors. The values in <tt class="docutils literal">X</tt> increase from left to right; the values in <tt class="docutils literal">Y</tt> increase from top to bottom - the way it should be.</p> <p>And the best part about <tt class="docutils literal">meshgrid</tt> is that it enables vectorized computations, just the way we like them in Numpy. 
So the function we originally created can now be computed and plotted correctly without any loops:</p> <div class="highlight"><pre><span></span><span class="n">xx</span> <span class="o">=</span> <span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">20</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span> <span class="n">yy</span> <span class="o">=</span> <span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mi">20</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span> <span class="n">X</span><span class="p">,</span> <span class="n">Y</span> <span class="o">=</span> <span class="n">meshgrid</span><span class="p">(</span><span class="n">xx</span><span class="p">,</span> <span class="n">yy</span><span class="p">)</span> <span class="n">Z</span> <span class="o">=</span> <span class="mi">4</span><span class="o">*</span><span class="n">X</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="n">Y</span><span class="o">**</span><span class="mi">2</span> <span class="n">contour</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">,</span> <span class="n">Z</span><span class="p">)</span> </pre></div> <p>Produces the correct plot.</p> <p>Finally, what if we do get <tt class="docutils literal">Z</tt> from somewhere else, computed using matrix indexing rather than Cartesian indexing? Plotting its transpose is one alternative, but there's a better one.
We can create a meshgrid, using its <tt class="docutils literal">indexing</tt> keyword argument, like this:</p> <div class="highlight"><pre><span></span><span class="n">X</span><span class="p">,</span> <span class="n">Y</span> <span class="o">=</span> <span class="n">meshgrid</span><span class="p">(</span><span class="n">xx</span><span class="p">,</span> <span class="n">yy</span><span class="p">,</span> <span class="n">indexing</span><span class="o">=</span><span class="s1">&#39;ij&#39;</span><span class="p">)</span> </pre></div> <p>This tells <tt class="docutils literal">meshgrid</tt> that we're going to plot a function computed using <tt class="docutils literal">row, col</tt>, rather than <tt class="docutils literal">x, y</tt> order, and it will flip the rows and columns accordingly.</p> </div> Derivation of the Normal Equation for linear regression2014-12-22T20:50:00-08:002014-12-22T20:50:00-08:00Eli Benderskytag:eli.thegreenplace.net,2014-12-22:/2014/derivation-of-the-normal-equation-for-linear-regression/<p>I was going through the Coursera &quot;Machine Learning&quot; course, and in the section on multivariate linear regression something caught my eye. Andrew Ng presented the <a class="reference external" href="http://en.wikipedia.org/w/index.php?title=Normal_equation&amp;redirect=no">Normal Equation</a> as an analytical solution to the linear regression problem with a least-squares cost function. He mentioned that in some cases (such as for …</p><p>I was going through the Coursera &quot;Machine Learning&quot; course, and in the section on multivariate linear regression something caught my eye. Andrew Ng presented the <a class="reference external" href="http://en.wikipedia.org/w/index.php?title=Normal_equation&amp;redirect=no">Normal Equation</a> as an analytical solution to the linear regression problem with a least-squares cost function. 
He mentioned that in some cases (such as for small feature sets) using it is more effective than applying gradient descent; unfortunately, he left its derivation out.</p> <p>Here I want to show how the normal equation is derived.</p> <p>First, some terminology. The following symbols are compatible with the machine learning course, not with the exposition of the normal equation on Wikipedia and other sites - semantically it's all the same, just the symbols are different.</p> <p>Given the hypothesis function:</p> <img alt="$h_{\theta}(x)=\theta_0x_0+\theta_1x_1+\cdots+\theta_nx_n$" class="align-center" src="https://eli.thegreenplace.net/images/math/dd8fad9bf111e83d47252d51dd037a6c6c3136aa.png" style="height: 18px;" /> <p>We'd like to minimize the least-squares cost:</p> <img alt="$J(\theta_{0...n})=\frac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})^2$" class="align-center" src="https://eli.thegreenplace.net/images/math/c1abe0768f4deb31ed97f37d760236c94439a780.png" style="height: 50px;" /> <p>Where <img alt="x^{(i)}" class="valign-0" src="https://eli.thegreenplace.net/images/math/233014006c0adbee71ec71ba3a70f22ad1b906a1.png" style="height: 17px;" /> is the <tt class="docutils literal">i</tt>-th sample (from a set of <tt class="docutils literal">m</tt> samples) and <img alt="y^{(i)}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/d34414117d493106f731939df6bb7f1762365d3f.png" style="height: 21px;" /> is the <tt class="docutils literal">i</tt>-th expected result.</p> <p>To proceed, we'll represent the problem in matrix notation; this is natural, since we essentially have a system of linear equations here. 
The regression coefficients <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> we're looking for are the vector:</p> <img alt="$\begin{pmatrix} \theta_0\\ \theta_1\\ ...\\ \theta_n \end{pmatrix}\in\mathbb{R}^{n+1}$" class="align-center" src="https://eli.thegreenplace.net/images/math/b16fd3d2b3041f13cb70199837a7c02c756078c7.png" style="height: 86px;" /> <p>Each of the <tt class="docutils literal">m</tt> input samples is similarly a column vector with <tt class="docutils literal">n+1</tt> rows, <img alt="x_0" class="valign-m3" src="https://eli.thegreenplace.net/images/math/efbda784ad565c1c5201fdc948a570d0426bc6e6.png" style="height: 11px;" /> being 1 for convenience. So we can now rewrite the hypothesis function as:</p> <img alt="$h_{\theta}(x)=\theta^Tx$" class="align-center" src="https://eli.thegreenplace.net/images/math/be661047c89f6a48c7bc563b81207949c251de6a.png" style="height: 21px;" /> <p>When this is summed over all samples, we can dip further into matrix notation. We'll define the &quot;design matrix&quot; <tt class="docutils literal">X</tt> (uppercase X) as a matrix of <tt class="docutils literal">m</tt> rows, in which each row is the <tt class="docutils literal">i</tt>-th sample (the vector <img alt="x^{(i)}" class="valign-0" src="https://eli.thegreenplace.net/images/math/233014006c0adbee71ec71ba3a70f22ad1b906a1.png" style="height: 17px;" />). With this, we can rewrite the least-squares cost as follows, replacing the explicit sum by matrix multiplication:</p> <img alt="$J(\theta)=\frac{1}{2m}(X\theta-y)^T(X\theta-y)$" class="align-center" src="https://eli.thegreenplace.net/images/math/db5e3da78e25c18c8fc88f1291c1ac13a2645388.png" style="height: 36px;" /> <p>Now, using some matrix transpose identities, we can simplify this a bit.
I'll throw the <img alt="\frac{1}{2m}" class="valign-m6" src="https://eli.thegreenplace.net/images/math/7a2a3f6dba54b64f0e88e18c40e0f68c523713ea.png" style="height: 22px;" /> part away since we're going to compare a derivative to zero anyway:</p> <img alt="$J(\theta)=((X\theta)^T-y^T)(X\theta-y)$" class="align-center" src="https://eli.thegreenplace.net/images/math/c1368de1a0634c3fbeb92d67f368f253943d089f.png" style="height: 21px;" /> <img alt="$J(\theta)=(X\theta)^TX\theta-(X\theta)^Ty-y^T(X\theta)+y^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/e41fc822adccf1f865b02100f5671e265e7b30bc.png" style="height: 21px;" /> <p>Note that <img alt="X\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/52f2de6065bdc187b876c5696041f3c716c446f5.png" style="height: 12px;" /> is a vector, and so is <tt class="docutils literal">y</tt>; multiplying one by the other yields a scalar, and a scalar is equal to its own transpose. Therefore the order of the multiplication doesn't matter (as long as the dimensions work out), and we can further simplify:</p> <img alt="$J(\theta)=\theta^TX^TX\theta-2(X\theta)^Ty+y^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/2864b88546c007a79dc92271f5e01487ba608e43.png" style="height: 21px;" /> <p>Recall that here <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> is our unknown. To find where the above function has a minimum, we will differentiate with respect to <img alt="\theta" class="valign-0" src="https://eli.thegreenplace.net/images/math/cb005d76f9f2e394a770c2562c2e150a413b3216.png" style="height: 12px;" /> and compare the result to 0. Differentiating with respect to a vector may feel uncomfortable, but there's nothing to worry about. Recall that here we only use matrix notation to conveniently represent a system of linear formulae. So we differentiate with respect to each component of the vector, and then combine the resulting derivatives into a vector again.
The result is:</p> <img alt="$\frac{\partial J}{\partial \theta}=2X^TX\theta-2X^{T}y=0$" class="align-center" src="https://eli.thegreenplace.net/images/math/9b142c00e031c9db7f575b0542e86261732a4689.png" style="height: 38px;" /> <p>Or:</p> <img alt="$X^TX\theta=X^{T}y$" class="align-center" src="https://eli.thegreenplace.net/images/math/ab453f9f1f7bd4b1d646b9712fbe0b2fbe01740f.png" style="height: 21px;" /> <p>Now, assuming that the matrix <img alt="X^TX" class="valign-0" src="https://eli.thegreenplace.net/images/math/5c817c84ec1f83b23494df6125edd091a7c413dd.png" style="height: 15px;" /> is invertible, we can multiply both sides by <img alt="(X^TX)^{-1}" class="valign-m4" src="https://eli.thegreenplace.net/images/math/57f592cee6ceac659262d97e61c64f9ca405d7f1.png" style="height: 19px;" /> and get:</p> <img alt="$\theta=(X^TX)^{-1}X^Ty$" class="align-center" src="https://eli.thegreenplace.net/images/math/20baabd9d33dcd26003bc44c7d81ba39e1ad4caa.png" style="height: 21px;" /> <p>Which is the normal equation.</p> <p>[<strong>Update 27-May-2015</strong>: I've written <a class="reference external" href="http://eli.thegreenplace.net/2015/the-normal-equation-and-matrix-calculus/">another post</a> that explains in more detail how these derivatives are computed.]</p> Horner's rule: efficient evaluation of polynomials2010-03-30T15:10:32-07:002010-03-30T15:10:32-07:00Eli Benderskytag:eli.thegreenplace.net,2010-03-30:/2010/03/30/horners-rule-efficient-evaluation-of-polynomials <p>Here's a general degree-n polynomial:</p> <p><img src="https://eli.thegreenplace.net/images/math/79d0e193d7bd5ba889f5992beece98ca4ce715f8.gif" /></p> <p>To evaluate such a polynomial using a computer program, several approaches can be employed.</p> <p>The simplest, naive method is to compute each term of the polynomial separately and then add them up. 
Here's the Python code for it:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">poly_naive</span>(A, x): p = <span style="color: #007f7f">0</span> <span style="color: #00007f; font-weight: bold">for …</span></pre></div> <p>Here's a general degree-n polynomial:</p> <p><img src="https://eli.thegreenplace.net/images/math/79d0e193d7bd5ba889f5992beece98ca4ce715f8.gif" /></p> <p>To evaluate such a polynomial using a computer program, several approaches can be employed.</p> <p>The simplest, naive method is to compute each term of the polynomial separately and then add them up. Here's the Python code for it:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">poly_naive</span>(A, x): p = <span style="color: #007f7f">0</span> <span style="color: #00007f; font-weight: bold">for</span> i, a <span style="color: #0000aa">in</span> <span style="color: #00007f">enumerate</span>(A): p += (x ** i) * a <span style="color: #00007f; font-weight: bold">return</span> p </pre></div> <p><tt class="docutils literal"><span class="pre">A</span></tt> is an array of coefficients, lowest first, <img src="https://eli.thegreenplace.net/images/math/4a5997da73aadd118038761e69d01e24586bf958.gif" /> until <img src="https://eli.thegreenplace.net/images/math/278ab95d3a54aae8eaa25c34af66d93a19b5e75f.gif" />.</p> <p>This method is quite inefficient. 
It requires <tt class="docutils literal"><span class="pre">n</span></tt> additions (since there are <tt class="docutils literal"><span class="pre">n+1</span></tt> terms to be added) and <img src="https://eli.thegreenplace.net/images/math/73b6f7da8c4582390c7323a29770ab2e8cb7fb64.gif" /> multiplications.</p> <div class="section" id="iterative-method"> <h3>Iterative method</h3> <p>It's obvious that a lot of repetitive computation is being done by raising <tt class="docutils literal"><span class="pre">x</span></tt> to successive powers. We can make things much more efficient by simply keeping the previous power of <tt class="docutils literal"><span class="pre">x</span></tt> between iterations. This is the &quot;iterative method&quot;:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">poly_iter</span>(A, x): p = <span style="color: #007f7f">0</span> xn = <span style="color: #007f7f">1</span> <span style="color: #00007f; font-weight: bold">for</span> a <span style="color: #0000aa">in</span> A: p += xn * a xn *= x <span style="color: #00007f; font-weight: bold">return</span> p </pre></div> <p>In this code <tt class="docutils literal"><span class="pre">xn</span></tt> is the current power of <tt class="docutils literal"><span class="pre">x</span></tt>. We don't need to raise <tt class="docutils literal"><span class="pre">x</span></tt> to a power on each iteration of the loop; a single multiplication suffices. It's easy to see that there are <tt class="docutils literal"><span class="pre">2n</span></tt> multiplications and <tt class="docutils literal"><span class="pre">n</span></tt> additions for each computation. The algorithm is now linear instead of quadratic.</p> </div> <div class="section" id="horner-s-rule"> <h3>Horner's rule</h3> <p>It can be further improved, however.
Take a look at this polynomial:</p> <p><img src="https://eli.thegreenplace.net/images/math/03e98fbb410ca88f96c6124bd2fa98a88ed56d25.gif" /></p> <p>It can be rewritten as follows:</p> <p><img src="https://eli.thegreenplace.net/images/math/9a469b0cc8b4304d230b677c9f5c26129d1b73fe.gif" /></p> <p>And in general, we can always rewrite the polynomial:</p> <p><img src="https://eli.thegreenplace.net/images/math/3773c72b0bca68c7d911452088f2b9f459802b78.gif" /></p> <p>As:</p> <p><img src="https://eli.thegreenplace.net/images/math/2c6d90599184f76993d6474a226b8c03e8e7c475.gif" /></p> <p>This rearrangement is usually called &quot;Horner's rule&quot;. We can write the code to implement it as follows:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">poly_horner</span>(A, x): p = A[-<span style="color: #007f7f">1</span>] i = <span style="color: #00007f">len</span>(A) - <span style="color: #007f7f">2</span> <span style="color: #00007f; font-weight: bold">while</span> i &gt;= <span style="color: #007f7f">0</span>: p = p * x + A[i] i -= <span style="color: #007f7f">1</span> <span style="color: #00007f; font-weight: bold">return</span> p </pre></div> <p>Here we start by assigning <img src="https://eli.thegreenplace.net/images/math/278ab95d3a54aae8eaa25c34af66d93a19b5e75f.gif" /> to <cite>p</cite> and then successively multiplying by <cite>x</cite> and adding the next coefficient. 
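</p> <p>As a quick sanity check of <tt class="docutils literal">poly_horner</tt> (a self-contained sketch; the function is repeated verbatim and the expected value is worked out by hand):</p>

```python
def poly_horner(A, x):
    # Coefficients in A are ordered lowest power first, as in the other examples.
    p = A[-1]
    i = len(A) - 2
    while i >= 0:
        p = p * x + A[i]
        i -= 1
    return p

# 1 + 2x + 3x^2 at x=2: 1 + 4 + 12 = 17
assert poly_horner([1, 2, 3], 2) == 17

# Cross-check against direct term-by-term evaluation.
A = [5, -1, 0, 2, 7]
assert poly_horner(A, 3) == sum(a * 3**i for i, a in enumerate(A))
```

<p>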
This code requires <cite>n</cite> multiplications and <cite>n</cite> additions (I'm ignoring here the modification of the loop variable <tt class="docutils literal"><span class="pre">i</span></tt>, as I ignored it in all other algorithms, where it was implicit in the Python <tt class="docutils literal"><span class="pre">for</span></tt> loop).</p> <p>While asymptotically similar to the iterative method, Horner's method has better constants and thus is faster.</p> <p>Curiously, Horner's rule was discovered in the early 19th century, long before the advent of computers. It's obviously useful for manual computation of polynomials as well, for the same reason: it requires fewer operations.</p> <p>I've timed the 3 algorithms on a random polynomial of degree 500. The one using Horner's rule is about 5 times faster than the naive approach, and 15% faster than the iterative method.</p> </div> A group-theoretic proof of Euler's theorem2009-08-01T08:00:43-07:002009-08-01T08:00:43-07:00Eli Benderskytag:eli.thegreenplace.net,2009-08-01:/2009/08/01/a-group-theoretic-proof-of-eulers-theorem <p>A very important and useful theorem in number theory is named after Leonhard Euler:</p> <p><img src="https://eli.thegreenplace.net/images/math/b9dd84aaa5b3a778d39ea7b95f32fdeed4510389.gif" /></p> <p>Where <img src="https://eli.thegreenplace.net/images/math/20bdd8a8b971fd8582ce58915d9f42ff001daef3.gif" /> is <a class="reference external" href="http://en.wikipedia.org/wiki/Euler%27s_totient_function">Euler's totient</a> function - the count of numbers smaller than <tt class="docutils literal"><span class="pre">n</span></tt> that are coprime to it.</p> <p>Here I want to present a nice proof of this theorem, based on group theory.
I begin with some …</p> <p>A very important and useful theorem in number theory is named after Leonhard Euler:</p> <p><img src="https://eli.thegreenplace.net/images/math/b9dd84aaa5b3a778d39ea7b95f32fdeed4510389.gif" /></p> <p>Where <img src="https://eli.thegreenplace.net/images/math/20bdd8a8b971fd8582ce58915d9f42ff001daef3.gif" /> is <a class="reference external" href="http://en.wikipedia.org/wiki/Euler%27s_totient_function">Euler's totient</a> function - the count of numbers smaller than <tt class="docutils literal"><span class="pre">n</span></tt> that are coprime to it.</p> <p>Here I want to present a nice proof of this theorem, based on group theory. I begin with some preliminary definitions and gradually move towards the final goal.</p> <p><strong>(I) Congruence class</strong>: Let <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">n</span> <span class="pre">&gt;</span> <span class="pre">0</span></tt> be integers. The set of all integers that have the same remainder as <tt class="docutils literal"><span class="pre">a</span></tt> when divided by <tt class="docutils literal"><span class="pre">n</span></tt> is called the congruence class of <tt class="pre">a</tt> modulo <tt class="docutils literal"><span class="pre">n</span></tt> and is denoted by <img src="https://eli.thegreenplace.net/images/math/8755e0f4dc820d67ad665a79d34fd0bbb6fd9b1b.gif" />, where:</p> <p><img src="https://eli.thegreenplace.net/images/math/41e7da26fc47068db4c05b2400799401d07f648e.gif" /></p> <p><img src="https://eli.thegreenplace.net/images/math/e8969fcfdd076fdce4b4ca0244b6a6b05964a817.gif" /> is the set of all congruence classes modulo <tt class="docutils literal"><span class="pre">n</span></tt>.</p> <p><strong>(II) Units of</strong> <img src="https://eli.thegreenplace.net/images/math/e8969fcfdd076fdce4b4ca0244b6a6b05964a817.gif" />: If for <img 
src="https://eli.thegreenplace.net/images/math/8755e0f4dc820d67ad665a79d34fd0bbb6fd9b1b.gif" /> we find some <img src="https://eli.thegreenplace.net/images/math/2e7ad37347086c8f38ec146b7ab4eb6ccf519672.gif" /> such that <img src="https://eli.thegreenplace.net/images/math/44b1c7f2b4cbe6e3cb87d55030fc40b1e517c4c0.gif" />, we call <img src="https://eli.thegreenplace.net/images/math/8755e0f4dc820d67ad665a79d34fd0bbb6fd9b1b.gif" /> a unit of <img src="https://eli.thegreenplace.net/images/math/e8969fcfdd076fdce4b4ca0244b6a6b05964a817.gif" />. The set of units of <img src="https://eli.thegreenplace.net/images/math/e8969fcfdd076fdce4b4ca0244b6a6b05964a817.gif" /> is denoted by <img src="https://eli.thegreenplace.net/images/math/1699294c58aa602fb840c2215844cec3979a67ee.gif" /></p> <p>For example, <img src="https://eli.thegreenplace.net/images/math/8faf8e13bb0577f8b98f532aabd1a553d7fc66b9.gif" /> is a unit of <img src="https://eli.thegreenplace.net/images/math/d24fb3c22159cb774f33d4fbebe3b875f73e333d.gif" />, because <img src="https://eli.thegreenplace.net/images/math/0fc30f90ee4511066fa696c51e09ec1dd90ecab3.gif" />.</p> <p><strong>(III)</strong> The congruence class <img src="https://eli.thegreenplace.net/images/math/8755e0f4dc820d67ad665a79d34fd0bbb6fd9b1b.gif" /> is a unit of <img src="https://eli.thegreenplace.net/images/math/e8969fcfdd076fdce4b4ca0244b6a6b05964a817.gif" /> if and only if <img src="https://eli.thegreenplace.net/images/math/b88879c72c5d589d94fcecc37ebae71004b36eb5.gif" /> (the GCD of <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">n</span></tt> is 1, in other words they're co-prime).</p> <p>Proof: By definition of units, there exists some <img src="https://eli.thegreenplace.net/images/math/2e7ad37347086c8f38ec146b7ab4eb6ccf519672.gif" /> such that <img src="https://eli.thegreenplace.net/images/math/44b1c7f2b4cbe6e3cb87d55030fc40b1e517c4c0.gif" />. 
Therefore <img src="https://eli.thegreenplace.net/images/math/4de2086bb823b49a9ce0d5565e0d94d277fe21a4.gif" />, which implies that for some <tt class="docutils literal"><span class="pre">q</span></tt>, <img src="https://eli.thegreenplace.net/images/math/2fc6ecaa6433925d2e15d5421a752bcd4eabc3a1.gif" />. Thus <img src="https://eli.thegreenplace.net/images/math/fee61eb9d36e06b6d66cc7d225df42bb9872a2d7.gif" />. So 1 is a linear combination of <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">n</span></tt>. <a class="reference external" href="http://eli.thegreenplace.net/2009/07/10/the-gcd-and-linear-combinations/">Therefore</a> <img src="https://eli.thegreenplace.net/images/math/601a1603beda34af308ef779b2550ce0d9145854.gif" />. On the other hand, if <img src="https://eli.thegreenplace.net/images/math/601a1603beda34af308ef779b2550ce0d9145854.gif" />, there exist <tt class="docutils literal"><span class="pre">q</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt> such that <img src="https://eli.thegreenplace.net/images/math/2a3ae18a531b52442546e23bbf44377e5b584615.gif" />, or <img src="https://eli.thegreenplace.net/images/math/4de2086bb823b49a9ce0d5565e0d94d277fe21a4.gif" />, so <img src="https://eli.thegreenplace.net/images/math/44b1c7f2b4cbe6e3cb87d55030fc40b1e517c4c0.gif" />.</p> <p><strong>(IV)</strong> By definition, since every unit of <img src="https://eli.thegreenplace.net/images/math/e8969fcfdd076fdce4b4ca0244b6a6b05964a817.gif" /> is coprime to <tt class="docutils literal"><span class="pre">n</span></tt>, the number of units of <img src="https://eli.thegreenplace.net/images/math/e8969fcfdd076fdce4b4ca0244b6a6b05964a817.gif" /> (or, the number of elements of <img src="https://eli.thegreenplace.net/images/math/1699294c58aa602fb840c2215844cec3979a67ee.gif" />) is <img src="https://eli.thegreenplace.net/images/math/20bdd8a8b971fd8582ce58915d9f42ff001daef3.gif" />.</p> <p>Let's 
keep this result in mind and prepare some more theorems in order to attack the proof.</p> <p><strong>(V) Lagrange's theorem:</strong> If <em>H</em> is a subgroup of the finite group <em>G</em>, then the order of <em>H</em> is a divisor of the order of <em>G</em>.</p> <p>Proof: Let's first define <img src="https://eli.thegreenplace.net/images/math/eb2ac2e9ece06fb0136369f280a9b9e0fe90d0c7.gif" /> and <img src="https://eli.thegreenplace.net/images/math/74dc1b476dac9aa1fa28a3d82fb39370ae8aa2a9.gif" />. Also, let <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> be the equivalence relation defined in example (III) of the <a class="reference external" href="http://eli.thegreenplace.net/2009/07/17/equivalence-classes-and-group-partitions/">previous post</a>. Since it's an equivalence relation, it partitions <em>G</em> into equivalence classes. Define <img src="https://eli.thegreenplace.net/images/math/424ca85d95fe170ac1502d98d8a623adb807fa41.gif" /> as the equivalence class of <em>a</em> with <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" />, for any <img src="https://eli.thegreenplace.net/images/math/4cbe37e25ff6e34b50a2ef01190bc26af1cc355e.gif" />.</p> <p>To prove Lagrange's theorem, we're going to show that <img src="https://eli.thegreenplace.net/images/math/424ca85d95fe170ac1502d98d8a623adb807fa41.gif" /> has the same number of elements as <em>H</em>. For this purpose, let's define a function <img src="https://eli.thegreenplace.net/images/math/44ecebaf61b2207527b728e788c781d43c21e248.gif" /> by <img src="https://eli.thegreenplace.net/images/math/06f368612a045f555165ad1e02442d448d61cac2.gif" /> for all <img src="https://eli.thegreenplace.net/images/math/d2282b7258ea3a7b88850baba99bf31584143987.gif" /> and prove that it's a bijection (note that the equivalence class is in general not a subgroup, so a bijection is all we need here).
To do that, we'll have to separately prove that it's onto and one-to-one.</p> <p>But first, let's verify that the stated codomain of <img src="https://eli.thegreenplace.net/images/math/c13d3e630d6430dc77134d5df88542a73dfb1853.gif" /> is correct. If <img src="https://eli.thegreenplace.net/images/math/47710305b38e478c0091b68a632a5a7f1f9574a7.gif" /> then <img src="https://eli.thegreenplace.net/images/math/7e53ba9abb1ae5368c1676e44a3ce6a420c9d702.gif" /> because <img src="https://eli.thegreenplace.net/images/math/16938cce957255f90140fda84a82086811279f0f.gif" />, so by definition of <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> we have <img src="https://eli.thegreenplace.net/images/math/a3a37b36fabdad84d64351523f1d8a025ceb2b6a.gif" />. So indeed the codomain of <img src="https://eli.thegreenplace.net/images/math/c13d3e630d6430dc77134d5df88542a73dfb1853.gif" /> is <img src="https://eli.thegreenplace.net/images/math/424ca85d95fe170ac1502d98d8a623adb807fa41.gif" />.</p> <ol class="arabic simple"> <li>Let's pick some <em>y</em> in <em>G</em> such that <img src="https://eli.thegreenplace.net/images/math/28ccea8fbeb58e8b2aafbe57b2bd1db5e2231ea7.gif" />. By definition of our <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> it means that <img src="https://eli.thegreenplace.net/images/math/4aca76345e9c9a2476f2f490153e95bbd621d732.gif" /> for some <img src="https://eli.thegreenplace.net/images/math/47710305b38e478c0091b68a632a5a7f1f9574a7.gif" />. So <img src="https://eli.thegreenplace.net/images/math/b5a1c5b30321fa43940c63b394adc1f161f4d089.gif" /> has a solution <img src="https://eli.thegreenplace.net/images/math/2e7b84283f31f4ccb9f6b11c1007093203400eba.gif" /> (since <img src="https://eli.thegreenplace.net/images/math/66de997d8baedff86251b7c7fbaf81103eb9a8db.gif" />). 
Therefore <img src="https://eli.thegreenplace.net/images/math/c13d3e630d6430dc77134d5df88542a73dfb1853.gif" /> is onto.</li> <li>Suppose that <img src="https://eli.thegreenplace.net/images/math/84e5a51f34c8e39476b4e62db18d6a88b3f513c7.gif" /> with <img src="https://eli.thegreenplace.net/images/math/4b70a286b390809dc5095ab68b766b056123f843.gif" />. Then <img src="https://eli.thegreenplace.net/images/math/85212195c93331433ff9c4dafb4c066bd4eac844.gif" /> and by cancellation in groups we have <img src="https://eli.thegreenplace.net/images/math/2b8466a1849f730f97e3257cf26339443bf5af38.gif" />, which proves that <img src="https://eli.thegreenplace.net/images/math/c13d3e630d6430dc77134d5df88542a73dfb1853.gif" /> is one-to-one.</li> </ol> <p>So we've proved that <img src="https://eli.thegreenplace.net/images/math/c13d3e630d6430dc77134d5df88542a73dfb1853.gif" /> is a bijection, which means that <img src="https://eli.thegreenplace.net/images/math/85f3f39d0e9a695c2599b186606532ff4e1831c0.gif" /> (we can map each element of <img src="https://eli.thegreenplace.net/images/math/424ca85d95fe170ac1502d98d8a623adb807fa41.gif" /> to one and only one element of <img src="https://eli.thegreenplace.net/images/math/96ceb9b4d8ba9dc94b2358619f4de892b0cb392e.gif" />).</p> <p>We've <a class="reference external" href="http://eli.thegreenplace.net/2009/07/17/equivalence-classes-and-group-partitions/">previously shown</a> that the equivalence classes of <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> partition <em>G</em>. But now we see that the size of each equivalence class is equal to <img src="https://eli.thegreenplace.net/images/math/96ceb9b4d8ba9dc94b2358619f4de892b0cb392e.gif" />. Therefore, all the equivalence classes are of the same size, and <img src="https://eli.thegreenplace.net/images/math/b8be964828998afcc8345c4168ff9e9bee619879.gif" /> where <em>t</em> is the number of equivalence classes.
This proves Lagrange's theorem.</p> <p>We're almost there. To see how all of this relates to Euler's theorem, let's first define the order of an element of a group.</p> <p><strong>(VI) Order of group element:</strong> Let <img src="https://eli.thegreenplace.net/images/math/4cbe37e25ff6e34b50a2ef01190bc26af1cc355e.gif" />. If there exists a positive integer <em>n</em> such that <img src="https://eli.thegreenplace.net/images/math/2d87ae91e6616a94fed293028372e33750f1cfc7.gif" />, then a is said to have <strong>finite order</strong> and the smallest such positive integer is called the <strong>order</strong> of <em>a</em>, denoted by <img src="https://eli.thegreenplace.net/images/math/da44ab0ab8337608c62ebeecc3ea57ebf47707a3.gif" />.</p> <p>We'll also define the subgroup <strong>generated</strong> by an element:</p> <p><strong>(VII) Cyclic subgroup:</strong> <img src="https://eli.thegreenplace.net/images/math/4c75b2ed8003dc064a1436983872dee8d997813b.gif" /> is a cyclic subgroup of <em>G</em> generated by <img src="https://eli.thegreenplace.net/images/math/4cbe37e25ff6e34b50a2ef01190bc26af1cc355e.gif" />. For a finite <em>G</em> this subgroup is also finite, and its size is: <img src="https://eli.thegreenplace.net/images/math/f107fd88a660909965da200640fae4efa5e1a8ba.gif" />.</p> <p>Armed with these definitions, we're now ready for the following corollary of Lagrange's theorem:</p> <p><strong>(VIII) Lagrange theorem corollary:</strong> Let <em>G</em> be a finite group of order <em>n</em>. 
Then:</p> <ol class="arabic simple"> <li>For any <img src="https://eli.thegreenplace.net/images/math/4cbe37e25ff6e34b50a2ef01190bc26af1cc355e.gif" />, <img src="https://eli.thegreenplace.net/images/math/da44ab0ab8337608c62ebeecc3ea57ebf47707a3.gif" /> divides <em>n</em></li> <li>For any <img src="https://eli.thegreenplace.net/images/math/4cbe37e25ff6e34b50a2ef01190bc26af1cc355e.gif" />, <img src="https://eli.thegreenplace.net/images/math/2d87ae91e6616a94fed293028372e33750f1cfc7.gif" /></li> </ol> <p>Proof: As we've seen, <img src="https://eli.thegreenplace.net/images/math/f107fd88a660909965da200640fae4efa5e1a8ba.gif" /> and by Lagrange's theorem <img src="https://eli.thegreenplace.net/images/math/f1d20c2a90429ea2c3ad65b999596d80f1dbbd8a.gif" /> divides <em>n</em> (since <img src="https://eli.thegreenplace.net/images/math/003dd0b9f9b592b676d030e7da22e89a06339deb.gif" /> is a subgroup of <em>G</em>). Therefore (1) is proven. For (2), note that if <em>a</em> has order <em>m</em>, then by (1) we have <img src="https://eli.thegreenplace.net/images/math/cd5cdf72717cbde659275a92956c6c74904afc18.gif" /> for some integer <em>q</em>. Thus <img src="https://eli.thegreenplace.net/images/math/346b1071cf373e6327585c473b0f7ff643c5318b.gif" />. But <em>a</em> has order <em>m</em>, so <img src="https://eli.thegreenplace.net/images/math/8b56cce28b56a3bfe81e1612d183dfc042bd402e.gif" /> and therefore <img src="https://eli.thegreenplace.net/images/math/9ccb8587d38930ca20bae3ad4c6c9d4215e3ced5.gif" />. <em>Q.E.D.</em></p> <p>We now finally have all the tools required to prove Euler's theorem.</p> <p>Proof of Euler's theorem: Let <img src="https://eli.thegreenplace.net/images/math/ae4ae2b95770589320bf2a1844bc34c8afac7f18.gif" /> be the group of units modulo <em>n</em>. The order of <em>G</em> is <img src="https://eli.thegreenplace.net/images/math/20bdd8a8b971fd8582ce58915d9f42ff001daef3.gif" /> (by <strong>(IV)</strong>).
Now, by <strong>(VIII)</strong> part (2), raising any congruence class to the power <img src="https://eli.thegreenplace.net/images/math/20bdd8a8b971fd8582ce58915d9f42ff001daef3.gif" /> must give the identity element. The statement <img src="https://eli.thegreenplace.net/images/math/41bfcd931fdf749bf761b9e98ae5bf9e1e050a5c.gif" /> is equivalent to <img src="https://eli.thegreenplace.net/images/math/5cf274690f4bed52d2f24bf39482492aaaf6d135.gif" /></p> <p><img src="https://eli.thegreenplace.net/images/math/7b47d4175993a732aa2287de666a82273110f26e.gif" /></p> Equivalence classes and group partitions2009-07-17T15:47:57-07:002009-07-17T15:47:57-07:00Eli Benderskytag:eli.thegreenplace.net,2009-07-17:/2009/07/17/equivalence-classes-and-group-partitions <p>In this post I want to show some interesting definitions and theorems about equivalence relations &amp; classes, and groups.</p> <p><em>Relations</em> are an important topic in algebra. Conceptually, a relation is a statement <tt class="docutils literal"><span class="pre">aRb</span></tt> about two elements of a set.
If the elements are integers, then <img src="https://eli.thegreenplace.net/images/math/ccff2fee4b15e0b46f79f86ce5d1de59163bb483.gif" /> is a relation, and so is <img src="https://eli.thegreenplace.net/images/math/291666cb9894498f52e69a8e08f287ca771c204d.gif" />.</p> <p>Here's a formal set-theoretic definition:</p> <p><strong>(I) Binary relation:</strong> A <em>binary relation</em> on a set <tt class="docutils literal"><span class="pre">A</span></tt> is a collection of ordered pairs of elements of <tt class="docutils literal"><span class="pre">A</span></tt>. In other words, it is a subset of <img src="https://eli.thegreenplace.net/images/math/bc659bc638626217264a2aa7a0cca55c0cc40ddc.gif" />. More generally, a binary relation between two sets <tt class="docutils literal"><span class="pre">A</span></tt> and <tt class="docutils literal"><span class="pre">B</span></tt> is a subset of <img src="https://eli.thegreenplace.net/images/math/61589f4d75ca185c6165e5108883b41f5b630222.gif" />.</p> <p>Note it says <em>ordered pairs</em>. What this means is that the order of elements in a relation is important.
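</p>

<p>To make the definition concrete (this small sketch is ours, not from the original post), a relation on a finite set can be modeled in Python directly as a set of ordered pairs:</p>

```python
# The ">=" relation on A = {1, 2, 3}, represented as a subset of A x A.
A = {1, 2, 3}
geq = {(a, b) for a in A for b in A if a >= b}

# The pairs are ordered: (3, 1) is in the relation, but (1, 3) is not.
assert (3, 1) in geq
assert (1, 3) not in geq
```

<p>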
Intuitively, given the relation <img src="https://eli.thegreenplace.net/images/math/14d75ce806cfa95e2165e15f3f40cbc02b2526a1.gif" /> and the set of integers, it's clear that <img src="https://eli.thegreenplace.net/images/math/291666cb9894498f52e69a8e08f287ca771c204d.gif" /> does not generally imply <img src="https://eli.thegreenplace.net/images/math/7edbd8459c300b48c7a5bfdb112e3294e51f3788.gif" />.</p> <p>An important sub-class of relations we'll be most interested in is the <em>equivalence relations</em>:</p> <p><strong>(II) Equivalence relation:</strong> A relation <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> on a set <tt class="docutils literal"><span class="pre">S</span></tt> is called an <em>equivalence relation</em> if it's reflexive, symmetric and transitive:</p> <ul class="simple"> <li>Reflexive: for all <tt class="docutils literal"><span class="pre">a</span></tt> in <tt class="docutils literal"><span class="pre">S</span></tt> it holds that <img src="https://eli.thegreenplace.net/images/math/8837c782dfc7a199063f41e8a89f3d2f6968f863.gif" />.</li> <li>Symmetric: for all <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt> in <tt class="docutils literal"><span class="pre">S</span></tt> it holds that if <img src="https://eli.thegreenplace.net/images/math/e30651047d0c214a1dcc4ca726674497a16b692f.gif" /> then <img src="https://eli.thegreenplace.net/images/math/9541931763753defc9183053ae757c301d2de115.gif" />.</li> <li>Transitive: for all <tt class="docutils literal"><span class="pre">a</span></tt>, <tt class="docutils literal"><span class="pre">b</span></tt> and <tt class="docutils literal"><span class="pre">c</span></tt> in <tt class="docutils literal"><span class="pre">S</span></tt> it holds that if <img src="https://eli.thegreenplace.net/images/math/e30651047d0c214a1dcc4ca726674497a16b692f.gif" /> and <img
src="https://eli.thegreenplace.net/images/math/8ca3658ef957edae19b4b119308890e2b3f49796.gif" /> then <img src="https://eli.thegreenplace.net/images/math/0179f9af305ab8e1c766267cfd11318b6b4811bf.gif" />.</li> </ul> <p>Examples: equality is an equivalence relation, but greater-or-equal is not, as <img src="https://eli.thegreenplace.net/images/math/291666cb9894498f52e69a8e08f287ca771c204d.gif" /> doesn't imply <img src="https://eli.thegreenplace.net/images/math/7edbd8459c300b48c7a5bfdb112e3294e51f3788.gif" />: the symmetric condition doesn't hold.</p> <p>Another important equivalence relation is congruence modulo an integer, <img src="https://eli.thegreenplace.net/images/math/0c4ce290323e491e2482e46e01db353836b45d7d.gif" />.</p> <p><strong>(III) Example:</strong> Let <tt class="docutils literal"><span class="pre">G</span></tt> be a group and <tt class="docutils literal"><span class="pre">H</span></tt> a subgroup of <tt class="docutils literal"><span class="pre">G</span></tt>. For <img src="https://eli.thegreenplace.net/images/math/6678de816bbd60aedd8839895cd79707af6a97d8.gif" />, define <img src="https://eli.thegreenplace.net/images/math/e30651047d0c214a1dcc4ca726674497a16b692f.gif" /> if <img src="https://eli.thegreenplace.net/images/math/3071dc07c816d497eb3e1fc0340e7cac7016cf5c.gif" />. Then <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> is an equivalence relation on <tt class="docutils literal"><span class="pre">G</span></tt>.</p> <p>Proof: To prove that some relation is an equivalence relation, we have to prove the three properties of equivalence relations.</p> <p>Reflexive: <img src="https://eli.thegreenplace.net/images/math/a17e86615bef6d2199ab48b9b1dcb5b014565084.gif" /> is the identity element <tt class="docutils literal"><span class="pre">e</span></tt>.
Since <tt class="docutils literal"><span class="pre">H</span></tt> is a subgroup of <tt class="docutils literal"><span class="pre">G</span></tt>, it contains the identity: <img src="https://eli.thegreenplace.net/images/math/3e0c664e92b67cb72df9d035b6b604b6350fe921.gif" />. Therefore <img src="https://eli.thegreenplace.net/images/math/34527efbfb9e31e54e0852df71b1c49e95ccdfbc.gif" />, so <img src="https://eli.thegreenplace.net/images/math/8837c782dfc7a199063f41e8a89f3d2f6968f863.gif" />.</p> <p>Symmetric: Assume that <img src="https://eli.thegreenplace.net/images/math/3071dc07c816d497eb3e1fc0340e7cac7016cf5c.gif" />. Since <tt class="docutils literal"><span class="pre">H</span></tt> is a subgroup, this element has an inverse in <tt class="docutils literal"><span class="pre">H</span></tt>: <img src="https://eli.thegreenplace.net/images/math/9e3f7534ed8dab84ae64ab9f0ec20772c69ddff0.gif" />. Using the associative law of groups several times it's possible to show that <img src="https://eli.thegreenplace.net/images/math/7317708012349dbcc4b3eb70f95119ae9c286203.gif" />. So <img src="https://eli.thegreenplace.net/images/math/553bb94e84e7af685533d0ee89de27b427e65802.gif" />, hence <img src="https://eli.thegreenplace.net/images/math/9541931763753defc9183053ae757c301d2de115.gif" />.</p> <p>Transitive: Suppose <img src="https://eli.thegreenplace.net/images/math/3071dc07c816d497eb3e1fc0340e7cac7016cf5c.gif" /> and <img src="https://eli.thegreenplace.net/images/math/f11dbfe83910887c218813c7a4cb90fc7fed83f8.gif" />.
Then <img src="https://eli.thegreenplace.net/images/math/2dc110b11d970dfda421a11dd07ecc95ded21823.gif" />. So <img src="https://eli.thegreenplace.net/images/math/cad936cd9155324d4258bc2368ee68a2c9a6b88e.gif" />, hence <img src="https://eli.thegreenplace.net/images/math/0179f9af305ab8e1c766267cfd11318b6b4811bf.gif" />.</p> <p>Thus, we've proved that this <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> is an equivalence relation. Let's see a concrete application of the result we've just proved:</p> <p>Consider <img src="https://eli.thegreenplace.net/images/math/55a2a59ab8896d9110c3d3c055c8c8f3b5c88297.gif" /> with the operation <tt class="docutils literal"><span class="pre">+</span></tt>, and let <tt class="docutils literal"><span class="pre">H</span></tt> be the subgroup consisting of all multiples of some <img src="https://eli.thegreenplace.net/images/math/719262cd45830248133d8a9183d18f9b43c1a7cb.gif" />. Then <img src="https://eli.thegreenplace.net/images/math/3071dc07c816d497eb3e1fc0340e7cac7016cf5c.gif" /> actually means that <img src="https://eli.thegreenplace.net/images/math/54a610609dd020d4a60658c7d44b97d9dbc04dfb.gif" /> for some <img src="https://eli.thegreenplace.net/images/math/f8b63b37c9f85fea422386a6a34535a2f2a7cc07.gif" />. In other words, <img src="https://eli.thegreenplace.net/images/math/0c4ce290323e491e2482e46e01db353836b45d7d.gif" />.
This proves that congruence modulo <tt class="docutils literal"><span class="pre">n</span></tt> is an equivalence relation, since it's a special case of (III).</p> <p><strong>(IV) Equivalence class:</strong> If <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> is an equivalence relation on <tt class="docutils literal"><span class="pre">S</span></tt>, then <tt class="docutils literal"><span class="pre">[a]</span></tt>, the <em>equivalence class of a</em>, is defined by <img src="https://eli.thegreenplace.net/images/math/4f1018740fe8826cbe9aae04c0ef6642794359e5.gif" />.</p> <p>For example, let's take the integers <img src="https://eli.thegreenplace.net/images/math/b719c7ce5a7442a3bf64a8fa268fc460dcd2f3a3.gif" /> and define an equivalence relation &quot;congruent modulo 5&quot;. For instance, <img src="https://eli.thegreenplace.net/images/math/15ea75a387b7bed5bffb2ec749050bcb69efda6d.gif" />. The congruence class of 1 modulo 5 (denoted <img src="https://eli.thegreenplace.net/images/math/51c8de74b436372084218b4c20c9b56b31841b9d.gif" />) is <img src="https://eli.thegreenplace.net/images/math/21a88de3471dc39ecebb41416ef2440fa87d41f7.gif" />.</p> <p><strong>(V) Group partition:</strong> If <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> is an equivalence relation on <tt class="docutils literal"><span class="pre">S</span></tt>, then <img src="https://eli.thegreenplace.net/images/math/4e2b5a0104ce54ff58d1bfef5bf627f27b3c3144.gif" /> for all <img src="https://eli.thegreenplace.net/images/math/f27f1b6ae66beb6c65d284fef3c58b10699c72e3.gif" />, and <img src="https://eli.thegreenplace.net/images/math/002cf7d039da1d117c5f9020c2c22d6636a4c559.gif" /> implies that <img src="https://eli.thegreenplace.net/images/math/c0849a8679091eddc0ea75adcac02b4c8e4b8536.gif" />.
In other words, <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> partitions <tt class="docutils literal"><span class="pre">S</span></tt> into disjoint equivalence classes.</p> <p>Proof: the first part is easy. Since <img src="https://eli.thegreenplace.net/images/math/52809cd6fe155fec2dc9497412b1a27ab85f1f6c.gif" /> always holds, <img src="https://eli.thegreenplace.net/images/math/c1f41fa09366f686d3c80f12c4bc1af5026d64c9.gif" />. To prove the second part, we'll show that if <img src="https://eli.thegreenplace.net/images/math/3674bbe6defde731553a80353589f1857d60505f.gif" /> then <img src="https://eli.thegreenplace.net/images/math/eb3b4462d76bf469f89dc6ded2d3acc1ab80a036.gif" />.</p> <p>Suppose that <img src="https://eli.thegreenplace.net/images/math/3674bbe6defde731553a80353589f1857d60505f.gif" />, and let <img src="https://eli.thegreenplace.net/images/math/f70628deb363032a930c8efe9c02ce456be4e741.gif" />. Therefore <img src="https://eli.thegreenplace.net/images/math/f0d3c43681c62263d12367105cd54ac13189adf3.gif" /> and <img src="https://eli.thegreenplace.net/images/math/a5c98d9612eb21bebe6860ac91e618336bd2c45a.gif" />. But <img src="https://eli.thegreenplace.net/images/math/ec2243ccce8948e68890aabd1ce859dbb83defe1.gif" /> is an equivalence relation and thus is transitive and symmetric. So <img src="https://eli.thegreenplace.net/images/math/e30651047d0c214a1dcc4ca726674497a16b692f.gif" />. But this means that <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt> are in the same equivalence class: <img src="https://eli.thegreenplace.net/images/math/eb3b4462d76bf469f89dc6ded2d3acc1ab80a036.gif" />.
<em>Q.E.D.</em></p> Detexify recognizes hand-written math symbols2009-07-13T05:31:16-07:002009-07-13T05:31:16-07:00Eli Benderskytag:eli.thegreenplace.net,2009-07-13:/2009/07/13/detexify-recognizes-hand-written-math-symbols Does it ever happen to you that you don't remember the LaTeX code for some mathematical symbol? What can you do then except wade through pages of LaTeX symbols trying to locate the right one? Well, no more! <a href="http://detexify.kirelabs.org/classify.html">Detexify</a> is a great new service that allows you to "draw" the symbol you're looking for: <p> <img src="https://eli.thegreenplace.net/images/2009/07/intg_handwriting.png" title="intg_handwriting" width="301" height="366" class="alignnone size-full wp-image-1801" /> </p> ... and it will suggest the LaTeX code. <p> <img src="https://eli.thegreenplace.net/images/2009/07/intg_suggestions.png" title="intg_suggestions" width="287" height="310" class="alignnone size-full wp-image-1802" /> </p> Detexify is a learning OCR classifier, and can be "trained" by users to improve its performance. Kudos to the <a href="http://kirelabs.org/">creator</a> of Detexify for a great project. It will definitely be useful...
Generating multi-subsets using arithmetic2009-07-11T07:27:13-07:002009-07-11T07:27:13-07:00Eli Benderskytag:eli.thegreenplace.net,2009-07-11:/2009/07/11/generating-multi-subsets-using-arithmetic <p>In the past <a class="reference external" href="http://eli.thegreenplace.net/2005/03/29/application-of-combinations/">I've written</a> about how simple arithmetic can be employed to compute a powerset of a given set.</p> <p>Here I want to show a generalization that uses n-ary arithmetic. But first, let's define the problem:</p> <p>Suppose you have a set of elements and you want to select multi-subsets from it. By multi-subset in this context I mean that an element can appear more than once in it. For example, given the set {0, 1, 2, 3, 4, 5}, then {1, 1, 2} is a multi-subset. So are {5, 5, 5, 5} and {0, 1, 2, 3, 4, 5}. Suppose you want to go over <em>all</em> multi-subsets of a set. How can this be done?</p> <p>Note that generating the powerset is a special case of this problem, restricting each element to appear either 0 or 1 times in the resulting subset.</p> <p>So the solution is a generalization of the <a class="reference external" href="http://eli.thegreenplace.net/2005/03/29/application-of-combinations/">binary-arithmetic solution</a> for the powerset problem.</p> <p>Intuitive motivation: consider the decimal numbers, for example 25. If we use the position of each digit (starting with the units) to convey information, this leads to an interesting observation.
If we have two elements to choose from, 25 may mean 5 occurrences of the first element and 2 of the second. Now, going over all numbers from 0 to 99, we are actually generating all multi-subsets of two elements where each can be picked from 0 to 9 times.</p> <p>Once this is clear, the algorithm is simple. Let's generalize to an n-ary base system, using position to point to an element and the 'digit' at this position to say how many times it appears in a given multi-subset. And the best part: the simple rules of addition with carry can now be used to efficiently generate all multi-subsets, given the number of elements we have (<tt class="docutils literal"><span class="pre">length</span></tt>) and the maximal number of times each can be picked (<tt class="docutils literal"><span class="pre">upto</span></tt>), the minimum being assumed 0.</p> <p>Here's the code:</p> <div class="highlight"><pre>
def multiselects(upto, length):
    # Arithmetically, we create an array of digits
    # (each in the range 0..upto).
    # It's initialized with '1'
    #
    ar = [1] + [0] * (length - 1)

    while True:
        yield ar

        # The index we're currently trying to
        # advance
        #
        idx = 0

        # Advance the current index. If it reaches
        # the limit (upto), perform a carry to the
        # next index (digit)
        #
        while idx &lt; length:
            ar[idx] += 1
            if ar[idx] &lt;= upto:
                break
            else:
                ar[idx] = 0
                idx += 1

        # We've reached the last number...
        #
        if idx == length:
            break
</pre></div> <p>An example run of:</p> <div class="highlight"><pre>
for s in multiselects(2, 3):
    print s
</pre></div> <p>Produces:</p> <div class="highlight"><pre>
[1, 0, 0]
[2, 0, 0]
[0, 1, 0]
[1, 1, 0]
[2, 1, 0]
[0, 2, 0]
[1, 2, 0]
[2, 2, 0]
[0, 0, 1]
[1, 0, 1]
[2, 0, 1]
[0, 1, 1]
[1, 1, 1]
[2, 1, 1]
[0, 2, 1]
[1, 2, 1]
[2, 2, 1]
[0, 0, 2]
[1, 0, 2]
[2, 0, 2]
[0, 1, 2]
[1, 1, 2]
[2, 1, 2]
[0, 2, 2]
[1, 2, 2]
[2, 2, 2]
</pre></div> <p>Note that the solution is general, as the lists it returns are lists of indices. These can be employed with any set to generate multi-subsets.</p> <p><strong>Background and links</strong></p> <p>I came up with this function while working on Project Euler's problem 77.
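</p>

<p>For comparison (this sketch is ours, not part of the original post), the same count arrays can be produced with the standard library's <tt class="docutils literal"><span class="pre">itertools.product</span></tt>, which performs the same n-ary counting:</p>

```python
from itertools import product

def multiselects_product(upto, length):
    # product() varies the last position fastest; reversing each tuple
    # reproduces the "least significant digit first" order of multiselects.
    for digits in product(range(upto + 1), repeat=length):
        yield list(reversed(digits))
```

<p>Unlike the original generator, this also yields the all-zero array (the empty multi-subset) as its first item; the rest of the sequence is identical.</p>

<p>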
I ended up using a different method, but visualizing the possible partitions of primes was very useful.</p> <p>Here are some interesting mathematical links related to this problem:</p> <ul class="simple"> <li><a class="reference external" href="http://mathworld.wolfram.com/EulerTransform.html">Euler transform</a></li> <li><a class="reference external" href="http://mathworld.wolfram.com/PartitionFunctionP.html">Partition function P</a></li> <li><a class="reference external" href="http://mathworld.wolfram.com/PrimePartition.html">Prime partition</a></li> </ul> The GCD and linear combinations2009-07-10T07:49:39-07:002009-07-10T07:49:39-07:00Eli Benderskytag:eli.thegreenplace.net,2009-07-10:/2009/07/10/the-gcd-and-linear-combinations <p>A linear combination of <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt> is some integer of the form
<img src="https://eli.thegreenplace.net/images/math/62b14b4483debf849fb89c978fc9d8de667d50ee.gif" />, where <img src="https://eli.thegreenplace.net/images/math/4d7d79c7dff51ad08353b3af8ec8de78276f7d02.gif" />.</p> <p>There's a very interesting theorem that gives a useful connection between linear combinations and the GCD of <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt>, called <a class="reference external" href="http://en.wikipedia.org/wiki/B%C3%A9zout%27s_identity">Bézout's identity</a>:</p> <p><strong>Bézout's identity:</strong> <img src="https://eli.thegreenplace.net/images/math/3faea922fd91a480e6e951bdbe6568c36de8a854.gif" /> (the GCD of <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt>) is the smallest positive linear combination of non-zero <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt>.</p> <p>Both Bézout's identity and its corollary I show below are very useful tools in elementary number theory, being used for the proofs of many of the most fundamental theorems. Let's see why it's true.</p> <p><strong>(I) Intuition:</strong> First I'd like to explain this (surprising at first sight) theorem intuitively. By definition, any common divisor of <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt> will divide <img src="https://eli.thegreenplace.net/images/math/62b14b4483debf849fb89c978fc9d8de667d50ee.gif" /> for all <img src="https://eli.thegreenplace.net/images/math/4d7d79c7dff51ad08353b3af8ec8de78276f7d02.gif" />.
In particular, <img src="https://eli.thegreenplace.net/images/math/3faea922fd91a480e6e951bdbe6568c36de8a854.gif" /> also divides any <img src="https://eli.thegreenplace.net/images/math/62b14b4483debf849fb89c978fc9d8de667d50ee.gif" />.</p> <p>Now, assume we've found some small <img src="https://eli.thegreenplace.net/images/math/d7f9da6f2362bc7b8efdffe11cdf4ba00597aba0.gif" /> which isn't the GCD. But we've just said that <img src="https://eli.thegreenplace.net/images/math/3faea922fd91a480e6e951bdbe6568c36de8a854.gif" /> divides all linear combinations, so it also divides <tt class="docutils literal"><span class="pre">x</span></tt>. Therefore, <tt class="docutils literal"><span class="pre">x</span></tt> cannot be smaller than the GCD. In other words, the smallest positive linear combination can only be <img src="https://eli.thegreenplace.net/images/math/3faea922fd91a480e6e951bdbe6568c36de8a854.gif" /> itself.</p> <p><strong>(II) Corollary:</strong> An integer is a linear combination of <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt> if and only if it is a multiple of their GCD.</p> <p>To prove Bézout's identity more formally, and along the way to see why the corollary is also true, let's first prove the following:</p> <p><strong>(III)</strong> Let <tt class="docutils literal"><span class="pre">I</span></tt> be a nonempty set of integers that is closed under addition and subtraction, and contains at least one non-zero integer.
Then there exists a smallest positive element <img src="https://eli.thegreenplace.net/images/math/beff059ebb50376acab168901db80ded7949e1fa.gif" />, and <tt class="docutils literal"><span class="pre">I</span></tt> consists of all multiples of <tt class="docutils literal"><span class="pre">b</span></tt> (<img src="https://eli.thegreenplace.net/images/math/e7937e0bb65c4938597a8228eeb17a91f07986f9.gif" />).</p> <p>Proof: <tt class="docutils literal"><span class="pre">I</span></tt> contains at least one non-zero integer. Then it definitely contains at least one positive integer, because it is closed under addition and subtraction. Assume we have <img src="https://eli.thegreenplace.net/images/math/b36dc5cdf3bcc0208bf87c0f98375fd5eabf0ed8.gif" /> for some <img src="https://eli.thegreenplace.net/images/math/f9e61579d79e8c1c1fa9119bd87d3d61f00e8c19.gif" />. Therefore <img src="https://eli.thegreenplace.net/images/math/55f6c1c701b762759d0ab1491f781999ed43db6e.gif" /> and then also <img src="https://eli.thegreenplace.net/images/math/6486f2162b7ae30383261a7e5cdaa6288cfa88c4.gif" />. Thus we have positive integers in <tt class="docutils literal"><span class="pre">I</span></tt>. According to the <a class="reference external" href="http://eli.thegreenplace.net/2009/07/09/the-well-ordering-principle">well-ordering principle</a>, <tt class="docutils literal"><span class="pre">I</span></tt> has a smallest positive element which we'll call <tt class="docutils literal"><span class="pre">b</span></tt>.</p> <p>Now we'll want to show that <img src="https://eli.thegreenplace.net/images/math/e7937e0bb65c4938597a8228eeb17a91f07986f9.gif" />. 
As usual, to prove equalities of sets, it will be shown that they contain one another.</p> <p><img src="https://eli.thegreenplace.net/images/math/55cd1587cf04efebfb6e92f8581388e230593775.gif" /> is obvious - since <tt class="docutils literal"><span class="pre">I</span></tt> contains <tt class="docutils literal"><span class="pre">b</span></tt> and is closed under addition and subtraction, it contains all the multiples of <tt class="docutils literal"><span class="pre">b</span></tt>.</p> <p>To prove <img src="https://eli.thegreenplace.net/images/math/4a8035ec5843674bf59e5e2e2b3d59ea6117698c.gif" /> we'll demonstrate that any element <img src="https://eli.thegreenplace.net/images/math/5cf7becda6c13605eca7634f4c7f794056f756e1.gif" /> is a multiple of <tt class="docutils literal"><span class="pre">b</span></tt>. Using the division algorithm we write <img src="https://eli.thegreenplace.net/images/math/9a77ecf748d7f958c6f632f3906a7a9e61e0ad39.gif" /> for some integers <tt class="docutils literal"><span class="pre">q</span></tt> and <tt class="docutils literal"><span class="pre">0</span> <span class="pre">&lt;=</span> <span class="pre">r</span> <span class="pre">&lt;</span> <span class="pre">b</span></tt>. But this means that <img src="https://eli.thegreenplace.net/images/math/4525123efb61ba233370b72315d8557151fadfaa.gif" /> (because <tt class="docutils literal"><span class="pre">I</span></tt> contains <tt class="docutils literal"><span class="pre">bq</span></tt> and <tt class="docutils literal"><span class="pre">c</span></tt> and is closed under subtraction and addition). However, recall that <tt class="docutils literal"><span class="pre">b</span></tt> was chosen to be the smallest positive element of <tt class="docutils literal"><span class="pre">I</span></tt>, so <tt class="docutils literal"><span class="pre">r</span></tt> must be equal to 0. 
Therefore <tt class="docutils literal"><span class="pre">c</span></tt> is a multiple of <tt class="docutils literal"><span class="pre">b</span></tt>, and we have shown that <img src="https://eli.thegreenplace.net/images/math/4a8035ec5843674bf59e5e2e2b3d59ea6117698c.gif" />. <em>Q.E.D.</em></p> <p>Now back to Bézout's identity. We'll define:</p> <p><img src="https://eli.thegreenplace.net/images/math/9ce9860e1da4b572d6a076d2b48b4219d6980495.gif" /></p> <p>This <tt class="docutils literal"><span class="pre">I</span></tt> is obviously non-empty and is closed under addition and subtraction (by its definition as a linear combination). Note, in particular, that it also contains <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt>. By <strong>(III)</strong>, <tt class="docutils literal"><span class="pre">I</span></tt> consists of all multiples of its smallest positive element, which we'll call <tt class="docutils literal"><span class="pre">d</span></tt> here.</p> <p>To show that <img src="https://eli.thegreenplace.net/images/math/021c7f3f9c969802e3a91c61b8bc9c24400350f6.gif" /> we have to show that <tt class="docutils literal"><span class="pre">d|a</span></tt>, <tt class="docutils literal"><span class="pre">d|b</span></tt> and if <tt class="docutils literal"><span class="pre">c|a</span></tt> and <tt class="docutils literal"><span class="pre">c|b</span></tt> then <tt class="docutils literal"><span class="pre">c|d</span></tt>. First, by definition <tt class="docutils literal"><span class="pre">d</span></tt> is a divisor of any element in <tt class="docutils literal"><span class="pre">I</span></tt>, so it also divides <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt>. 
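</p> <p>As an aside (not part of the proof itself), the coefficients whose existence Bézout's identity guarantees can be computed constructively with the extended Euclidean algorithm. A minimal sketch, with an illustrative function name of my choosing:</p>

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    x0, x1, y0, y1 = 1, 0, 0, 1
    while b:
        # Invariant: a == a_orig*x0 + b_orig*y0 and b == a_orig*x1 + b_orig*y1
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

g, x, y = extended_gcd(240, 46)
assert g == 2 and 240 * x + 46 * y == 2
```

<p>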
If <tt class="docutils literal"><span class="pre">c|a</span></tt> and <tt class="docutils literal"><span class="pre">c|b</span></tt>, say <tt class="docutils literal"><span class="pre">a=cq</span></tt> and <tt class="docutils literal"><span class="pre">b=cp</span></tt>, then:</p> <p><img src="https://eli.thegreenplace.net/images/math/29c402dd016d08d3af7b751065de7690102403dc.svg" /></p> <p>So <tt class="docutils literal"><span class="pre">c|d</span></tt>, which completes our proof that <tt class="docutils literal"><span class="pre">d=(a,b)</span></tt>. <em>Q.E.D.</em></p> <p>Regarding the corollary, it stems trivially from the definition of <tt class="docutils literal"><span class="pre">I</span></tt> and the proof above.</p> The well-ordering principle2009-07-09T09:25:53-07:002009-07-09T09:25:53-07:00Eli Benderskytag:eli.thegreenplace.net,2009-07-09:/2009/07/09/the-well-ordering-principle <p>The <a class="reference external" href="http://en.wikipedia.org/wiki/Well-ordering_principle">well-ordering principle</a> states:</p> <p><strong>The well-ordering principle:</strong> Any nonempty set of nonnegative integers has a smallest element.</p> <p><em>DUH, you don't say!</em> - seems obvious, doesn't it? This principle is, nevertheless, a very important and fundamental tool for proving other basic principles of number theory.</p> <p>Consider, for instance, the <a class="reference external" href="http://en.wikipedia.org/wiki/Division_algorithm">Division Algorithm</a>:</p> <p><strong>The …</strong></p> <p>The <a class="reference external" href="http://en.wikipedia.org/wiki/Well-ordering_principle">well-ordering principle</a> states:</p> <p><strong>The well-ordering principle:</strong> Any nonempty set of nonnegative integers has a smallest element.</p> <p><em>DUH, you don't say!</em> - seems obvious, doesn't it? 
This principle is, nevertheless, a very important and fundamental tool for proving other basic principles of number theory.</p> <p>Consider, for instance, the <a class="reference external" href="http://en.wikipedia.org/wiki/Division_algorithm">Division Algorithm</a>:</p> <p><strong>The Division Algorithm:</strong> If <tt class="docutils literal"><span class="pre">m</span></tt> and <tt class="docutils literal"><span class="pre">n</span></tt> are integers with <tt class="docutils literal"><span class="pre">n</span> <span class="pre">&gt;</span> <span class="pre">0</span></tt>, then there exist integers <tt class="docutils literal"><span class="pre">q</span></tt> and <tt class="docutils literal"><span class="pre">r</span></tt>, with <tt class="docutils literal"><span class="pre">0</span> <span class="pre">&lt;=</span> <span class="pre">r</span> <span class="pre">&lt;</span> <span class="pre">n</span></tt>, such that <img src="https://eli.thegreenplace.net/images/math/c2245e91a0ec659885e704a303ba2eff8e6045a0.gif" />.</p> <p>Again, this is so basic that one may doubt whether it should even be proved. But the well-ordering principle allows us, in fact, to prove the division algorithm in a rigorous manner:</p> <p>Let <img src="https://eli.thegreenplace.net/images/math/29968af66b01fdd8c5de887a44db0b4e60f8fce7.gif" />. It is obvious that <tt class="docutils literal"><span class="pre">W</span></tt> contains nonnegative integers. Let <img src="https://eli.thegreenplace.net/images/math/d398cf3839320215d8508f5492d19d4b5dbd83b8.gif" />. <em>By the well-ordering principle</em>, <tt class="docutils literal"><span class="pre">V</span></tt> has a smallest element, which we'll call <tt class="docutils literal"><span class="pre">r</span></tt>. 
<img src="https://eli.thegreenplace.net/images/math/7c93b5e9572f2a5efb01641ce59940f15f54e1e4.gif" />, so <img src="https://eli.thegreenplace.net/images/math/2b2c0fb96d6759348beadd2c624aa1d92b13a264.gif" /> for some <tt class="docutils literal"><span class="pre">q</span></tt> and <tt class="docutils literal"><span class="pre">r</span> <span class="pre">&gt;=</span> <span class="pre">0</span></tt> (by the definition of sets <tt class="docutils literal"><span class="pre">W</span></tt> and <tt class="docutils literal"><span class="pre">V</span></tt>, correspondingly).</p> <p>Now, what's left to prove is that <tt class="docutils literal"><span class="pre">r</span> <span class="pre">&lt;</span> <span class="pre">n</span></tt>. Let's assume the opposite, namely that <img src="https://eli.thegreenplace.net/images/math/b1a067328be8dd74e839ad82f6ebd20089567001.gif" />. Rearranging: <img src="https://eli.thegreenplace.net/images/math/6d4c9ea92f063dd1610459a98e79b2332c2a17e4.gif" />. By the definition of <tt class="docutils literal"><span class="pre">V</span></tt>, <img src="https://eli.thegreenplace.net/images/math/48f703813d8cbec496b2dd24d6a87684843c6237.gif" /> (since it has the form <img src="https://eli.thegreenplace.net/images/math/44ed40b22260264b9048f920e66afd51a982e97f.gif" /> for some integer <tt class="docutils literal"><span class="pre">t</span></tt> and is nonnegative). But recall that we called <tt class="docutils literal"><span class="pre">r</span></tt> the smallest element of <tt class="docutils literal"><span class="pre">V</span></tt>, and <img src="https://eli.thegreenplace.net/images/math/4e8431424ab97947ea763a3df6fbbf1eb74b4d53.gif" />, so we have a contradiction.</p> <p>Therefore, we see that <tt class="docutils literal"><span class="pre">r</span> <span class="pre">&lt;</span> <span class="pre">n</span></tt>. This completes the proof. 
<em>Q.E.D.</em></p> Project Euler problem 66 and continued fractions2009-06-19T14:49:07-07:002009-06-19T14:49:07-07:00Eli Benderskytag:eli.thegreenplace.net,2009-06-19:/2009/06/19/project-euler-problem-66-and-continued-fractions <p><a class="reference external" href="http://projecteuler.net/index.php?section=problems&amp;id=66">Problem 66</a> is one of those problems that make Project Euler lots of fun. It doesn't have a brute-force solution, and to solve it one actually has to implement a non-trivial mathematical algorithm and get exposed to several interesting techniques.</p> <p>I will not post the solution or the full code …</p> <p><a class="reference external" href="http://projecteuler.net/index.php?section=problems&amp;id=66">Problem 66</a> is one of those problems that make Project Euler lots of fun. It doesn't have a brute-force solution, and to solve it one actually has to implement a non-trivial mathematical algorithm and get exposed to several interesting techniques.</p> <p>I will not post the solution or the full code for the problem here, just a couple of hints.</p> <p>After a very short bout of Googling, you'll discover that the Diophantine equation:</p> <p><img src="https://eli.thegreenplace.net/images/math/cdc11e760e8b319f652e19c6daf547cbe9d0b0f9.gif" /></p> <p>Is quite famous and is called <a class="reference external" href="http://en.wikipedia.org/wiki/Pell%27s_equation">Pell's equation</a>. From here, further web searches and Wikipedia-reading will bring you to at least two methods for finding the <em>fundamental solution</em>, which is the pair of <tt class="docutils literal"><span class="pre">x</span></tt> and <tt class="docutils literal"><span class="pre">y</span></tt> with minimal <tt class="docutils literal"><span class="pre">x</span></tt> solving it.</p> <p>One of the methods involves computing the continued-fraction representation of the square root of <tt class="docutils literal"><span class="pre">D</span></tt>. 
<a class="reference external" href="http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/cfINTRO.html">This page</a> is a must read on this topic, and will help you with other Euler problems as well.</p> <p>I want to post here a code snippet that implements the continued-fraction computation described in that link. Its steps follow the <em>Algebraic algorithm</em> given there:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">CF_of_sqrt</span>(n):
    <span style="color: #7f007f">&quot;&quot;&quot; Compute the continued fraction representation of the</span>
<span style="color: #7f007f">        square root of N.</span>
<span style="color: #7f007f">        The first element in the returned array is the whole</span>
<span style="color: #7f007f">        part of the fraction. The others are the denominators</span>
<span style="color: #7f007f">        up to (and not including) the point where it starts</span>
<span style="color: #7f007f">        repeating.</span>
<span style="color: #7f007f">        Uses the algorithm explained here:</span>
<span style="color: #7f007f">        http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/cfINTRO.html</span>
<span style="color: #7f007f">        In the section named: &quot;Methods of finding continued</span>
<span style="color: #7f007f">        fractions for square roots&quot;</span>
<span style="color: #7f007f">    &quot;&quot;&quot;</span>
    <span style="color: #00007f; font-weight: bold">if</span> is_square(n):
        <span style="color: #00007f; font-weight: bold">return</span> [<span style="color: #00007f">int</span>(math.sqrt(n))]

    ans = []
    step1_num = <span style="color: #007f7f">0</span>
    step1_denom = <span style="color: #007f7f">1</span>

    <span style="color: #00007f; font-weight: bold">while</span> <span style="color: #00007f">True</span>:
        nextn = <span style="color: #00007f">int</span>((math.floor(math.sqrt(n)) + step1_num) / step1_denom)
        ans.append(<span style="color: #00007f">int</span>(nextn))

        step2_num = step1_denom
        step2_denom = step1_num - step1_denom * nextn

        step3_denom = (n - step2_denom ** <span style="color: #007f7f">2</span>) / step2_num
        step3_num = -step2_denom

        <span style="color: #00007f; font-weight: bold">if</span> step3_denom == <span style="color: #007f7f">1</span>:
            ans.append(ans[<span style="color: #007f7f">0</span>] * <span style="color: #007f7f">2</span>)
            <span style="color: #00007f; font-weight: bold">break</span>

        step1_num, step1_denom = step3_num, step3_denom

    <span style="color: #00007f; font-weight: bold">return</span> ans
</pre></div> <p>As I said, this still isn't enough to solve the problem, but with this code in hand, the solution isn't too far. Read some more about Pell's equation and you'll discover how to use this code to reach a solution.</p> <p>It took my program ~30 milliseconds to find an answer to the problem, by the way. It took less than a second to solve a 10-times larger problem (for D &lt;= 10000), so I believe it to be a pretty good implementation.</p> Efficient modular exponentiation algorithms2009-03-28T09:51:29-07:002009-03-28T09:51:29-07:00Eli Benderskytag:eli.thegreenplace.net,2009-03-28:/2009/03/28/efficient-modular-exponentiation-algorithms <p><a class="reference external" href="http://eli.thegreenplace.net/2009/03/21/efficient-integer-exponentiation-algorithms/">Earlier this week</a> I've discussed efficient algorithms for exponentiation.</p> <p>However, for real-life needs of number theoretic computations, just raising numbers to large exponents isn't very useful, because extremely huge numbers start appearing very quickly <a class="footnote-reference" href="#id8" id="id1"></a>, and these don't have much use. 
What's much more useful is <a class="reference external" href="http://en.wikipedia.org/wiki/Modular_exponentiation">modular exponentiation</a>, raising integers …</p> <p><a class="reference external" href="http://eli.thegreenplace.net/2009/03/21/efficient-integer-exponentiation-algorithms/">Earlier this week</a> I've discussed efficient algorithms for exponentiation.</p> <p>However, for real-life needs of number theoretic computations, just raising numbers to large exponents isn't very useful, because extremely huge numbers start appearing very quickly <a class="footnote-reference" href="#id8" id="id1"></a>, and these don't have much use. What's much more useful is <a class="reference external" href="http://en.wikipedia.org/wiki/Modular_exponentiation">modular exponentiation</a>, raising integers to high powers <img src="https://eli.thegreenplace.net/images/math/5ed051f99a1984c11a5b2d4ea770f3dc527912d8.gif" /> <a class="footnote-reference" href="#id9" id="id2"></a></p> <p>Luckily, we can reuse the efficient algorithms developed in the previous article, with very few modifications to perform modular exponentiation as well. This is possible because of some convenient properties of modular arithmetic.</p> <div class="section" id="modular-multiplication"> <h3>Modular multiplication</h3> <p>Given two numbers, <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt>, their product modulo <tt class="docutils literal"><span class="pre">n</span></tt> is <img src="https://eli.thegreenplace.net/images/math/60a31df99204b91b44d3bc8b6c0b462d5302182c.gif" />. Consider the number <tt class="docutils literal"><span class="pre">x</span> <span class="pre">&lt;</span> <span class="pre">n</span></tt>, such that <img src="https://eli.thegreenplace.net/images/math/a24e63eaa528ead7411690545b6f1525adf6fd11.gif" />. 
Such a number always exists, and we usually call it the <em>remainder</em> of dividing <tt class="docutils literal"><span class="pre">a</span></tt> by <tt class="docutils literal"><span class="pre">n</span></tt>. Similarly, there is a <tt class="docutils literal"><span class="pre">y</span> <span class="pre">&lt;</span> <span class="pre">n</span></tt>, such that <img src="https://eli.thegreenplace.net/images/math/210006d46db94551446e69e871dd5aa85a917d18.gif" />. It follows from basic rules of modular arithmetic that <img src="https://eli.thegreenplace.net/images/math/13b1a1a7a7cab72a42640bc57c99dff9f9fc78dc.gif" /> <a class="footnote-reference" href="#id10" id="id3"></a></p> <p>Therefore, if we want to know the product of <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt> modulo <tt class="docutils literal"><span class="pre">n</span></tt>, we just have to keep their remainders when divided by <tt class="docutils literal"><span class="pre">n</span></tt>. Note: <tt class="docutils literal"><span class="pre">a</span></tt> and <tt class="docutils literal"><span class="pre">b</span></tt> may be arbitrarily large, but <tt class="docutils literal"><span class="pre">x</span></tt> and <tt class="docutils literal"><span class="pre">y</span></tt> are always smaller than <tt class="docutils literal"><span class="pre">n</span></tt>.</p> </div> <div class="section" id="a-naive-algorithm"> <h3>A naive algorithm</h3> <p>What is the most naive way you can think of for computing <img src="https://eli.thegreenplace.net/images/math/f342f6b456e722a0a7bf3fc6b194bcf73aab90e5.gif" />? Raise <tt class="docutils literal"><span class="pre">a</span></tt> to the power <tt class="docutils literal"><span class="pre">b</span></tt>, and then reduce modulo <tt class="docutils literal"><span class="pre">n</span></tt>. 
Right?</p> <p>Indeed, this is a very unsophisticated and slow method, because raising <tt class="docutils literal"><span class="pre">a</span></tt> to the power <tt class="docutils literal"><span class="pre">b</span></tt> can result in a really huge number that takes a long time to compute.</p> <p>For any useful number, this algorithm is so slow that I'm not even going to run it in the tests.</p> </div> <div class="section" id="using-the-properties-of-modular-multiplication"> <h3>Using the properties of modular multiplication</h3> <p>As we've learned above, modular multiplication allows us to just keep the intermediate result <img src="https://eli.thegreenplace.net/images/math/5ed051f99a1984c11a5b2d4ea770f3dc527912d8.gif" /> at each step. Here's the implementation of a simple repeated multiplication algorithm for computing modular exponents this way:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">modexp_mul</span>(a, b, n):
    r = <span style="color: #007f7f">1</span>
    <span style="color: #00007f; font-weight: bold">for</span> i <span style="color: #0000aa">in</span> <span style="color: #00007f">xrange</span>(b):
        r = r * a % n
    <span style="color: #00007f; font-weight: bold">return</span> r
</pre></div> <p>It's much better than the naive algorithm, but as we saw in the previous article it's quite slow, requiring <tt class="docutils literal"><span class="pre">b</span></tt> multiplications (and reductions modulo <tt class="docutils literal"><span class="pre">n</span></tt>).</p> <p>We can apply the same modular reduction rule to the more efficient exponentiation algorithms we've studied <a class="reference external" href="http://eli.thegreenplace.net/2009/03/21/efficient-integer-exponentiation-algorithms/">before</a>.</p> </div> <div class="section" id="modular-exponentiation-by-squaring"> <h3>Modular exponentiation by squaring</h3> <p>Here's the right-to-left method with modular reductions at each step:</p> 
<div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">modexp_rl</span>(a, b, n):
    r = <span style="color: #007f7f">1</span>
    <span style="color: #00007f; font-weight: bold">while</span> <span style="color: #007f7f">1</span>:
        <span style="color: #00007f; font-weight: bold">if</span> b % <span style="color: #007f7f">2</span> == <span style="color: #007f7f">1</span>:
            r = r * a % n
        b /= <span style="color: #007f7f">2</span>
        <span style="color: #00007f; font-weight: bold">if</span> b == <span style="color: #007f7f">0</span>:
            <span style="color: #00007f; font-weight: bold">break</span>
        a = a * a % n
    <span style="color: #00007f; font-weight: bold">return</span> r
</pre></div> <p>We use exactly the same algorithm, but reduce every multiplication <img src="https://eli.thegreenplace.net/images/math/5ed051f99a1984c11a5b2d4ea770f3dc527912d8.gif" />. So the numbers we deal with here are never very large.</p> <p>Similarly, here's the left-to-right method:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">modexp_lr</span>(a, b, n):
    r = <span style="color: #007f7f">1</span>
    <span style="color: #00007f; font-weight: bold">for</span> bit <span style="color: #0000aa">in</span> reversed(_bits_of_n(b)):
        r = r * r % n
        <span style="color: #00007f; font-weight: bold">if</span> bit == <span style="color: #007f7f">1</span>:
            r = r * a % n
    <span style="color: #00007f; font-weight: bold">return</span> r
</pre></div> <p>With <tt class="docutils literal"><span class="pre">_bits_of_n</span></tt> being, as before:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">_bits_of_n</span>(n):
    <span style="color: #7f007f">&quot;&quot;&quot; Return the list of the bits in the binary</span>
<span style="color: #7f007f">        representation of n, from LSB to MSB</span>
<span style="color: #7f007f">    
&quot;&quot;&quot;</span>
    bits = []
    <span style="color: #00007f; font-weight: bold">while</span> n:
        bits.append(n % <span style="color: #007f7f">2</span>)
        n /= <span style="color: #007f7f">2</span>
    <span style="color: #00007f; font-weight: bold">return</span> bits
</pre></div> </div> <div class="section" id="relative-performance"> <h3>Relative performance</h3> <p>As I've noted in the <a class="reference external" href="http://eli.thegreenplace.net/2009/03/21/efficient-integer-exponentiation-algorithms/">previous article</a>, the RL method does a worse job of keeping its multiplicands low than the LR method. And indeed, for smaller <tt class="docutils literal"><span class="pre">n</span></tt>, RL is somewhat faster than LR. For larger <tt class="docutils literal"><span class="pre">n</span></tt>, RL is somewhat slower.</p> <p>What's obvious is that now the built-in <tt class="docutils literal"><span class="pre">pow</span></tt> is superior to both hand-coded methods <a class="footnote-reference" href="#id11" id="id4"></a>. My tests show it's anywhere from twice to 10 times as fast.</p> <p>Why is <tt class="docutils literal"><span class="pre">pow</span></tt> so much faster? Is it only the efficiency of C versus Python? Not really. In fact, <tt class="docutils literal"><span class="pre">pow</span></tt> uses an even more sophisticated algorithm for large exponents <a class="footnote-reference" href="#id12" id="id5"></a>. Indeed, for small exponents the runtime of <tt class="docutils literal"><span class="pre">pow</span></tt> is similar to the runtime of the implementations I presented above.</p> </div> <div class="section" id="the-k-ary-lr-method"> <h3>The k-ary LR method</h3> <p>It turns out that the LR method of repeated squaring can be generalized. 
Instead of breaking the exponent into bits of its base-2 representation, we can break it into larger pieces, and save some computations this way.</p> <p>I'll present the k-ary LR method that breaks the exponent into its &quot;digits&quot; in base <img src="https://eli.thegreenplace.net/images/math/1e028fb602c123d0fe4958d8a84229d6803b289e.gif" /> for some integer <tt class="docutils literal"><span class="pre">k</span></tt>. The exponent can be written as:</p> <p><img src="https://eli.thegreenplace.net/images/math/2b3373f91d2f784798a343b046defcaf0bd22786.gif" /></p> <p>Where <img src="https://eli.thegreenplace.net/images/math/8fab90b047823b97522115f88da94c5d6797de3f.gif" /> are the digits of <tt class="docutils literal"><span class="pre">b</span></tt> in base <tt class="docutils literal"><span class="pre">m</span></tt>. <img src="https://eli.thegreenplace.net/images/math/2d4469bf98c45573ce8673265c3c9bde3520e5d2.gif" /> is then:</p> <p><img src="https://eli.thegreenplace.net/images/math/0550b0df0f672136e39d9d050781d467eff82bd8.gif" /></p> <p>We compute this iteratively as follows <a class="footnote-reference" href="#id13" id="id6"></a>:</p> <p>Raise <img src="https://eli.thegreenplace.net/images/math/fa23ac9ecbf9f0cead492e9227e26757a967c284.gif" /> to the <tt class="docutils literal"><span class="pre">m</span></tt>-th power and multiply by <img src="https://eli.thegreenplace.net/images/math/4df976d630809ae3d013ecb8b764cee38121efc2.gif" />. We get <img src="https://eli.thegreenplace.net/images/math/52b89408d7c7c3d7274df85588a4ce6ee8b1a871.gif" />. Next, raise <img src="https://eli.thegreenplace.net/images/math/83b3fdda5b127e3a4f9bcb7b45d2fa7ef3659493.gif" /> to the <tt class="docutils literal"><span class="pre">m</span></tt>-th power and multiply by <img src="https://eli.thegreenplace.net/images/math/b6ed9962207f950a607b7591bd50e6375efc014b.gif" />, obtaining <img src="https://eli.thegreenplace.net/images/math/ec6ac382c692b3afe0496e8ea45a1a780528db13.gif" />. 
If we continue with this, we'll eventually get <img src="https://eli.thegreenplace.net/images/math/2d4469bf98c45573ce8673265c3c9bde3520e5d2.gif" />.</p> <p>This translates into the following code:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">modexp_lr_k_ary</span>(a, b, n, k=<span style="color: #007f7f">5</span>):
    <span style="color: #7f007f">&quot;&quot;&quot; Compute a ** b (mod n)</span>
<span style="color: #7f007f">        K-ary LR method, with a customizable &#39;k&#39;.</span>
<span style="color: #7f007f">    &quot;&quot;&quot;</span>
    base = <span style="color: #007f7f">2</span> &lt;&lt; (k - <span style="color: #007f7f">1</span>)

    <span style="color: #007f00"># Precompute the table of exponents</span>
    table = [<span style="color: #007f7f">1</span>] * base
    <span style="color: #00007f; font-weight: bold">for</span> i <span style="color: #0000aa">in</span> <span style="color: #00007f">xrange</span>(<span style="color: #007f7f">1</span>, base):
        table[i] = table[i - <span style="color: #007f7f">1</span>] * a % n

    <span style="color: #007f00"># Just like the binary LR method, just with a</span>
    <span style="color: #007f00"># different base</span>
    <span style="color: #007f00">#</span>
    r = <span style="color: #007f7f">1</span>
    <span style="color: #00007f; font-weight: bold">for</span> digit <span style="color: #0000aa">in</span> reversed(_digits_of_n(b, base)):
        <span style="color: #00007f; font-weight: bold">for</span> i <span style="color: #0000aa">in</span> <span style="color: #00007f">xrange</span>(k):
            r = r * r % n
        <span style="color: #00007f; font-weight: bold">if</span> digit:
            r = r * table[digit] % n

    <span style="color: #00007f; font-weight: bold">return</span> r
</pre></div> <p>Note that we save some time by pre-computing the powers of <tt class="docutils literal"><span class="pre">a</span></tt> for exponents that can be digits in base <tt class="docutils literal"><span class="pre">m</span></tt>. 
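</p> <p>As a quick sanity check, here is a self-contained Python 3 re-sketch of the same k-ary idea (the original above targets Python 2, and the helper name below is mine, not the author's), compared against the built-in <tt class="docutils literal"><span class="pre">pow</span></tt>:</p>

```python
def modexp_k_ary(a, b, n, k=5):
    """Compute a ** b (mod n) with the k-ary LR method (Python 3 sketch)."""
    base = 2 << (k - 1)          # base == 2**k
    # Precompute a**0 .. a**(base-1), all reduced mod n.
    table = [1] * base
    for i in range(1, base):
        table[i] = table[i - 1] * a % n
    # Digits of b in base 2**k, LSB first.
    digits = []
    while b:
        digits.append(b % base)
        b //= base
    r = 1
    for digit in reversed(digits):   # process MSB first
        for _ in range(k):
            r = r * r % n            # shift the accumulator left by one digit
        if digit:
            r = r * table[digit] % n
    return r

# Cross-check against the built-in pow on a few inputs.
for a, b, n in [(3, 1000, 7919), (2, 10**6 + 3, 10**9 + 7), (12345, 6789, 97)]:
    assert modexp_k_ary(a, b, n) == pow(a, b, n)
```

<p>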
Also, <tt class="docutils literal"><span class="pre">_digits_of_n</span></tt> is the following generalization of <tt class="docutils literal"><span class="pre">_bits_of_n</span></tt>:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">_digits_of_n</span>(n, b):
    <span style="color: #7f007f">&quot;&quot;&quot; Return the list of the digits in the base &#39;b&#39;</span>
<span style="color: #7f007f">        representation of n, from LSB to MSB</span>
<span style="color: #7f007f">    &quot;&quot;&quot;</span>
    digits = []
    <span style="color: #00007f; font-weight: bold">while</span> n:
        digits.append(n % b)
        n /= b
    <span style="color: #00007f; font-weight: bold">return</span> digits
</pre></div> </div> <div class="section" id="performance-of-the-k-ary-method"> <h3>Performance of the k-ary method</h3> <p>In my tests, the k-ary LR method with <tt class="docutils literal"><span class="pre">k</span> <span class="pre">=</span> <span class="pre">5</span></tt> is about 25% faster than the binary LR method, and is within 20% of the built-in <tt class="docutils literal"><span class="pre">pow</span></tt> function.</p> <p>Experimenting with the value of <tt class="docutils literal"><span class="pre">k</span></tt> affects these results, but 5 seems to be a good value that produces the best performance in most cases. This is probably why it's also used as the value of <tt class="docutils literal"><span class="pre">k</span></tt> in the implementation of <tt class="docutils literal"><span class="pre">pow</span></tt>.</p> </div> <div class="section" id="python-s-built-in-pow"> <h3>Python's built-in <tt class="docutils literal"><span class="pre">pow</span></tt></h3> <p>I've mentioned Python's <tt class="docutils literal"><span class="pre">pow</span></tt> function several times in this article. The Python version I'm talking about is 2.5, though I doubt this functionality has changed in 2.6 or 3.0. 
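</p> <p>To make the comparison concrete, the three-argument form of <tt class="docutils literal"><span class="pre">pow</span></tt> mentioned in the footnotes performs modular exponentiation directly, without ever materializing the full power:</p>

```python
a, b, n = 7, 560, 561
# pow with a modulus never builds the huge intermediate a ** b ...
fast = pow(a, b, n)
# ... but agrees with the naive compute-then-reduce approach.
assert fast == (a ** b) % n
```

<p>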
The <tt class="docutils literal"><span class="pre">pow</span></tt> I'm interested in is implemented in the <tt class="docutils literal"><span class="pre">long_pow</span></tt> function in <tt class="docutils literal"><span class="pre">objects/longobject.c</span></tt> in the Python source code distribution. As mentioned in <a class="footnote-reference" href="#id12" id="id7"></a>, it uses the binary LR method for small exponents, and the k-ary LR method for large exponents.</p> <p>These implementations closely follow algorithms 14.79 and 14.82 in the excellent <em>Handbook of Applied Cryptography</em>, which is freely <a class="reference external" href="http://www.cacr.math.uwaterloo.ca/hac/">available online</a>.</p> </div> <div class="section" id="summary"> <h3>Summary</h3> <p>As we've seen, exponentiation and modular exponentiation are among those applications in which an efficient algorithm is required for feasibility. Using the trivial/naive algorithms is possible only for small cases which aren't very interesting. 
To process realistically large numbers (such as the ones required for cryptographic algorithms), one needs powerful methods in one's toolbox.</p> <div align="center" class="align-center"><img class="align-center" src="https://eli.thegreenplace.net/images/hline.jpg" style="width: 320px; height: 5px;" /></div> <table class="docutils footnote" frame="void" id="id8" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>For instance, <img src="https://eli.thegreenplace.net/images/math/ec4a5eb840bac711d2930bed8d05d6f60f08050b.gif" /> is a 4772-digit number.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id9" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>Modular exponentiation is essential for the <a class="reference external" href="http://en.wikipedia.org/wiki/RSA">RSA algorithm</a>, for example.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id10" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td>To be a bit more rigorous, we start with <img src="https://eli.thegreenplace.net/images/math/a24e63eaa528ead7411690545b6f1525adf6fd11.gif" />. This means that <img src="https://eli.thegreenplace.net/images/math/b4554f9f77d7fc2c17e651e01166c8f1735489e3.gif" />, so also <img src="https://eli.thegreenplace.net/images/math/2486c350f0b60040b3f162ebeead1ff9ebe5f7d0.gif" />. Similarly <img src="https://eli.thegreenplace.net/images/math/4efa4bb9a60158d9497274087d139820d3c827d6.gif" />, so also <img src="https://eli.thegreenplace.net/images/math/84bf619e6f68d1d3bc9dc65522dca7e296a46dc0.gif" />. 
Adding these two we get <img src="https://eli.thegreenplace.net/images/math/a4d655317021fe3417f78df6ac22ee553cc833e7.gif" />, which means that <img src="https://eli.thegreenplace.net/images/math/13b1a1a7a7cab72a42640bc57c99dff9f9fc78dc.gif" />.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id11" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id4"></a></td><td>Using the 3-argument form of <tt class="docutils literal"><span class="pre">pow</span></tt>, you can perform modular exponentiation.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id12" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id5"></a></td><td><tt class="docutils literal"><span class="pre">FIVEARY_CUTOFF</span></tt> in the code of <tt class="docutils literal"><span class="pre">pow</span></tt> is set to 8. This means that for exponents with more than 8 digits, a special 5-ary algorithm is used. 
For smaller exponents, the regular LR binary method is used - just like the one I presented, just coded in C.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id13" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id6"></a></td><td>Note that for <tt class="docutils literal"><span class="pre">m</span> <span class="pre">=</span> <span class="pre">2</span></tt> this is the familiar binary LR method.</td></tr> </tbody> </table> </div> Efficient integer exponentiation algorithms2009-03-21T19:10:57-07:002009-03-21T19:10:57-07:00Eli Benderskytag:eli.thegreenplace.net,2009-03-21:/2009/03/21/efficient-integer-exponentiation-algorithms <p>Did you ever think about the most efficient method to perform integer exponentiation, that is, raising an integer <tt class="docutils literal"><span class="pre">a</span></tt> to an integer power <tt class="docutils literal"><span class="pre">b</span></tt>, when either <tt class="docutils literal"><span class="pre">a</span></tt> or <tt class="docutils literal"><span class="pre">b</span></tt>, or both, are rather large?</p> <div class="section" id="repeated-multiplication"> <h3>Repeated multiplication</h3> <p>The naive method is, of course, repeated multiplications. 
<img src="https://eli.thegreenplace.net/images/math/fde22a2136b496ef6f8dca2c4278792da0e77678.gif" /> is <tt class="docutils literal"><span class="pre">a</span></tt> multiplied by itself <tt class="docutils literal"><span class="pre">b …</span></tt></p></div> <p>Did you ever think about the most efficient method to perform integer exponentiation, that is, raising an integer <tt class="docutils literal"><span class="pre">a</span></tt> to an integer power <tt class="docutils literal"><span class="pre">b</span></tt>, when either <tt class="docutils literal"><span class="pre">a</span></tt> or <tt class="docutils literal"><span class="pre">b</span></tt>, or both, are rather large?</p> <div class="section" id="repeated-multiplication"> <h3>Repeated multiplication</h3> <p>The naive method is, of course, repeated multiplications. <img src="https://eli.thegreenplace.net/images/math/fde22a2136b496ef6f8dca2c4278792da0e77678.gif" /> is <tt class="docutils literal"><span class="pre">a</span></tt> multiplied by itself <tt class="docutils literal"><span class="pre">b</span></tt> times. Here's how it's coded in my pseudo-code of choice, Python:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">expt_mul</span>(a, b): r = <span style="color: #007f7f">1</span> <span style="color: #00007f; font-weight: bold">for</span> i <span style="color: #0000aa">in</span> <span style="color: #00007f">xrange</span>(b): r *= a <span style="color: #00007f; font-weight: bold">return</span> r </pre></div> <p>Is this efficient? Not really, as we require <tt class="docutils literal"><span class="pre">b</span></tt> multiplications, and as I said earlier <tt class="docutils literal"><span class="pre">b</span></tt> can be very large (think number theory algorithms). 
In fact, there's a <em>much</em> more efficient method.</p> </div> <div class="section" id="exponentiation-by-squaring"> <h3>Exponentiation by squaring</h3> <p>The efficient exponentiation algorithm is based on the simple observation that for an even <tt class="docutils literal"><span class="pre">b</span></tt>, <img src="https://eli.thegreenplace.net/images/math/4d308eabc552e0744ecb53ebb55aeb7b5f6705da.gif" />. This may not look very brilliant, but now consider the following recursive definition:</p> <p><img src="https://eli.thegreenplace.net/images/math/ql_88b3da5b51bbcac021cceb33f708a130_l3.png" /></p> <p>The case of odd <tt class="docutils literal"><span class="pre">b</span></tt> is trivial, as it's obvious that <img src="https://eli.thegreenplace.net/images/math/6c6cc601fdd47eecef30907127482f149d2ed366.gif" />. So now we can compute <img src="https://eli.thegreenplace.net/images/math/fde22a2136b496ef6f8dca2c4278792da0e77678.gif" /> by doing only <tt class="docutils literal"><span class="pre">log(b)</span></tt> squarings and no more than <tt class="docutils literal"><span class="pre">log(b)</span></tt> multiplications, instead of <tt class="docutils literal"><span class="pre">b</span></tt> multiplications - and this is a vast improvement for a large <tt class="docutils literal"><span class="pre">b</span></tt>.</p> <p>This algorithm can be coded in a straightforward way:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">expt_rec</span>(a, b): <span style="color: #00007f; font-weight: bold">if</span> b == <span style="color: #007f7f">0</span>: <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #007f7f">1</span> <span style="color: #00007f; font-weight: bold">elif</span> b % <span style="color: #007f7f">2</span> == <span style="color: #007f7f">1</span>: <span style="color: #00007f; font-weight: bold">return</span> a * expt_rec(a, b - <span style="color: 
#007f7f">1</span>) <span style="color: #00007f; font-weight: bold">else</span>: p = expt_rec(a, b / <span style="color: #007f7f">2</span>) <span style="color: #00007f; font-weight: bold">return</span> p * p </pre></div> <p>Indeed, this algorithm is about 10 times faster than the naive one for exponents on the order of a few thousand. When the exponent is about 100K, it is more than 100 times faster, and the difference keeps growing for larger exponents.</p> </div> <div class="section" id="an-iterative-implementation"> <h3>An iterative implementation</h3> <p>It will be useful to develop an iterative implementation for the fast exponentiation algorithm. For this purpose, however, we need to dive into some mathematics.</p> <p>We can represent the exponent <tt class="docutils literal"><span class="pre">b</span></tt> as:</p> <p><img src="https://eli.thegreenplace.net/images/math/ecbde38c735d854eb05d28e2b9b7e4b034c8cb0f.gif" /></p> <p>Where <img src="https://eli.thegreenplace.net/images/math/052ed07ef4a94acc0a6e5e21d68a64e602538236.gif" /> are the bits (0 or 1) of <tt class="docutils literal"><span class="pre">b</span></tt> in base 2. 
<img src="https://eli.thegreenplace.net/images/math/fde22a2136b496ef6f8dca2c4278792da0e77678.gif" /> is then:</p> <p><img src="https://eli.thegreenplace.net/images/math/d9cf6b4f10b1f4b4dce1f62d3411a4bdcdfc6fdb.gif" /></p> <p>Or, in other words:</p> <p><img src="https://eli.thegreenplace.net/images/math/87d95a4f7ba3d34779c01387e7c7b52985e48e36.gif" /> for <tt class="docutils literal"><span class="pre">k</span></tt> such that <img src="https://eli.thegreenplace.net/images/math/de3451bd16070e6cbfe61f85a2f5a48798db4399.gif" /></p> <p><img src="https://eli.thegreenplace.net/images/math/b5558f10d4f57a6c991f5bf4702e2a807b11eb9d.gif" /> can be computed by repetitive squaring, and moreover, we can reuse the result from a lower <tt class="docutils literal"><span class="pre">k</span></tt> to compute a higher <tt class="docutils literal"><span class="pre">k</span></tt>. This directly translates into the following iterative algorithm:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">expt_bin_rl</span>(a, b): r = <span style="color: #007f7f">1</span> <span style="color: #00007f; font-weight: bold">while</span> <span style="color: #007f7f">1</span>: <span style="color: #00007f; font-weight: bold">if</span> b % <span style="color: #007f7f">2</span> == <span style="color: #007f7f">1</span>: r *= a b /= <span style="color: #007f7f">2</span> <span style="color: #00007f; font-weight: bold">if</span> b == <span style="color: #007f7f">0</span>: <span style="color: #00007f; font-weight: bold">break</span> a *= a <span style="color: #00007f; font-weight: bold">return</span> r </pre></div> <p>To understand how the algorithm works, try to relate it to the formula from above. Using a standard &quot;divide by two and look at the LSB&quot; loop, the exponent <tt class="docutils literal"><span class="pre">b</span></tt> is broken into its binary representation. 
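To watch the loop at work, here is an instrumented Python 3 sketch of the same right-to-left algorithm (with <tt class="docutils literal"><span class="pre">//</span></tt> standing in for Python 2's integer <tt class="docutils literal"><span class="pre">/</span></tt>; the trace printing is my addition, purely for illustration):

```python
def expt_bin_rl_traced(a, b):
    # Right-to-left binary exponentiation, printing a, b, r each step.
    r = 1
    while True:
        if b % 2 == 1:
            r *= a       # current bit is 1: fold the current power of a in
        b //= 2
        print(f"a={a}, b={b}, r={r}")
        if b == 0:
            break
        a *= a           # a now holds the next square a^(2^(k+1))
    return r

print(expt_bin_rl_traced(3, 13))  # 3^13 = 1594323
```

Running it on <tt class="docutils literal"><span class="pre">3**13</span></tt> (binary 1101) shows the result being multiplied by a's successive squares exactly at the 1 bits.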
The lowest bits of <tt class="docutils literal"><span class="pre">b</span></tt> are considered first. <tt class="docutils literal"><span class="pre">a</span></tt> is continually squared to hold <img src="https://eli.thegreenplace.net/images/math/b5558f10d4f57a6c991f5bf4702e2a807b11eb9d.gif" />, and is multiplied into the result only when <img src="https://eli.thegreenplace.net/images/math/de3451bd16070e6cbfe61f85a2f5a48798db4399.gif" />.</p> <p>This algorithm is called <em>right-to-left binary exponentiation</em>, because the binary representation of the exponent is computed from right to left (from the LSB to the MSB) <a class="footnote-reference" href="#id4" id="id1"></a>.</p> <p>A related algorithm can be developed if we prefer to look at the binary representation of the exponent from left to right.</p> </div> <div class="section" id="left-to-right-binary-exponentiation"> <h3>Left-to-right binary exponentiation</h3> <p>Going over the bits of <tt class="docutils literal"><span class="pre">b</span></tt> from MSB to LSB, we get:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">expt_bin_lr</span>(a, b): r = <span style="color: #007f7f">1</span> <span style="color: #00007f; font-weight: bold">for</span> bit <span style="color: #0000aa">in</span> reversed(_bits_of_n(b)): r *= r <span style="color: #00007f; font-weight: bold">if</span> bit == <span style="color: #007f7f">1</span>: r *= a <span style="color: #00007f; font-weight: bold">return</span> r </pre></div> <p>Where <tt class="docutils literal"><span class="pre">_bits_of_n</span></tt> is a method returning the binary representation of its argument as an array of bits from LSB to MSB (which is then reversed, as you see):</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">_bits_of_n</span>(n): <span style="color: #7f007f">&quot;&quot;&quot; Return the list of the bits in the 
binary</span> <span style="color: #7f007f"> representation of n, from LSB to MSB</span> <span style="color: #7f007f"> &quot;&quot;&quot;</span> bits = [] <span style="color: #00007f; font-weight: bold">while</span> n: bits.append(n % <span style="color: #007f7f">2</span>) n /= <span style="color: #007f7f">2</span> <span style="color: #00007f; font-weight: bold">return</span> bits </pre></div> <p>Rationale: consider how you &quot;build&quot; a number from its binary representation when seen from MSB to LSB. You begin with 1 for the MSB (which is always 1, by definition, for numbers &gt; 0). For each new bit you see you double the result, and if the bit is 1, you add 1 <a class="footnote-reference" href="#id5" id="id2"></a>.</p> <p>For example consider the binary 1101. Begin with 1 for the leftmost 1. We have another bit, so we double. That's 2. Now, the new bit is 1, so we add 1, that's 3. We have another bit, so again double, that's 6. The new bit is 0, so nothing is added. And we have one more bit, so once again double, getting 12, and finally adding 1, getting 13. Indeed, 1101 is the binary representation of 13.</p> <p>Back to the exponentiation now. As you see in the code of <tt class="docutils literal"><span class="pre">expt_bin_lr</span></tt>, the binary representation of the exponent is read from MSB to LSB. Since this is the exponent, each &quot;doubling&quot; from the rationale above is squaring, and each &quot;adding 1&quot; is multiplying by the number itself. Hence, the algorithm works.</p> </div> <div class="section" id="performance"> <h3>Performance</h3> <p>As I've mentioned, the squaring method of exponentiation is far more efficient than the naive method of repeated multiplication. In the tests I ran, the iterative left-to-right method is about the same speed as the recursive one, while the iterative right-to-left method is somewhat slower. 
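A small self-contained benchmark sketch along those lines (Python 3 ports of the <tt class="docutils literal"><span class="pre">expt_mul</span></tt> and <tt class="docutils literal"><span class="pre">expt_bin_lr</span></tt> functions shown above; absolute timings will of course vary by machine):

```python
import timeit

def _bits_of_n(n):
    # Bits of n, from LSB to MSB.
    bits = []
    while n:
        bits.append(n % 2)
        n //= 2
    return bits

def expt_mul(a, b):
    # Naive repeated multiplication.
    r = 1
    for _ in range(b):
        r *= a
    return r

def expt_bin_lr(a, b):
    # Left-to-right binary exponentiation.
    r = 1
    for bit in reversed(_bits_of_n(b)):
        r *= r
        if bit == 1:
            r *= a
    return r

# Sanity check: both agree with Python's ** operator.
assert expt_mul(3, 4000) == expt_bin_lr(3, 4000) == 3 ** 4000

naive = timeit.timeit(lambda: expt_mul(3, 4000), number=10)
fast = timeit.timeit(lambda: expt_bin_lr(3, 4000), number=10)
print(f"naive: {naive:.4f}s, binary LR: {fast:.4f}s")
```

The gap widens quickly as the exponent grows, since the naive loop does b bignum multiplications against roughly 2·log2(b) for the binary method.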
In fact, both the recursive and the iterative left-to-right methods are so efficient they're completely on par with Python's built-in <tt class="docutils literal"><span class="pre">pow</span></tt> method <a class="footnote-reference" href="#id6" id="id3"></a>.</p> <p>This is surprising, as I'd actually expect the right-to-left method to be faster, because it skips the reversing of bits when computing the binary representation of the exponent. I'd also expect the built-in <tt class="docutils literal"><span class="pre">pow</span></tt> to be faster.</p> <p>However, thinking harder for a moment, I think I can see why this happens. The RL (right-to-left) version has to multiply larger numbers at all stages, because LR sometimes multiplies by <code>a</code> itself, which is relatively small. Python's bignum implementation can multiply by a small number faster, and this compensates for the need to reverse the bit list. I'll come back to this issue when I'll discuss modular exponentiation. But this is a topic for another article...</p> <div align="center" class="align-center"><img class="align-center" src="https://eli.thegreenplace.net/images/hline.jpg" style="width: 320px; height: 5px;" /></div> <table class="docutils footnote" frame="void" id="id4" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id1"></a></td><td>From the looks of it (featuring the binary representation) you'd think this is a modern algorithm. Not at all! According to Knuth, it was first mentioned by the Persian mathematician Jamshīd al-Kāshī in 1427. 
The left-to-right method presented later in the article is even more ancient - it appeared in a Hindu book in about 200 BC.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id5" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id2"></a></td><td>This holds for any base, by the way. You can similarly build a number from its decimal digits by multiplying by 10 for each digit you see and adding the digit, at each step.</td></tr> </tbody> </table> <table class="docutils footnote" frame="void" id="id6" rules="none"> <colgroup><col class="label" /><col /></colgroup> <tbody valign="top"> <tr><td class="label"><a class="fn-backref" href="#id3"></a></td><td><tt class="docutils literal"><span class="pre">pow</span></tt> is coded in C and uses the iterative left-to-right method I described with some optimizations and complicated tricks.</td></tr> </tbody> </table> </div> Computing modular square roots in Python2009-03-07T11:59:08-08:002009-03-07T11:59:08-08:00Eli Benderskytag:eli.thegreenplace.net,2009-03-07:/2009/03/07/computing-modular-square-roots-in-python <p>Consider the congruence of the form:</p> <p> <p><img src="https://eli.thegreenplace.net/images/math/27eafd28fcb458b435c774d120c67c85b4f381c8.gif" class="align-center" /></p> </p> <p><tt class="docutils literal"><span class="pre">n</span></tt> is a <em>quadratic residue (mod p)</em>. What is <tt class="docutils literal"><span class="pre">x</span></tt>? In normal arithmetic, this is equivalent to finding the square root of a number. 
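As a small numeric illustration (my own example, not from the original post): modulo p = 13, n = 10 is a quadratic residue, and a brute-force search finds its two modular square roots:

```python
p, n = 13, 10
# Brute force: collect every x in Z_p with x^2 = n (mod p).
roots = [x for x in range(p) if (x * x) % p == n]
print(roots)  # [6, 7] -- note that 7 = p - 6, as expected
```

Brute force is fine for a toy modulus like 13, but is hopeless for cryptographic-size primes, which is what motivates the algorithm below.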
In modular arithmetic, <tt class="docutils literal"><span class="pre">x</span></tt> is the <em>modular square root</em> of <tt class="docutils literal"><span class="pre">n</span></tt> modulo <tt class="docutils literal"><span class="pre">p</span></tt>.</p> <p>Now, in the general case, this is …</p> <p>Consider the congruence of the form:</p> <p> <p><img src="https://eli.thegreenplace.net/images/math/27eafd28fcb458b435c774d120c67c85b4f381c8.gif" class="align-center" /></p> </p> <p><tt class="docutils literal"><span class="pre">n</span></tt> is a <em>quadratic residue (mod p)</em>. What is <tt class="docutils literal"><span class="pre">x</span></tt>? In normal arithmetic, this is equivalent to finding the square root of a number. In modular arithmetic, <tt class="docutils literal"><span class="pre">x</span></tt> is the <em>modular square root</em> of <tt class="docutils literal"><span class="pre">n</span></tt> modulo <tt class="docutils literal"><span class="pre">p</span></tt>.</p> <p>Now, in the general case, this is a very difficult problem to solve. In fact, it's equivalent to integer factorization, because no efficient algorithm is known to find the modular square root modulo a composite number, and if the modulus is composite it has to be factored first.</p> <p>But when <tt class="docutils literal"><span class="pre">p</span></tt> is prime, an efficient polynomial algorithm exists for computing <tt class="docutils literal"><span class="pre">x</span></tt>. This is the <a class="reference external" href="http://en.wikipedia.org/wiki/Shanks-Tonelli_algorithm">Tonelli-Shanks algorithm.</a></p> <p>Computing modular square roots is probably not one of those things you do daily, but I ran into it while solving a Project Euler problem. So I'm posting the Python implementation of the Tonelli-Shanks algorithm here. 
It is based on the explanation in the paper <em>&quot;Square roots from 1; 24, 51, 10 to Dan Shanks&quot;</em> by <a class="reference external" href="http://www.math.vt.edu/people/brown/doc.html">Ezra Brown</a>, as I found the Wikipedia algorithm hard to follow.</p> <p>The code is tested, and as far as I can tell works correctly and efficiently:</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">modular_sqrt</span>(a, p): <span style="color: #7f007f">&quot;&quot;&quot; Find a square root of &#39;a&#39; modulo p. p</span> <span style="color: #7f007f"> must be an odd prime.</span> <span style="color: #7f007f"> Solve the congruence of the form:</span> <span style="color: #7f007f"> x^2 = a (mod p)</span> <span style="color: #7f007f"> and return x. Note that p - x is also a root.</span> <span style="color: #7f007f"> 0 is returned if no square root exists for</span> <span style="color: #7f007f"> these a and p.</span> <span style="color: #7f007f"> The Tonelli-Shanks algorithm is used (except</span> <span style="color: #7f007f"> for some simple cases in which the solution</span> <span style="color: #7f007f"> is known from an identity). 
This algorithm</span> <span style="color: #7f007f"> runs in polynomial time (unless the</span> <span style="color: #7f007f"> generalized Riemann hypothesis is false).</span> <span style="color: #7f007f"> &quot;&quot;&quot;</span> <span style="color: #007f00"># Simple cases</span> <span style="color: #007f00">#</span> <span style="color: #00007f; font-weight: bold">if</span> legendre_symbol(a, p) != <span style="color: #007f7f">1</span>: <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #007f7f">0</span> <span style="color: #00007f; font-weight: bold">elif</span> a == <span style="color: #007f7f">0</span>: <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #007f7f">0</span> <span style="color: #00007f; font-weight: bold">elif</span> p == <span style="color: #007f7f">2</span>: <span style="color: #00007f; font-weight: bold">return</span> 0 <span style="color: #00007f; font-weight: bold">elif</span> p % <span style="color: #007f7f">4</span> == <span style="color: #007f7f">3</span>: <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">pow</span>(a, (p + <span style="color: #007f7f">1</span>) / <span style="color: #007f7f">4</span>, p) <span style="color: #007f00"># Partition p-1 to s * 2^e for an odd s (i.e.</span> <span style="color: #007f00"># reduce all the powers of 2 from p-1)</span> <span style="color: #007f00">#</span> s = p - <span style="color: #007f7f">1</span> e = <span style="color: #007f7f">0</span> <span style="color: #00007f; font-weight: bold">while</span> s % <span style="color: #007f7f">2</span> == <span style="color: #007f7f">0</span>: s /= <span style="color: #007f7f">2</span> e += <span style="color: #007f7f">1</span> <span style="color: #007f00"># Find some &#39;n&#39; with a legendre symbol n|p = -1.</span> <span style="color: #007f00"># Shouldn&#39;t take long.</span> <span style="color: #007f00">#</span> n = <span style="color: #007f7f">2</span> 
<span style="color: #00007f; font-weight: bold">while</span> legendre_symbol(n, p) != -<span style="color: #007f7f">1</span>: n += <span style="color: #007f7f">1</span> <span style="color: #007f00"># Here be dragons!</span> <span style="color: #007f00"># Read the paper &quot;Square roots from 1; 24, 51,</span> <span style="color: #007f00"># 10 to Dan Shanks&quot; by Ezra Brown for more</span> <span style="color: #007f00"># information</span> <span style="color: #007f00">#</span> <span style="color: #007f00"># x is a guess of the square root that gets better</span> <span style="color: #007f00"># with each iteration.</span> <span style="color: #007f00"># b is the &quot;fudge factor&quot; - by how much we&#39;re off</span> <span style="color: #007f00"># with the guess. The invariant x^2 = ab (mod p)</span> <span style="color: #007f00"># is maintained throughout the loop.</span> <span style="color: #007f00"># g is used for successive powers of n to update</span> <span style="color: #007f00"># both a and b</span> <span style="color: #007f00"># r is the exponent - decreases with each update</span> <span style="color: #007f00">#</span> x = <span style="color: #00007f">pow</span>(a, (s + <span style="color: #007f7f">1</span>) / <span style="color: #007f7f">2</span>, p) b = <span style="color: #00007f">pow</span>(a, s, p) g = <span style="color: #00007f">pow</span>(n, s, p) r = e <span style="color: #00007f; font-weight: bold">while</span> <span style="color: #00007f">True</span>: t = b m = <span style="color: #007f7f">0</span> <span style="color: #00007f; font-weight: bold">for</span> m <span style="color: #0000aa">in</span> <span style="color: #00007f">xrange</span>(r): <span style="color: #00007f; font-weight: bold">if</span> t == <span style="color: #007f7f">1</span>: <span style="color: #00007f; font-weight: bold">break</span> t = <span style="color: #00007f">pow</span>(t, <span style="color: #007f7f">2</span>, p) <span style="color: #00007f; font-weight: 
bold">if</span> m == <span style="color: #007f7f">0</span>: <span style="color: #00007f; font-weight: bold">return</span> x gs = <span style="color: #00007f">pow</span>(g, <span style="color: #007f7f">2</span> ** (r - m - <span style="color: #007f7f">1</span>), p) g = (gs * gs) % p x = (x * gs) % p b = (b * g) % p r = m <span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">legendre_symbol</span>(a, p): <span style="color: #7f007f">&quot;&quot;&quot; Compute the Legendre symbol a|p using</span> <span style="color: #7f007f"> Euler&#39;s criterion. p is a prime, a is</span> <span style="color: #7f007f"> relatively prime to p (if p divides</span> <span style="color: #7f007f"> a, then a|p = 0)</span> <span style="color: #7f007f"> Returns 1 if a has a square root modulo</span> <span style="color: #7f007f"> p, -1 otherwise.</span> <span style="color: #7f007f"> &quot;&quot;&quot;</span> ls = <span style="color: #00007f">pow</span>(a, (p - <span style="color: #007f7f">1</span>) / <span style="color: #007f7f">2</span>, p) <span style="color: #00007f; font-weight: bold">return</span> -<span style="color: #007f7f">1</span> <span style="color: #00007f; font-weight: bold">if</span> ls == p - <span style="color: #007f7f">1</span> <span style="color: #00007f; font-weight: bold">else</span> ls </pre></div> Rabin-Miller primality test implementation2009-02-21T12:19:42-08:002009-02-21T12:19:42-08:00Eli Benderskytag:eli.thegreenplace.net,2009-02-21:/2009/02/21/rabin-miller-primality-test-implementation <p>Here's a fairly efficient Python (2.5) and well-documented implementation of the <a class="reference external" href="http://mathworld.wolfram.com/Rabin-MillerStrongPseudoprimeTest.html">Rabin-Miller primality test</a>, based on section 33.8 in CLR's <em>Introduction to Algorithms</em>. 
Due to Python's built-in arbitrary precision arithmetic, this works for numbers of any size.</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">from</span> <span style="color: #00007f">random</span> <span style="color: #00007f; font-weight: bold">import</span> randint <span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">_bits_of_n</span>(n): <span style="color: #7f007f">&quot;&quot;&quot; Return the list of …</span></pre></div> <p>Here's a fairly efficient Python (2.5) and well-documented implementation of the <a class="reference external" href="http://mathworld.wolfram.com/Rabin-MillerStrongPseudoprimeTest.html">Rabin-Miller primality test</a>, based on section 33.8 in CLR's <em>Introduction to Algorithms</em>. Due to Python's built-in arbitrary precision arithmetic, this works for numbers of any size.</p> <div class="highlight"><pre><span style="color: #00007f; font-weight: bold">from</span> <span style="color: #00007f">random</span> <span style="color: #00007f; font-weight: bold">import</span> randint <span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">_bits_of_n</span>(n): <span style="color: #7f007f">&quot;&quot;&quot; Return the list of the bits in the binary</span> <span style="color: #7f007f"> representation of n, from LSB to MSB</span> <span style="color: #7f007f"> &quot;&quot;&quot;</span> bits = [] <span style="color: #00007f; font-weight: bold">while</span> n: bits.append(n % <span style="color: #007f7f">2</span>) n /= <span style="color: #007f7f">2</span> <span style="color: #00007f; font-weight: bold">return</span> bits <span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">_MR_composite_witness</span>(a, n): <span style="color: #7f007f">&quot;&quot;&quot; Witness functions for the Miller-Rabin</span> <span style="color: #7f007f"> test. 
If &#39;a&#39; can be used to prove that</span> <span style="color: #7f007f"> &#39;n&#39; is composite, return True. If False</span> <span style="color: #7f007f"> is returned, there&#39;s a high (though &lt; 1)</span> <span style="color: #7f007f"> probability that &#39;n&#39; is prime.</span> <span style="color: #7f007f"> &quot;&quot;&quot;</span> rem = <span style="color: #007f7f">1</span> <span style="color: #007f00"># Computes a^(n-1) mod n, using modular</span> <span style="color: #007f00"># exponentiation by repeated squaring.</span> <span style="color: #007f00">#</span> <span style="color: #00007f; font-weight: bold">for</span> b <span style="color: #0000aa">in</span> reversed(_bits_of_n(n - <span style="color: #007f7f">1</span>)): x = rem rem = (rem * rem) % n <span style="color: #00007f; font-weight: bold">if</span> rem == <span style="color: #007f7f">1</span> <span style="color: #0000aa">and</span> x != <span style="color: #007f7f">1</span> <span style="color: #0000aa">and</span> x != n - <span style="color: #007f7f">1</span>: <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">True</span> <span style="color: #00007f; font-weight: bold">if</span> b == <span style="color: #007f7f">1</span>: rem = (rem * a) % n <span style="color: #00007f; font-weight: bold">if</span> rem != <span style="color: #007f7f">1</span>: <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">True</span> <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">False</span> <span style="color: #00007f; font-weight: bold">def</span> <span style="color: #00007f">isprime_MR</span>(n, trials=<span style="color: #007f7f">6</span>): <span style="color: #7f007f">&quot;&quot;&quot; Determine whether n is prime using the</span> <span style="color: #7f007f"> probabilistic Miller-Rabin test. 
Follows</span> <span style="color: #7f007f"> the procedure described in section 33.8</span> <span style="color: #7f007f"> in CLR&#39;s Introduction to Algorithms</span> <span style="color: #7f007f"> trials:</span> <span style="color: #7f007f"> The amount of trials of the test.</span> <span style="color: #7f007f"> A larger amount of trials increases</span> <span style="color: #7f007f"> the chances of a correct answer.</span> <span style="color: #7f007f"> 6 is safe enough for all practical</span> <span style="color: #7f007f"> purposes.</span> <span style="color: #7f007f"> &quot;&quot;&quot;</span> <span style="color: #00007f; font-weight: bold">if</span> n &lt; <span style="color: #007f7f">2</span>: <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">False</span> <span style="color: #00007f; font-weight: bold">for</span> ntrial <span style="color: #0000aa">in</span> <span style="color: #00007f">xrange</span>(trials): <span style="color: #00007f; font-weight: bold">if</span> _MR_composite_witness(randint(<span style="color: #007f7f">1</span>, n - <span style="color: #007f7f">1</span>), n): <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">False</span> <span style="color: #00007f; font-weight: bold">return</span> <span style="color: #00007f">True</span> </pre></div> <p>The function you should call is <tt class="docutils literal"><span class="pre">isprime_MR</span></tt>.</p> <p>Although this test is probabilistic, the chances of it erring are extremely low. 
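For convenience, here is the same test as a self-contained Python 3 sketch (a direct port of the code above, with <tt class="docutils literal"><span class="pre">//</span></tt> and <tt class="docutils literal"><span class="pre">range</span></tt> replacing Python 2's <tt class="docutils literal"><span class="pre">/</span></tt> and <tt class="docutils literal"><span class="pre">xrange</span></tt>), plus a quick usage check:

```python
from random import randint

def _bits_of_n(n):
    # Bits of n, from LSB to MSB.
    bits = []
    while n:
        bits.append(n % 2)
        n //= 2
    return bits

def _MR_composite_witness(a, n):
    # Return True if 'a' witnesses that 'n' is composite.
    rem = 1
    # Compute a^(n-1) mod n left-to-right, watching for a
    # nontrivial square root of 1 along the way.
    for b in reversed(_bits_of_n(n - 1)):
        x = rem
        rem = (rem * rem) % n
        if rem == 1 and x != 1 and x != n - 1:
            return True
        if b == 1:
            rem = (rem * a) % n
    return rem != 1

def isprime_MR(n, trials=6):
    # Probabilistic Miller-Rabin primality test.
    if n < 2:
        return False
    for _ in range(trials):
        if _MR_composite_witness(randint(1, n - 1), n):
            return False
    return True

print(isprime_MR(2 ** 61 - 1))             # a Mersenne prime -> True
print(isprime_MR(2 ** 61 + 1, trials=25))  # divisible by 3 -> False
```

A prime input can never produce a witness, so the test is only one-sided probabilistic: errors can occur only by declaring a composite "prime".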
According to Bruce Schneier in &quot;Applied Cryptography&quot;, the chances of error for a 256-bit number with 6 trials are less than one in <img src="https://eli.thegreenplace.net/images/math/dced696965fcd541e19ed68b16f8b99fd7bdbead.gif" /> - this is <em>very low</em>.</p> <p>Therefore, you should always use this method instead of the naive one (trying to divide by all primes up to <img src="https://eli.thegreenplace.net/images/math/e13517ff4c4fef8f8f59a599e10028d5eebef947.gif" />), because it's much faster.</p> The limit of sin(h)/h, or deriving the sine function2009-01-13T21:45:45-08:002009-01-13T21:45:45-08:00Eli Benderskytag:eli.thegreenplace.net,2009-01-13:/2009/01/13/the-limit-of-sinhh-or-deriving-the-sine-function <strong>Deriving the sine</strong> It is a basic identity of calculus that <img src="https://eli.thegreenplace.net/images/math/fc929aff5b7aa92d35efbe7e60575e937bf49539.gif" />. But how does one prove it? Well, let's use the definition of derivatives (the symbol <img src="https://eli.thegreenplace.net/images/math/27d5482eebd075de44389774fce28c69f45c8a75.gif" /> is used instead of <img src="https://eli.thegreenplace.net/images/math/6d56447973863053dfb94416852d0392187be5b6.gif" /> for readability): <p><img src="https://eli.thegreenplace.net/images/math/d7aeb7a0b3c196328d4bb72c248cc619ed4e29d8.gif" class="align-center" /></p> Using a trigonometric identity and regrouping we'll get: <p><img src="https://eli.thegreenplace.net/images/math/553fe3c71b38857bcd459c7fc9ddd1afcdd0ec59.gif" class="align-center" /></p> Now, if we could only prove that <img src="https://eli.thegreenplace.net/images/math/1565cd10a93e120bd7cb1671995265af8df1a79a.gif" /> and <img src="https://eli.thegreenplace.net/images/math/27e81183a47ce0516512e5b71508d74020b1e423.gif" /> as <img src="https://eli.thegreenplace.net/images/math/2d779337053618dd98d696c55aedf6ea2d4e286b.gif" />, we'd get … <strong>Deriving the sine</strong> It is a basic identity of calculus that <img
src="https://eli.thegreenplace.net/images/math/fc929aff5b7aa92d35efbe7e60575e937bf49539.gif" />. But how does one prove it? Well, let's use the definition of derivatives (the symbol <img src="https://eli.thegreenplace.net/images/math/27d5482eebd075de44389774fce28c69f45c8a75.gif" /> is used instead of <img src="https://eli.thegreenplace.net/images/math/6d56447973863053dfb94416852d0392187be5b6.gif" /> for readability): <p><img src="https://eli.thegreenplace.net/images/math/d7aeb7a0b3c196328d4bb72c248cc619ed4e29d8.gif" class="align-center" /></p> Using a trigonometric identity and regrouping we'll get: <p><img src="https://eli.thegreenplace.net/images/math/553fe3c71b38857bcd459c7fc9ddd1afcdd0ec59.gif" class="align-center" /></p> Now, could we only prove that <img src="https://eli.thegreenplace.net/images/math/1565cd10a93e120bd7cb1671995265af8df1a79a.gif" /> and <img src="https://eli.thegreenplace.net/images/math/27e81183a47ce0516512e5b71508d74020b1e423.gif" /> as <img src="https://eli.thegreenplace.net/images/math/2d779337053618dd98d696c55aedf6ea2d4e286b.gif" />, we'd get the <img src="https://eli.thegreenplace.net/images/math/562597441eed562140c81684902007f6f275c940.gif" /> we want. But how do we prove those? 
<strong>L'Hopital's rule?</strong> At this point some people feel inclined to use the following "proof": <p><img src="https://eli.thegreenplace.net/images/math/189b37a4280947f43728a12a47048e9897058dee.gif" class="align-center" /></p> Using L'Hopital's rule, this is equivalent to: <p><img src="https://eli.thegreenplace.net/images/math/7a4a3dd39fc7be8922bcd31848a6d1e6ae927f3f.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/200d228c3e9c87a61824ce8524a4a714ec306270.gif" class="align-center" /></p> Which indeed goes to 1 as <img src="https://eli.thegreenplace.net/images/math/2d779337053618dd98d696c55aedf6ea2d4e286b.gif" />. There's nothing wrong with using L'Hopital's rule in general, but we can't use it here, because we're creating a circular argument! We can't just assume that <img src="https://eli.thegreenplace.net/images/math/fc929aff5b7aa92d35efbe7e60575e937bf49539.gif" /> (for applying L'Hopital) when we're trying to prove it! We'll have to find another method. <strong>A geometrical proof</strong> Consider this diagram: <p><img src="https://eli.thegreenplace.net/images/2009/01/circle_geometry.png" /></p> For simplicity, this is a unit circle (i.e. the length of QO is 1). QS is perpendicular to OP, and so is RP (which is a tangent). <img src="https://eli.thegreenplace.net/images/math/27d5482eebd075de44389774fce28c69f45c8a75.gif" /> is the angle QOS. From trigonometry, QS equals <img src="https://eli.thegreenplace.net/images/math/1dc91d45f0ae10b205595508b05921083b2842d7.gif" /> (QO is 1, recall) and OS equals <img src="https://eli.thegreenplace.net/images/math/c3668957974dc873ec58d32d59ec82eacff6d12b.gif" />. The triangles QOS and ROP are similar, so the ratio between RP and OP is the same as the ratio between QS and OS, which is <img src="https://eli.thegreenplace.net/images/math/33f736886b99c216de7b572c35a6d83a39a2d1b2.gif" />.
Since OP is 1, RP equals <img src="https://eli.thegreenplace.net/images/math/33f736886b99c216de7b572c35a6d83a39a2d1b2.gif" />. Now let's consider the areas of triangles ROP and QOP, and the "pie" section of the circle defined by Q, O and P. The area of the triangle QOS is: <p><img src="https://eli.thegreenplace.net/images/math/c56c8c2dff1756112c3b48d244c2ed19f668e647.gif" class="align-center" /></p> The area of ROP is similarly <img src="https://eli.thegreenplace.net/images/math/5b691de08b5a69c4ee7c85e70e84e869fab29d0d.gif" />. What is the area of the pie QOP? We'll compute it as follows: <p><img src="https://eli.thegreenplace.net/images/math/c9cdd0869aad07b026ed5f32bd6ed6f7e2d070b2.gif" class="align-center" /></p> The area of a unit circle is <img src="https://eli.thegreenplace.net/images/math/6ac47b6d7372b4087583cfd048d20f4c1571f5cf.gif" />, and its circumference is <img src="https://eli.thegreenplace.net/images/math/0833718ca4569f36e84dbdc7742eaec65e49b150.gif" />, but how do we express the length of arc PQ? <strong>Defining radians</strong> Did you know why the units for angles used in calculus are almost exclusively radians? Because the radian is defined as follows: <blockquote> One radian is the angle subtended at the center of a circle by an arc that is equal in length to the radius of the circle.</blockquote> Here's a diagram (courtesy <a href="http://en.wikipedia.org/wiki/Radian">of Wikipedia</a>): <p><img src="https://eli.thegreenplace.net/images/2009/01/Radian.png" /></p> This is a very convenient definition that allows us to make computations without messing with too many <img src="https://eli.thegreenplace.net/images/math/6ac47b6d7372b4087583cfd048d20f4c1571f5cf.gif" />s. <strong>Back to the proof</strong> So getting back to our arc PQ, it is simply equal to the angle <img src="https://eli.thegreenplace.net/images/math/27d5482eebd075de44389774fce28c69f45c8a75.gif" />, when that one is defined in radians.
That's because by the definition of radian above, if the angle is one radian, the arc length is 1 (since that's the radius of the unit circle). Hence if the angle is <img src="https://eli.thegreenplace.net/images/math/27d5482eebd075de44389774fce28c69f45c8a75.gif" /> radians, the arc length is <img src="https://eli.thegreenplace.net/images/math/27d5482eebd075de44389774fce28c69f45c8a75.gif" /> times 1. So we have: <p><img src="https://eli.thegreenplace.net/images/math/c0d158c96bd0c2fdc07fb01bb1bae83b801a0b4e.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/118785c5117da59954c7046a50f4f5fd0c43b6b4.gif" class="align-center" /></p> Now comes the punchline of the proof. It is obvious that the area of the triangle QOP is always smaller than the area of the pie QOP, which in turn is always smaller than the large triangle ROP. Mathematically: <p><img src="https://eli.thegreenplace.net/images/math/f85352fc830617ce91d58d24c044851d164ffbf2.gif" class="align-center" /></p> Dividing this by <img src="https://eli.thegreenplace.net/images/math/c47071837f6a1a6f645f20fedd6b96907d72b8df.gif" />: <p><img src="https://eli.thegreenplace.net/images/math/2579bf5237d2120671276572ba15c959b3c935cf.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/4dcf73f7082d86caeb2f4a969d3f4e4f4310c5ed.gif" class="align-center" /></p> Now, if we let <img src="https://eli.thegreenplace.net/images/math/2d779337053618dd98d696c55aedf6ea2d4e286b.gif" />, then <img src="https://eli.thegreenplace.net/images/math/d7962bee27672d1b7eb54f7346f597f118ed4ed0.gif" />, and it follows that <img src="https://eli.thegreenplace.net/images/math/1565cd10a93e120bd7cb1671995265af8df1a79a.gif" /> by the squeeze theorem. 
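<p>A quick numerical sanity check (not a proof, of course) shows the ratio creeping up toward 1 from below, exactly as the squeeze suggests:</p>

```python
import math

# Evaluate sin(h)/h for progressively smaller h (in radians).
ratios = [math.sin(h) / h for h in (0.5, 0.1, 0.01, 0.001)]
for h, r in zip((0.5, 0.1, 0.01, 0.001), ratios):
    print(h, r)
# Each ratio is below 1, and they increase toward 1 as h shrinks.
```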
Recall that we also have to prove that <img src="https://eli.thegreenplace.net/images/math/27e81183a47ce0516512e5b71508d74020b1e423.gif" />, but this is a simple step from <img src="https://eli.thegreenplace.net/images/math/1565cd10a93e120bd7cb1671995265af8df1a79a.gif" /> by using the identity: <p><img src="https://eli.thegreenplace.net/images/math/69fbcbe8f29f8a0ca0d00c3d38fe1e986ae6df69.gif" class="align-center" /></p> Now, if we go back to the limit we've developed for deriving the sine: <p><img src="https://eli.thegreenplace.net/images/math/553fe3c71b38857bcd459c7fc9ddd1afcdd0ec59.gif" class="align-center" /></p> Substituting the limits <img src="https://eli.thegreenplace.net/images/math/1565cd10a93e120bd7cb1671995265af8df1a79a.gif" /> and <img src="https://eli.thegreenplace.net/images/math/27e81183a47ce0516512e5b71508d74020b1e423.gif" /> here we get: <p><img src="https://eli.thegreenplace.net/images/math/fdfe2dc25c14b6822a443bfff309274d1d3bb3f5.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/edede5a60922621b26fcaaadbdf0d29ee2b4d7ff.gif" class="align-center" /></p> Variance of the sum of independent random variables2009-01-07T22:06:58-08:002009-01-07T22:06:58-08:00Eli Benderskytag:eli.thegreenplace.net,2009-01-07:/2009/01/07/variance-of-the-sum-of-independent-variables <p> Yesterday I was trying to brush up my skills in probability and came upon this formula on the Wikipedia page <a href="http://en.wikipedia.org/wiki/Variance">about variance</a>: </p> <p><img src="https://eli.thegreenplace.net/images/math/5a8707440af01e8319f02e80f8ea33d4600a4a4b.gif" class="align-center" /></p> <p> The article calls this the <em>Bienaymé formula</em> and gives neither proof nor a link to one. Googling this formula proved equally fruitless in terms of proofs. 
</p> <p> So, I …</p> <p> Yesterday I was trying to brush up my skills in probability and came upon this formula on the Wikipedia page <a href="http://en.wikipedia.org/wiki/Variance">about variance</a>: </p> <p><img src="https://eli.thegreenplace.net/images/math/5a8707440af01e8319f02e80f8ea33d4600a4a4b.gif" class="align-center" /></p> <p> The article calls this the <em>Bienaymé formula</em> and gives neither proof nor a link to one. Googling this formula proved equally fruitless in terms of proofs. </p> <p> So, I set out to find why this works. It took me a few hours of digging through books and removing dust from my University-learned probability skills of 8 years ago, but finally I've made it. Here's how. </p> <p> <em>Note: the Wikipedia article states the Bienaymé formula for uncorrelated variables. Here I'll prove the case of independent variables, which is a more useful and frequently used application of the formula. I'm also proving it for discrete random variables - the continuous case is equivalent.</em> </p> <h2>Expected value and variance</h2> <p> We'll start with a few definitions. Formally, the expected value of a (discrete) random variable X is defined by: </p> <p><img src="https://eli.thegreenplace.net/images/math/6e3bd6378c646ec0f285a69df7db72194f308f5b.gif" class="align-center" /></p> Where <img src="https://eli.thegreenplace.net/images/math/810bdf91cc65f953d130a2f239cee691fa024330.gif" /> is the <a href="http://en.wikipedia.org/wiki/Probability_mass_function">PMF</a> of X, <img src="https://eli.thegreenplace.net/images/math/30eaa4945a586b21346e14bd193b9914db6c2166.gif" />. 
For a function <img src="https://eli.thegreenplace.net/images/math/65405422ff71ebf2db437dbd89a41355f4f19183.gif" />: <p><img src="https://eli.thegreenplace.net/images/math/28e5d4f4c0d4dc5023e96687ce05e8851a8f8329.gif" class="align-center" /></p> <p> The variance of X is defined in terms of the expected value as: </p> <p><img src="https://eli.thegreenplace.net/images/math/fe590d0bcb58c4be73d18e751200721bbc402dc0.gif" class="align-center" /></p> <p> From this we can also obtain: </p> <p><img src="https://eli.thegreenplace.net/images/math/dfcad71ca591b4179d442332299b6a5d5963e628.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/03ccc00bf5863f84fa1c081dc26c4b450ee7afcc.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/c7e98eaab83b497e5716ffa78dcd80baff9d8c59.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/a10dd400a0c745f0f369d8919994ae0652b21024.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/8a43a908e597eeb8f279a521c4d833f3ac88f7f9.gif" class="align-center" /></p> Which is more convenient to use in some calculations. 
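<p>That convenience is easy to check numerically. Here is a small sketch comparing the definitional form of the variance against the E[X&sup2;] - (E[X])&sup2; shortcut, using a made-up discrete PMF:</p>

```python
# A made-up discrete distribution (any valid PMF works here).
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

mean = sum(x * p for x, p in pmf.items())         # E[X]
mean_sq = sum(x * x * p for x, p in pmf.items())  # E[X^2]

# Variance straight from the definition E[(X - E[X])^2] ...
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())
# ... and via the shortcut E[X^2] - (E[X])^2.
var_shortcut = mean_sq - mean ** 2

print(var_def, var_shortcut)  # the two agree (up to float rounding)
```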
<h2>Linear function of a random variable</h2> <p> From the definitions given above it can be easily shown that given a linear function of a random variable: <img src="https://eli.thegreenplace.net/images/math/5abba821a5142e83482cea2117ec22b289d4a3d6.gif" />, the expected value and variance of Y are: </p> <p><img src="https://eli.thegreenplace.net/images/math/d8e69b52ca0d6dcb06ba7b8975266518278145e3.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/c420f86130e26c754241fd1f56cbe14cd86d1358.gif" class="align-center" /></p> <p> For the expected value, we can make a stronger claim for any g(x): </p> <p><img src="https://eli.thegreenplace.net/images/math/e4cdac3ba0c020100a9b98b82f3e0ac6b74b2b78.gif" class="align-center" /></p> <h2>Multiple random variables</h2> <p> When multiple random variables are involved, things start getting a bit more complicated. I'll focus on two random variables here, but this is easily extensible to N variables. Given two random variables that participate in an experiment, their joint PMF is: </p> <p><img src="https://eli.thegreenplace.net/images/math/955c39ca88717c6449629b633a7589a910c9555f.gif" class="align-center" /></p> <p> The joint PMF determines the probability of any event that can be specified in terms of the random variables X and Y. 
For example if A is the set of all pairs <img src="https://eli.thegreenplace.net/images/math/f09b5d4028feab230a8f9a4499e21a0b4db3ccce.gif" /> that have a certain property, then: </p> <p><img src="https://eli.thegreenplace.net/images/math/6b1ea5b82b9f9b2609973a2c0959ea2a44c80fd8.gif" class="align-center" /></p> <p> Note that from this PMF we can infer the PMF for a single variable, like this: </p> <p><img src="https://eli.thegreenplace.net/images/math/30eaa4945a586b21346e14bd193b9914db6c2166.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/60de9cadbeef9f0d1931ba799bc7790960ca3c61.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/fec2fcd06d4ac717101f02690278a8daa3f2e068.gif" class="align-center" /></p> <p> The expected value for functions of two variables naturally extends and takes the form: </p> <p><img src="https://eli.thegreenplace.net/images/math/0d00c9b76a37f91f0a2a1efe986e545ac7c90639.gif" class="align-center" /></p> <h2>Sum of random variables</h2> <p> Let's see how the sum of random variables behaves. From the previous formula: </p> <p><img src="https://eli.thegreenplace.net/images/math/a00cd8185a75e2256e4ec8a11d9c6fe7fd776d06.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/30e7d900e6261f0d926e1e0b34a7f8bf1d6bd9e8.gif" class="align-center" /></p> <p> But recall equation (1). The above simply equals: </p> <p><img src="https://eli.thegreenplace.net/images/math/ecece91067e8b6d6ae4250b7b59f92fc367cf433.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/3b2b53ab4299f6a63867ed4b5645018abee982e5.gif" class="align-center" /></p> <p> We'll also want to prove that <img src="https://eli.thegreenplace.net/images/math/08167b14267064495a086afab04d97c83751184c.gif" />.
This is only true for independent X and Y, so we'll have to make this assumption (assuming that they're independent means that <img src="https://eli.thegreenplace.net/images/math/9a31b8e023593b070b7bfdcfca07f2707d47265a.gif" />). </p> <p><img src="https://eli.thegreenplace.net/images/math/f7fec121582e25c06c1119e689ba6828d20883b2.gif" class="align-center" /></p> <p> By independence: </p> <p><img src="https://eli.thegreenplace.net/images/math/762a9e3e2ff39253bec8efb60e0eb17f8110af99.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/ee6ebbb2633e33514285d312a999a17354e6521c.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/0b6f7ad819416477a9e8a948c7a8c9d84be25c12.gif" class="align-center" /></p> <p> A very similar proof can show that for independent X and Y: </p> <p><img src="https://eli.thegreenplace.net/images/math/951d25c3af4d29c3cb9c1e69c4d546f2e7ac7521.gif" class="align-center" /></p> <p> For any functions g and h (because if X and Y are independent, so are g(X) and h(Y)). Now, at last, we're ready to tackle the variance of X + Y.
We start by expanding the definition of variance: </p> <p><img src="https://eli.thegreenplace.net/images/math/10b9e3c7ad7f5621b21440e0113ec31f11b4884d.gif" class="align-center" /></p> By (2): <p><img src="https://eli.thegreenplace.net/images/math/b9fa5c9ce518390c4e2fb4d672c29bcbfb2e91b7.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/74f08dfb9aa6322473e84f1afc42e1d9e53fac1f.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/45c1416ce8b6179109d77f80ac353251e525de2c.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/65ddaa4af89d23925dae7e3cbe4d7526df04ad25.gif" class="align-center" /></p> <p> Now, note that the random variables <img src="https://eli.thegreenplace.net/images/math/c5f6fd358d4c1cbb485216af24b715050ff8a121.gif" /> and <img src="https://eli.thegreenplace.net/images/math/3da3e48f75d2e03d9795c7617016701d1cd28b2c.gif" /> are independent, so: </p> <p><img src="https://eli.thegreenplace.net/images/math/29c181dd186163c688dcfec42ce8fb08e366cbfa.gif" class="align-center" /></p> But using (2) again: <p><img src="https://eli.thegreenplace.net/images/math/918d718f3654bfe32037276651f2f9e1c2472585.gif" class="align-center" /></p> <img src="https://eli.thegreenplace.net/images/math/ec36fba3bda13db973cd60f1910c7955f10757c2.gif" /> is obviously just <img src="https://eli.thegreenplace.net/images/math/769c095daa5985533efb5176c86007611e6f4eb5.gif" />, therefore the above reduces to 0. 
<p> So, coming back to the long expression for the variance of sums, the last term is 0, and we have: </p> <p><img src="https://eli.thegreenplace.net/images/math/97b8d284ae899aa4f6568479f30fb6c40cf7f13d.gif" class="align-center" /></p> <p><img src="https://eli.thegreenplace.net/images/math/2172289164dba259d1e5c274e2611e15862edecb.gif" class="align-center" /></p> <p> As I've mentioned before, proving this for the sum of two variables suffices, because the proof for N variables is a simple mathematical extension, and can be intuitively understood by means of a "mental induction". Therefore: </p> <p><img src="https://eli.thegreenplace.net/images/math/5a8707440af01e8319f02e80f8ea33d4600a4a4b.gif" class="align-center" /></p> <p> For N independent variables <img src="https://eli.thegreenplace.net/images/math/97fd495350d680b99411eaf425194e5b295465a6.gif" />. <img src="https://eli.thegreenplace.net/images/math/7b47d4175993a732aa2287de666a82273110f26e.gif" /> </p> Solution to the RC circuit puzzle2008-12-26T11:21:45-08:002008-12-26T11:21:45-08:00Eli Benderskytag:eli.thegreenplace.net,2008-12-26:/2008/12/26/solution-to-the-rc-circuit-puzzle Here, as promised, is the solution to the <a href="http://eli.thegreenplace.net/2008/12/22/an-rc-circuit-puzzle">RC circuit puzzle</a> I posted earlier this week. Let's look at the circuit again: <p><img src="https://eli.thegreenplace.net/images/2008/12/cap_resistor.png" /></p> The problem with my reasoning was the direction of current in the capacitor. I've quietly assumed that: <p><img src="https://eli.thegreenplace.net/images/math/bbdc708e5df1d3ca3312149e15dbecc98b8fea5a.gif" class="align-center" /></p> But this is wrong for the circuit above. Why? Because we … Here, as promised, is the solution to the <a href="http://eli.thegreenplace.net/2008/12/22/an-rc-circuit-puzzle">RC circuit puzzle</a> I posted earlier this week. 
Let's look at the circuit again: <p><img src="https://eli.thegreenplace.net/images/2008/12/cap_resistor.png" /></p> The problem with my reasoning was the direction of current in the capacitor. I've quietly assumed that: <p><img src="https://eli.thegreenplace.net/images/math/bbdc708e5df1d3ca3312149e15dbecc98b8fea5a.gif" class="align-center" /></p> But this is wrong for the circuit above. Why? Because we must obey the voltage & current directions we've chosen. In passive elements, the positive current flows from the higher voltage to the lower voltage, meaning that in our circuit: <p><img src="https://eli.thegreenplace.net/images/math/31dd3853efeb2e16318323bc12736da4de1277fa.gif" class="align-center" /></p> This small minus sign makes all the difference, and now the solution will be correct. Physically, the intuition is that the current here flows from a discharging capacitor, hence it's "against" the voltage direction. Had it been a capacitor-charging circuit, there would be no confusion. An RC circuit puzzle2008-12-22T22:16:05-08:002008-12-22T22:16:05-08:00Eli Benderskytag:eli.thegreenplace.net,2008-12-22:/2008/12/22/an-rc-circuit-puzzle If you're interested in electronics, you'll find the following simple "paradox" amusing. It's the usual case of "proving that 2+2=5". The fun is finding where the mistake in the reasoning is. Consider the following circuit: <p><img src="https://eli.thegreenplace.net/images/2008/12/cap_resistor.png" /></p> Assume that the capacitor is charged to some initial voltage before the switch … If you're interested in electronics, you'll find the following simple "paradox" amusing. It's the usual case of "proving that 2+2=5". The fun is finding where the mistake in the reasoning is. Consider the following circuit: <p><img src="https://eli.thegreenplace.net/images/2008/12/cap_resistor.png" /></p> Assume that the capacitor is charged to some initial voltage before the switch is closed. At time 0, the switch is closed. 
What is the current in the circuit as a function of time? Let's solve it using the familiar RC circuit methods. We know that <img src="https://eli.thegreenplace.net/images/math/5e23343bb687c00a0eb8ce9ef60e95b356568127.gif" /> because of Kirchhoff's voltage law. We'll differentiate both sides by time: $\dot{V}_{c}(t) = \dot{V}_{R}(t)$ We know that for a capacitor, the relation between current and voltage is: <p><img src="https://eli.thegreenplace.net/images/math/928acbef4f8f2eeb39d3c51ca68ab3e08279393f.gif" class="align-center" /></p> Substituting it into the equation above and also recalling that <img src="https://eli.thegreenplace.net/images/math/6809dfc8324feb51c746bc469c8bd7dbbe3ea32e.gif" />, we get: <p><img src="https://eli.thegreenplace.net/images/math/73fa9296a1e93050c9ba41b7bd8d5ddeaa1d84a6.gif" class="align-center" /></p> But the current through the capacitor and resistor is the same current, so this can be rewritten simply as: <p><img src="https://eli.thegreenplace.net/images/math/eba020de23be652ba084b91a27aa173aade8a360.gif" class="align-center" /></p> This is a simple first order differential equation, the solution of which is: <p><img src="https://eli.thegreenplace.net/images/math/46c7da1f88d1806982454f784d37742fbfa0c332.gif" class="align-center" /></p> For some initial current <img src="https://eli.thegreenplace.net/images/math/7dd1d81670e79a2861ab8214c079d2f03ee310a0.gif" />. But wait a second, how can the exponent be positive, won't it grow to infinity with time? There's obviously a mistake here, somewhere. Can you find it? This problem gave me some headache last night, and today I've successfully stumped a few co-workers with it. I'll post a solution in a couple of days.
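<p>As the solution post above explains, the correct sign gives a decaying exponential. A tiny forward-Euler simulation of the discharging capacitor (component values are arbitrary examples, not from the post) shows the decay rather than the impossible blow-up:</p>

```python
# Forward-Euler simulation of the discharging RC circuit.
# Arbitrary example values: R = 1 kOhm, C = 1 uF, so the
# time constant is RC = 1 ms.
R, C = 1000.0, 1e-6
v, dt = 5.0, 1e-6          # initial capacitor voltage 5 V, 1 us time steps
for _ in range(1000):      # simulate 1 ms = one time constant
    i = v / R              # current through the resistor
    v -= (i / C) * dt      # dv/dt = -i/C: the capacitor discharges
print(v)                   # roughly V0/e, i.e. about 1.84 V - a decay
```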
Posting mathematical formulae in a Wordpress blog2008-12-02T22:52:26-08:002008-12-02T22:52:26-08:00Eli Benderskytag:eli.thegreenplace.net,2008-12-02:/2008/12/02/posting-mathematical-formulae-in-a-wordpress-blog <p> <strong>(Update 11.07.2009: <a href="http://eli.thegreenplace.net/2009/07/11/posting-correctly-aligned-latex-formulae-in-a-wordpress-blog/">I've switched to another plugin</a>, but the rest of this post is still relevant)</strong> </p><p> When your blog often deals with technical matters, and especially math, it is very useful to be able to post complex mathematical formulae / equations. There's only so far that you can go …</p> <p> <strong>(Update 11.07.2009: <a href="http://eli.thegreenplace.net/2009/07/11/posting-correctly-aligned-latex-formulae-in-a-wordpress-blog/">I've switched to another plugin</a>, but the rest of this post is still relevant)</strong> </p><p> When your blog often deals with technical matters, and especially math, it is very useful to be able to post complex mathematical formulae / equations. There's only so far that you can go with "ASCII-equations" like a^2 + b^2 = c^2. Being able to write <img src="https://eli.thegreenplace.net/images/math/e215b13b5947314d303dde025db05c50eabfb9e8.gif" /> is so much nicer... </p><p> Several plugins exist for this in the world of WP. In the "simple" spectrum you can find an interface to <a href="http://www.xm1math.net/phpmathpublisher/">PhpMathPublisher</a>. But when it comes to mathematical equations, you can hardly compete with Latex, and while using it is more complex, this is the best path to take if you don't want to quickly run into limitations. Recall that Latex is used by 90% of academics to publish their papers packed with mathematical equations. The Latex syntax is widely accepted and quite standard among many implementations. </p><p> Enter the <a href="http://wordpress.org/extend/plugins/latex/installation/">WP latex plugin</a>. 
Just install it following the instructions and it will use the web service provided by <a href="http://wordpress.com">Wordpress.com</a> to render inline Latex equations into images for you. The images are stored in a local cache which means that once your post has been generated and viewed, the image is static and safe on your server (you can, of course, cancel this feature if you want to). </p><p> The syntax of the plugin is very simple and you can use both inline equations like <img src="https://eli.thegreenplace.net/images/math/b9dd84aaa5b3a778d39ea7b95f32fdeed4510389.gif" />, or larger equations centered and on a separate line: </p><p> <p><img src="https://eli.thegreenplace.net/images/math/5062ee491e280f17b99d2597642c9da6c40ed30c.gif" class="align-center" /></p> [This is, by the way, Gauss's <a href="http://en.wikipedia.org/wiki/Divergence_theorem">Divergence Theorem</a> which, I recall, was very useful in Calculus II] </p><p> If <a href="http://wordpress.com">Wordpress.com</a> ever ceases providing the Latex rendering service you can always switch to another - there are plenty. This is the real power of the Latex standard - many renderers will understand the same syntax. </p><p> If this isn't hard core enough for you, you can always install your own Latex service. <a href="http://www.forkosh.com/mathtex.html">mathtex</a> is a CGI script you can install on your server. It will communicate with a locally installed Latex program and an image renderer to generate images for you. The problem is - it's not very simple to install Latex on a shared hosting account. It's possible though, and many people have <a href="http://mcuprogramming.com/2007/03/30/installing-latex-on-web-hosts/">done it</a>. So if you don't feel "safe" enough using a remote web service for rendering equations, you can always spend some extra effort and roll your own. The WP latex plugin makes it easy to switch services. </p> Book review: "A Certain Ambiguity: A mathematical novel" by G.
Suri and H. Bal2008-11-14T19:33:59-08:002008-11-14T19:33:59-08:00Eli Benderskytag:eli.thegreenplace.net,2008-11-14:/2008/11/14/book-review-a-certain-ambiguity-a-mathematical-novel-by-g-suri-and-h-bal <p> Wrapped in a thin plot, the authors set out to reconcile mathematics and religious faith in this short 280-page book. And quite surprisingly, they do a far better job than one would expect. <p> <p> Seriously, any book that tries to dig this deep philosophically is an immediate suspect for some half-baked crappy …</p></p></p> <p> Wrapped in a thin plot, the authors set out to reconcile mathematics and religious faith in this short 280-page book. And quite surprisingly, they do a far better job than one would expect. <p> <p> Seriously, any book that tries to dig this deep philosophically is an immediate suspect for some half-baked crappy ending, but "A Certain Ambiguity" manages to end by actually leaving the reader thoughtful. At this, the authors have done a splendid job. </p> <p> The main character is Ravi, an Indian student at Stanford who enrolls in a math class named "Thinking about infinity". Together with the class's lecturer and a small group of friends, he embarks on a quasi-philosophical journey, guided by court records of his grandfather's discussions with a judge in the early 20s. </p> <p> The book contains a lot of interesting math, and while most of it is on a basic level, the philosophical connections are well developed and very believable. The book could easily be a work of non-fiction, as its main theme is quite real and deals with epistemological questions real philosophers have struggled with throughout the centuries. It is unlikely to change your view of life, but it will induce some interesting thinking on important topics. </p> <p> [Spoiler] I was surprised to find out that this book does a good job of explaining faith to people with a rational/mathematical view of life.
However, it only rationalizes the core faith - judge Taylor's "creation axiom", which really can't be disproven. But, as judge Taylor tells Vijay, his deductive method is solid, and only his axioms are in question. The faith judge Taylor rationalizes as an axiom cannot, in any way, connect to the modern monotheistic religions (not to mention the polytheistic ones), because it breaks down immediately as soon as the first deductions are made from it about actual human lives. Yes, that "everything must be created by something" is an axiom that has no refutation at the moment, but any attempt to prove from it that Jesus was born to a virgin and walked on water would have to transcend deductive methods. </p> <p> All in all, this book is really recommended. It actually made me think hard about the philosophical implications of basic math axioms, and encouraged me to read more on the subject. I couldn't possibly ask more of such a small book that can be easily finished in 2-3 sittings. </p> Intersection of 1D segments2008-08-15T11:22:21-07:002008-08-15T11:22:21-07:00Eli Benderskytag:eli.thegreenplace.net,2008-08-15:/2008/08/15/intersection-of-1d-segments <p>There is a simple mathematical problem that sometimes comes up in programming<sup class="footnote"><a href="#fn1" title="I ran into it while implementing a binary application format reader, that needed to support insertion data records. Each data record has a start and an end (memory address). The problem comes up when testing whether two records collide.">1</a></sup>. The problem is:</p> <blockquote> <p>Given two one-dimensional<sup class="footnote"><a href="#fn2" title="One-dimensional here means that they only have a single coordinate, i.e. all can be laid down on a line that's parallel to one of the axes.">2</a></sup> line segments, determine whether they intersect, i.e. have points in common.</p> </blockquote> <p>Here's a graphical representation of the problem. The two segments are drawn one above the other for demonstration …</p>
The two segments are drawn one above the other for demonstration purposes:</p> <img src="https://eli.thegreenplace.net/images/2008/08/twosegs.PNG" /> <p>At first sight, this looks like a problem with many annoying corner cases that takes a lot of dirty code to implement. But it turns out that the solution is actually very simple and clean. The two segments intersect if and only if <em>X2 >= Y1 and Y2 >= X1</em>. That's it.</p> <p>It may be difficult to convince yourself this works by simply looking at the image above, so here is another that makes it much clearer:</p> <img src="https://eli.thegreenplace.net/images/2008/08/manysegs.png" /> <p>In this image we see all the possibilities of the positions of the second segment relative to the first.
It should take only a few seconds to verify that the algorithm returns a correct result for all 5 cases.</p> <p>Here's Python code that implements this solution:</p> <pre lang="python">
def segments_intersect(x1, x2, y1, y2):
    # Assumes x1 <= x2 and y1 <= y2; if this assumption is not safe, the code
    # can be changed to have x1 being min(x1, x2) and x2 being max(x1, x2),
    # and similarly for the ys.
    return x2 >= y1 and y2 >= x1
</pre> <center><img src="https://eli.thegreenplace.net/images/hline.jpg" width="320" height="5" /></center> <p class="footnote" id="fn1"><sup>1</sup> I ran into it while implementing a binary application format reader that needed to support insertion of data records. Each data record has a start and an end (memory address). The problem comes up when testing whether two records collide.</p> <p class="footnote" id="fn2"><sup>2</sup> One-dimensional here means that they only have a single coordinate, i.e. all can be laid down on a line that's parallel to one of the axes.</p> Pythagorean - the theorem with most proofs?2008-01-17T20:59:33-08:002008-01-17T20:59:33-08:00Eli Benderskytag:eli.thegreenplace.net,2008-01-17:/2008/01/17/pythagorean-the-theorem-with-most-proofs <a href="http://www.cut-the-knot.org/pythagoras/index.shtml">This page</a> shows 76 different proofs of the Pythagorean theorem. However, if this isn't hardcore enough, you may want to read "The Pythagorean Proposition" by E. S. Loomis, which lists 367 proofs. <a href="http://www.cut-the-knot.org/pythagoras/index.shtml">This page</a> shows 76 different proofs of the Pythagorean theorem. However, if this isn't hardcore enough, you may want to read "The Pythagorean Proposition" by E. S. Loomis, which lists 367 proofs. 
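As a quick sanity check of the segment-intersection test from the post above, here is a short sketch that exercises it on all five relative positions of the second segment (the specific coordinates are made up for illustration):

```python
def segments_intersect(x1, x2, y1, y2):
    # Two 1D segments [x1, x2] and [y1, y2] (with x1 <= x2, y1 <= y2)
    # intersect iff x2 >= y1 and y2 >= x1.
    return x2 >= y1 and y2 >= x1

# First segment fixed at [3, 7]; second segment placed in each of the
# five possible relative positions from the second image.
cases = [
    ((0, 2),  False),  # entirely to the left
    ((1, 4),  True),   # overlapping the left end
    ((4, 6),  True),   # contained inside
    ((6, 9),  True),   # overlapping the right end
    ((8, 10), False),  # entirely to the right
]

for (y1, y2), expected in cases:
    assert segments_intersect(3, 7, y1, y2) == expected
```

Note that a shared endpoint counts as an intersection here, since the comparisons are non-strict.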
Solution of the Two Envelopes paradox2007-04-08T18:37:19-07:002007-04-08T18:37:19-07:00Eli Benderskytag:eli.thegreenplace.net,2007-04-08:/2007/04/08/solution-of-the-two-envelopes-paradox <p> A long time ago I wrote about the <a href="http://eli.thegreenplace.net/2003/10/24/a-probability-paradox/">Two Envelopes</a> paradox. </p><p> Once you understand the solution, it's hard to see why the paradox is so controversial and so widely misunderstood. As Dominus beautifully explains <a href="http://blog.plover.com/math/envelope.html">here</a>, the solution is: </p><p> There is a fundamental mistake in the reasoning of "50% chance of …</p> <p> A long time ago I wrote about the <a href="http://eli.thegreenplace.net/2003/10/24/a-probability-paradox/">Two Envelopes</a> paradox. </p><p> Once you understand the solution, it's hard to see why the paradox is so controversial and so widely misunderstood. As Dominus beautifully explains <a href="http://blog.plover.com/math/envelope.html">here</a>, the solution is: </p><p> There is a fundamental mistake in the reasoning of "50% chance of the sum in the other envelope being larger". This statement is based on the assumption that the sums are chosen uniformly at random from -inf to +inf (think about it: otherwise, how can we say that there is exactly a 50% chance of *any* number we see in one envelope being the smaller amount). However, <strong>there is no uniform random distribution from -inf to +inf</strong>. That is because in a <a href="http://en.wikipedia.org/wiki/Uniform_distribution_%28continuous%29">uniform distribution</a>, the probability density function is constant, and an integral from -inf to +inf over a constant doesn't converge. That's all there is to it. Simple. </p><p> So back to the original question. When you open one of the envelopes and find some sum of money in there, does it pay you to switch? It doesn't - because you don't know what algorithm / distribution is used to pick the numbers. 
The switching argument doesn't work because it is based on a fallacious assumption of a uniform distribution. </p> Sum of digits and divisibility by 32006-07-11T17:50:10-07:002006-07-11T17:50:10-07:00Eli Benderskytag:eli.thegreenplace.net,2006-07-11:/2006/07/11/sum-of-digits-and-divisibility-by-3 It's a known math curiosity that when we take a decimal (base 10) number and add its digits together, if the sum is divisible by 3 without a remainder, then the number itself is also divisible by 3 without a remainder. Example: <p> <pre><tt>
426 -> 4 + 2 + 6 = 12
12 (mod 3) = 0 …</tt></pre></p> It's a known math curiosity that when we take a decimal (base 10) number and add its digits together, if the sum is divisible by 3 without a remainder, then the number itself is also divisible by 3 without a remainder. Example: <p> <pre><tt>
426 -> 4 + 2 + 6 = 12
12 (mod 3) = 0     (12 / 3 = 4)
426 (mod 3) = 0    (426 / 3 = 142)
</tt></pre> <p> This fact is actually quite simple to prove. Consider the breakdown of 426 into multiples of powers of 10: <p> <pre><tt>426 = 4 * 100 + 2 * 10 + 6</tt></pre> <p> Written another way: <pre><tt>
426 = 4 * (99 + 1) + 2 * (9 + 1) + 6
    = (4 * 99 + 2 * 9) + (4 + 2 + 6)
</tt></pre> Since (4 * 99 + 2 * 9) is divisible by 3 (both 99 and 9 are), this clearly shows that for 426 to be divisible by 3, (4 + 2 + 6) must be divisible by 3. If this isn't immediately obvious, recall that: <ol> <li>If <code>X (mod N) = 0</code>, then <code>Y * X (mod N) = 0</code> for any Y <li>If <code>X (mod N) = 0</code> and <code>Y (mod N) = 0</code>, then <code>X + Y (mod N) = 0</code> for