# Lagrange polynomial

In numerical analysis, the Lagrange interpolating polynomial is the unique polynomial of lowest degree that interpolates a given set of data.

Figure: for four points ((−9, 5), (−4, 2), (−1, −2), (7, 9)), the (cubic) interpolation polynomial ${\displaystyle L(x)}$ (dashed, black) is the sum of the scaled basis polynomials ${\displaystyle y_{0}\ell _{0}(x)}$, ${\displaystyle y_{1}\ell _{1}(x)}$, ${\displaystyle y_{2}\ell _{2}(x)}$ and ${\displaystyle y_{3}\ell _{3}(x)}$. The interpolation polynomial passes through all four control points, and each scaled basis polynomial passes through its respective control point and is 0 where ${\displaystyle x}$ corresponds to the other three control points.

Given a data set of coordinate pairs ${\displaystyle (x_{j},y_{j})}$ with ${\displaystyle 0\leq j\leq k,}$ the ${\displaystyle x_{j}}$ are called nodes and the ${\displaystyle y_{j}}$ are called values. The Lagrange polynomial ${\displaystyle L(x)}$ has degree ${\textstyle \leq k}$ and assumes each value at the corresponding node, ${\displaystyle L(x_{j})=y_{j}.}$

Although named after Joseph-Louis Lagrange, who published it in 1795, the method was first discovered in 1779 by Edward Waring.[1] It is also an easy consequence of a formula published in 1783 by Leonhard Euler.[2]

Uses of Lagrange polynomials include the Newton–Cotes method of numerical integration and Shamir's secret sharing scheme in cryptography.

For equispaced nodes, Lagrange interpolation is susceptible to Runge's phenomenon of large oscillation.

## Definition

Given a set of ${\textstyle k+1}$  nodes ${\displaystyle \{x_{0},x_{1},\ldots ,x_{k}\}}$ , which must all be distinct, ${\displaystyle x_{j}\neq x_{m}}$  for indices ${\displaystyle j\neq m}$ , the Lagrange basis for polynomials of degree ${\textstyle \leq k}$  for those nodes is the set of polynomials ${\textstyle \{\ell _{0}(x),\ell _{1}(x),\ldots ,\ell _{k}(x)\}}$  each of degree ${\textstyle k}$  which take values ${\textstyle \ell _{j}(x_{m})=0}$  if ${\textstyle m\neq j}$  and ${\textstyle \ell _{j}(x_{j})=1}$ . Using the Kronecker delta this can be written ${\textstyle \ell _{j}(x_{m})=\delta _{jm}.}$  Each basis polynomial can be explicitly described by the product:

{\displaystyle {\begin{aligned}\ell _{j}(x)&={\frac {(x-x_{0})}{(x_{j}-x_{0})}}\cdots {\frac {(x-x_{j-1})}{(x_{j}-x_{j-1})}}{\frac {(x-x_{j+1})}{(x_{j}-x_{j+1})}}\cdots {\frac {(x-x_{k})}{(x_{j}-x_{k})}}\\[10mu]&=\prod _{\begin{smallmatrix}0\leq m\leq k\\m\neq j\end{smallmatrix}}{\frac {x-x_{m}}{x_{j}-x_{m}}}.\end{aligned}}}

Notice that the numerator ${\textstyle \prod _{m\neq j}(x-x_{m})}$  has ${\textstyle k}$  roots at the nodes ${\textstyle \{x_{m}\}_{m\neq j}}$  while the denominator ${\textstyle \prod _{m\neq j}(x_{j}-x_{m})}$  scales the resulting polynomial so that ${\textstyle \ell _{j}(x_{j})=1.}$

The Lagrange interpolating polynomial for those nodes through the corresponding values ${\displaystyle \{y_{0},y_{1},\ldots ,y_{k}\}}$  is the linear combination:

${\displaystyle L(x)=\sum _{j=0}^{k}y_{j}\ell _{j}(x).}$

Each basis polynomial has degree ${\textstyle k}$ , so the sum ${\textstyle L(x)}$  has degree ${\textstyle \leq k}$ , and it interpolates the data because ${\textstyle L(x_{m})=\sum _{j=0}^{k}y_{j}\ell _{j}(x_{m})=\sum _{j=0}^{k}y_{j}\delta _{mj}=y_{m}.}$

The interpolating polynomial is unique. Proof: assume the polynomial ${\textstyle M(x)}$  of degree ${\textstyle \leq k}$  interpolates the data. Then the difference ${\textstyle M(x)-L(x)}$  is zero at ${\textstyle k+1}$  distinct nodes ${\textstyle \{x_{0},x_{1},\ldots ,x_{k}\}.}$  But the only polynomial of degree ${\textstyle \leq k}$  with more than ${\textstyle k}$  roots is the constant zero function, so ${\textstyle M(x)-L(x)=0,}$  or ${\textstyle M(x)=L(x).}$
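The definition above translates directly into code: evaluate each basis polynomial by its product formula and form the weighted sum. A minimal Python sketch (the function name is illustrative):

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial L(x) = sum_j y_j * l_j(x)."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        # Basis polynomial l_j(x) = prod_{m != j} (x - x_m) / (x_j - x_m)
        basis = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                basis *= (x - xm) / (xj - xm)
        total += yj * basis
    return total

# Interpolating (1, 1), (2, 4), (3, 9) recovers f(x) = x^2, so L(2.5) = 6.25
print(lagrange_interpolate([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], 2.5))  # → 6.25
```

Each call costs ${\displaystyle {\mathcal {O}}(k^{2})}$ operations, which motivates the barycentric forms below.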

## Barycentric form

Each Lagrange basis polynomial ${\textstyle \ell _{j}(x)}$  can be rewritten as the product of three parts, a function ${\textstyle \ell (x)=\prod _{m}(x-x_{m})}$  common to every basis polynomial, a node-specific constant ${\textstyle w_{j}=\prod _{m\neq j}(x_{j}-x_{m})^{-1}}$  (called the barycentric weight), and a part representing the displacement from ${\textstyle x_{j}}$  to ${\textstyle x}$ :[3]

${\displaystyle \ell _{j}(x)=\ell (x){\dfrac {w_{j}}{x-x_{j}}}}$

By factoring ${\textstyle \ell (x)}$  out from the sum, we can write the Lagrange polynomial in the so-called first barycentric form:

${\displaystyle L(x)=\ell (x)\sum _{j=0}^{k}{\frac {w_{j}}{x-x_{j}}}y_{j}.}$

If the weights ${\displaystyle w_{j}}$  have been pre-computed, this requires only ${\displaystyle {\mathcal {O}}(k)}$  operations compared to ${\displaystyle {\mathcal {O}}(k^{2})}$  for evaluating each Lagrange basis polynomial ${\displaystyle \ell _{j}(x)}$  individually.

The barycentric interpolation formula can also easily be updated to incorporate a new node ${\displaystyle x_{k+1}}$  by dividing each of the ${\displaystyle w_{j}}$ , ${\displaystyle j=0\dots k}$  by ${\displaystyle (x_{j}-x_{k+1})}$  and constructing the new ${\displaystyle w_{k+1}}$  as above.
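The first barycentric form and the ${\displaystyle {\mathcal {O}}(k)}$ node update can be sketched in Python as follows (all names are illustrative):

```python
def barycentric_weights(xs):
    """Barycentric weights w_j = 1 / prod_{m != j} (x_j - x_m)."""
    ws = []
    for j, xj in enumerate(xs):
        w = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                w /= (xj - xm)
        ws.append(w)
    return ws

def eval_first_form(xs, ws, ys, x):
    """First barycentric form: L(x) = l(x) * sum_j (w_j / (x - x_j)) * y_j."""
    ell = 1.0
    for xj in xs:
        ell *= (x - xj)
    return ell * sum(w / (x - xj) * y for xj, w, y in zip(xs, ws, ys))

def add_node(xs, ws, x_new):
    """Incorporate a new node in O(k): divide each w_j by (x_j - x_new),
    then build the new weight from scratch."""
    ws = [w / (xj - x_new) for xj, w in zip(xs, ws)]
    w_new = 1.0
    for xj in xs:
        w_new /= (x_new - xj)
    return xs + [x_new], ws + [w_new]

xs = [1.0, 2.0, 3.0]
ws = barycentric_weights(xs)
print(eval_first_form(xs, ws, [1.0, 4.0, 9.0], 2.5))  # ≈ 6.25
```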

For any ${\textstyle x,}$  ${\textstyle \sum _{j=0}^{k}\ell _{j}(x)=1}$  because the constant function ${\textstyle g(x)=1}$  is the unique polynomial of degree ${\displaystyle \leq k}$  interpolating the data ${\textstyle \{(x_{0},1),(x_{1},1),\ldots ,(x_{k},1)\}.}$  We can thus further simplify the barycentric formula by dividing ${\displaystyle L(x)=L(x)/g(x)\colon }$

{\displaystyle {\begin{aligned}L(x)&=\ell (x)\sum _{j=0}^{k}{\frac {w_{j}}{x-x_{j}}}y_{j}{\Bigg /}\ell (x)\sum _{j=0}^{k}{\frac {w_{j}}{x-x_{j}}}\\[10mu]&=\sum _{j=0}^{k}{\frac {w_{j}}{x-x_{j}}}y_{j}{\Bigg /}\sum _{j=0}^{k}{\frac {w_{j}}{x-x_{j}}}.\end{aligned}}}

This is called the second form or true form of the barycentric interpolation formula.

This second form has advantages in computation cost and accuracy. It avoids evaluation of ${\displaystyle \ell (x)}$, and the work to compute each term ${\displaystyle w_{j}/(x-x_{j})}$ in the denominator has already been done in computing ${\displaystyle {\bigl (}w_{j}/(x-x_{j}){\bigr )}y_{j}}$, so computing the sum in the denominator costs only ${\textstyle k}$ addition operations. For evaluation points ${\textstyle x}$ close to one of the nodes ${\textstyle x_{j}}$, catastrophic cancellation would ordinarily be a problem for the value ${\textstyle (x-x_{j})}$; however, this quantity appears in both numerator and denominator, and the two errors cancel, leaving good relative accuracy in the final result.

Using this formula to evaluate ${\displaystyle L(x)}$ at one of the nodes ${\displaystyle x_{j}}$ will result in the indeterminate form ${\displaystyle \infty y_{j}/\infty }$; computer implementations must replace such results by ${\displaystyle L(x_{j})=y_{j}}$.
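The second form, including the required special case at the nodes, can be sketched in Python as follows (names are illustrative):

```python
def barycentric_eval(xs, ys, x):
    """Second (true) barycentric form of Lagrange interpolation."""
    # Precompute barycentric weights w_j = 1 / prod_{m != j} (x_j - x_m)
    ws = []
    for j, xj in enumerate(xs):
        w = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                w /= (xj - xm)
        ws.append(w)
    num = den = 0.0
    for xj, w, y in zip(xs, ws, ys):
        if x == xj:
            return y  # the formula is 0/0 at a node; the true value is y_j
        t = w / (x - xj)
        num += t * y
        den += t
    return num / den

print(barycentric_eval([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], 2.0))  # → 4.0 (node hit)
```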

## A perspective from linear algebra

Solving an interpolation problem leads to a problem in linear algebra amounting to inversion of a matrix. Using a standard monomial basis for our interpolation polynomial ${\textstyle L(x)=\sum _{j=0}^{k}x^{j}m_{j}}$ , we must invert the Vandermonde matrix ${\displaystyle (x_{i})^{j}}$  to solve ${\displaystyle L(x_{i})=y_{i}}$  for the coefficients ${\displaystyle m_{j}}$  of ${\displaystyle L(x)}$ . By choosing a better basis, the Lagrange basis, ${\textstyle L(x)=\sum _{j=0}^{k}l_{j}(x)y_{j}}$ , we merely get the identity matrix, ${\displaystyle \delta _{ij}}$ , which is its own inverse: the Lagrange basis automatically inverts the analog of the Vandermonde matrix.
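To make the monomial-basis route concrete, the Vandermonde system ${\displaystyle \sum _{j}(x_{i})^{j}m_{j}=y_{i}}$ can be solved directly. A small sketch using exact rational arithmetic and plain Gauss–Jordan elimination (names illustrative):

```python
from fractions import Fraction

def solve_vandermonde(xs, ys):
    """Solve V m = y for monomial coefficients, where V[i][j] = xs[i]**j."""
    n = len(xs)
    # Augmented matrix [V | y] with exact rational entries
    A = [[Fraction(x)**j for j in range(n)] + [Fraction(y)]
         for x, y in zip(xs, ys)]
    for col in range(n):
        # Pivot: find a row with a nonzero entry in this column
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Nodes 1, 2, 3 with values of f(x) = x^2: coefficients [m_0, m_1, m_2] = [0, 0, 1]
print(solve_vandermonde([1, 2, 3], [1, 4, 9]))
```

In the Lagrange basis the same system is the identity matrix, so no elimination is needed at all.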

This construction is analogous to the Chinese remainder theorem. Instead of checking for remainders of integers modulo prime numbers, we are checking for remainders of polynomials when divided by linear polynomials.

Furthermore, when the degree is large, the fast Fourier transform can be used to solve for the coefficients of the interpolating polynomial.

## Examples

### Example 1

We wish to interpolate ${\displaystyle f(x)=x^{2}}$ over the domain ${\displaystyle 1\leq x\leq 3}$, given these three points:

{\displaystyle {\begin{aligned}x_{0}&=1&&&f(x_{0})&=1\\x_{1}&=2&&&f(x_{1})&=4\\x_{2}&=3&&&f(x_{2})&=9.\end{aligned}}}

The interpolating polynomial is:

{\displaystyle {\begin{aligned}L(x)&={1}\cdot {x-2 \over 1-2}\cdot {x-3 \over 1-3}+{4}\cdot {x-1 \over 2-1}\cdot {x-3 \over 2-3}+{9}\cdot {x-1 \over 3-1}\cdot {x-2 \over 3-2}\\[10pt]&=x^{2}.\end{aligned}}}

### Example 2

We wish to interpolate ${\displaystyle f(x)=x^{3}}$ over the domain ${\displaystyle 1\leq x\leq 4}$, given these four points:

{\displaystyle {\begin{aligned}x_{0}&=1&&&f(x_{0})&=1\\x_{1}&=2&&&f(x_{1})&=8\\x_{2}&=3&&&f(x_{2})&=27\\x_{3}&=4&&&f(x_{3})&=64.\end{aligned}}}

The interpolating polynomial is:

{\displaystyle {\begin{aligned}L(x)&={1}\cdot {x-2 \over 1-2}\cdot {x-3 \over 1-3}\cdot {x-4 \over 1-4}+{8}\cdot {x-1 \over 2-1}\cdot {x-3 \over 2-3}\cdot {x-4 \over 2-4}+{27}\cdot {x-1 \over 3-1}\cdot {x-2 \over 3-2}\cdot {x-4 \over 3-4}+{64}\cdot {x-1 \over 4-1}\cdot {x-2 \over 4-2}\cdot {x-3 \over 4-3}\\[8pt]&=x^{3}\end{aligned}}}

### Notes

Figure: example of interpolation divergence for a set of Lagrange polynomials.

The Lagrange form of the interpolation polynomial shows the linear character of polynomial interpolation and the uniqueness of the interpolation polynomial. Therefore, it is preferred in proofs and theoretical arguments. Uniqueness can also be seen from the invertibility of the Vandermonde matrix, due to the non-vanishing of the Vandermonde determinant.

But, as can be seen from the construction, each time a node ${\displaystyle x_{k}}$ changes, all Lagrange basis polynomials have to be recalculated. A better form of the interpolation polynomial for practical (or computational) purposes is the barycentric form of the Lagrange interpolation (see above) or Newton polynomials.

Lagrange and other interpolation at equally spaced points, as in the example above, yield a polynomial oscillating above and below the true function. This behaviour tends to grow with the number of points, leading to a divergence known as Runge's phenomenon; the problem may be eliminated by choosing interpolation points at Chebyshev nodes.[4]

The Lagrange basis polynomials can be used in numerical integration to derive the Newton–Cotes formulas.

## Remainder in Lagrange interpolation formula

When interpolating a given function f by a polynomial of degree k at the nodes ${\displaystyle x_{0},...,x_{k}}$  we get the remainder ${\displaystyle R(x)=f(x)-L(x)}$  which can be expressed as[5]

${\displaystyle R(x)=f[x_{0},\ldots ,x_{k},x]\ell (x)=\ell (x){\frac {f^{(k+1)}(\xi )}{(k+1)!}},\qquad x_{0}<\xi <x_{k},}$

where ${\displaystyle f[x_{0},\ldots ,x_{k},x]}$  is the notation for divided differences. Alternatively, the remainder can be expressed as a contour integral in complex domain as

${\displaystyle R(x)={\frac {\ell (x)}{2\pi i}}\int _{C}{\frac {f(t)}{(t-x)(t-x_{0})\cdots (t-x_{k})}}dt={\frac {\ell (x)}{2\pi i}}\int _{C}{\frac {f(t)}{(t-x)\ell (t)}}dt.}$

The remainder can be bounded as

${\displaystyle |R(x)|\leq {\frac {(x_{k}-x_{0})^{k+1}}{(k+1)!}}\max _{x_{0}\leq \xi \leq x_{k}}|f^{(k+1)}(\xi )|.}$
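A quick numerical check of this bound, sketched in Python for ${\displaystyle f(x)=e^{x}}$ with three nodes on ${\displaystyle [0,1]}$ (so ${\displaystyle k=2}$ and ${\displaystyle \max |f^{(3)}|=e}$; all names illustrative):

```python
import math

def lagrange_eval(xs, ys, x):
    """Direct evaluation of the Lagrange interpolating polynomial."""
    total = 0.0
    for j, xj in enumerate(xs):
        basis = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                basis *= (x - xm) / (xj - xm)
        total += ys[j] * basis
    return total

xs = [0.0, 0.5, 1.0]                 # k = 2
ys = [math.exp(x) for x in xs]
# Bound: (x_k - x_0)^(k+1) / (k+1)! * max |f^(k+1)|, with max |f'''| = e on [0, 1]
bound = (xs[-1] - xs[0]) ** 3 / math.factorial(3) * math.exp(1.0)
worst = max(abs(math.exp(x) - lagrange_eval(xs, ys, x))
            for x in [i / 100 for i in range(101)])
print(worst <= bound)  # → True
```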

### Derivation[6]

Clearly, ${\displaystyle R(x)}$ is zero at the nodes. To find ${\displaystyle R(x)}$ at a point ${\displaystyle x_{p}}$, define a new function ${\displaystyle F(x)=R(x)-{\tilde {R}}(x)=f(x)-L(x)-{\tilde {R}}(x)}$ and choose ${\textstyle {\tilde {R}}(x)=C\cdot \prod _{i=0}^{k}(x-x_{i})}$, where ${\displaystyle C}$ is the constant we are required to determine for a given ${\displaystyle x_{p}}$. We choose ${\displaystyle C}$ so that ${\displaystyle F(x)}$ has ${\displaystyle k+2}$ zeroes (at all nodes and ${\displaystyle x_{p}}$) between ${\displaystyle x_{0}}$ and ${\displaystyle x_{k}}$ (including endpoints). Assuming that ${\displaystyle f(x)}$ is ${\displaystyle k+1}$-times differentiable, and since ${\displaystyle L(x)}$ and ${\displaystyle {\tilde {R}}(x)}$ are polynomials and therefore infinitely differentiable, ${\displaystyle F(x)}$ will be ${\displaystyle k+1}$-times differentiable. By Rolle's theorem, ${\displaystyle F^{(1)}(x)}$ has ${\displaystyle k+1}$ zeroes, ${\displaystyle F^{(2)}(x)}$ has ${\displaystyle k}$ zeroes, ..., and ${\displaystyle F^{(k+1)}(x)}$ has one zero, say ${\displaystyle \xi }$, with ${\displaystyle x_{0}<\xi <x_{k}}$. Explicitly writing ${\displaystyle F^{(k+1)}(\xi )}$:

${\displaystyle F^{(k+1)}(\xi )=f^{(k+1)}(\xi )-L^{(k+1)}(\xi )-{\tilde {R}}^{(k+1)}(\xi )}$
${\displaystyle L^{(k+1)}=0,{\tilde {R}}^{(k+1)}=C\cdot (k+1)!}$  (Because the highest power of ${\displaystyle x}$  in ${\displaystyle {\tilde {R}}(x)}$  is ${\displaystyle k+1}$ )
${\displaystyle 0=f^{(k+1)}(\xi )-C\cdot (k+1)!}$

The equation can be rearranged as

${\displaystyle C={\frac {f^{(k+1)}(\xi )}{(k+1)!}}}$

Since ${\displaystyle F(x_{p})=0}$, we have ${\displaystyle R(x_{p})={\tilde {R}}(x_{p})={\frac {f^{(k+1)}(\xi )}{(k+1)!}}\prod _{i=0}^{k}(x_{p}-x_{i}).}$

## Derivatives

The ${\displaystyle d}$ th derivatives of the Lagrange polynomial can be written as

${\displaystyle L^{(d)}(x):=\sum _{j=0}^{k}y_{j}\ell _{j}^{(d)}(x)}$ .

For the first derivative, the coefficients are given by

${\displaystyle \ell _{j}^{(1)}(x):=\sum _{\begin{smallmatrix}i=0\\i\not =j\end{smallmatrix}}^{k}\left[{\frac {1}{x_{j}-x_{i}}}\prod _{\begin{smallmatrix}m=0\\m\not =(i,j)\end{smallmatrix}}^{k}{\frac {x-x_{m}}{x_{j}-x_{m}}}\right]}$

and for the second derivative

${\displaystyle \ell _{j}^{(2)}(x):=\sum _{\begin{smallmatrix}i=0\\i\neq j\end{smallmatrix}}^{k}{\frac {1}{x_{j}-x_{i}}}\left[\sum _{\begin{smallmatrix}m=0\\m\neq (i,j)\end{smallmatrix}}^{k}\left({\frac {1}{x_{j}-x_{m}}}\prod _{\begin{smallmatrix}l=0\\l\neq (i,j,m)\end{smallmatrix}}^{k}{\frac {x-x_{l}}{x_{j}-x_{l}}}\right)\right]}$ .

Through recursion, one can compute formulas for higher derivatives.
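The first-derivative formula above can be transcribed directly; a Python sketch (names illustrative):

```python
def lagrange_basis_derivative(xs, j, x):
    """l_j'(x) = sum_{i != j} 1/(x_j - x_i) * prod_{m != i, j} (x - x_m)/(x_j - x_m)."""
    xj = xs[j]
    total = 0.0
    for i, xi in enumerate(xs):
        if i == j:
            continue
        term = 1.0 / (xj - xi)
        for m, xm in enumerate(xs):
            if m != i and m != j:
                term *= (x - xm) / (xj - xm)
        total += term
    return total

def lagrange_derivative(xs, ys, x):
    """L'(x) = sum_j y_j * l_j'(x)."""
    return sum(y * lagrange_basis_derivative(xs, j, x) for j, y in enumerate(ys))

# Interpolating x^2 on nodes 1, 2, 3 gives L(x) = x^2, so L'(2.5) = 5
print(lagrange_derivative([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], 2.5))  # → 5.0
```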

## Finite fields

The Lagrange polynomial can also be computed in finite fields. This has applications in cryptography, such as in Shamir's secret sharing scheme.
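Over a finite field ${\displaystyle \mathrm {GF} (p)}$, the division in each basis polynomial becomes multiplication by a modular inverse. A hypothetical sketch of secret reconstruction in the style of Shamir's scheme (the prime and polynomial below are illustrative choices, not prescribed values):

```python
P = 2**127 - 1  # an illustrative Mersenne prime; all arithmetic is mod P

def interpolate_at_zero(shares, p=P):
    """Recover L(0) from shares (x_j, y_j) via Lagrange interpolation mod p.

    Division is replaced by multiplication with the modular inverse,
    computed here with Python's built-in pow(d, -1, p)."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % p       # factor (0 - x_m) of l_j(0)
                den = den * (xj - xm) % p   # factor (x_j - x_m)
        secret = (secret + yj * num * pow(den, -1, p)) % p
    return secret

# Hypothetical example: secret 42 hidden as the constant term of the degree-2
# polynomial f(x) = 42 + 7x + 11x^2, so any 3 shares reconstruct it
f = lambda x: (42 + 7 * x + 11 * x**2) % P
shares = [(x, f(x)) for x in (1, 2, 3)]
print(interpolate_at_zero(shares))  # → 42
```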