Newton interpolation and the $n$-th harmonic number

Introduction

Baker (1984), in his well-known introduction to number theory, asks in exercise (iii) of chapter 1 to prove that $1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n}$ is never an integer, for no $n \in \mathbb{N}_{> 1}$. At first glance, this is not entirely clear, in particular as the harmonic series diverges rendering a lot of possible values to be attained.
As it turns out, the proof to this claim is rather challenging considering the material Baker covered in the first chapter. There is an interesting approach, however, that is based on numerical analysis in the form of polynomial interpolation. It utilizes chapter 4, exercises 9 and 10 of Koenigsberger’s ‘Analysis I’ and builds on Newton interpolation.

Newton interpolation

Suppose we have $n + 1$ distinct input data $x_0, x_1, x_2, \ldots, x_n$, e.g. temperature values, and $n + 1$ observed output values $y_0, \ldots, y_n$. We claim a causal or non-causal relationship between the input and output data and further scrutinize the ordered pairs $(x_0, y_0), (x_1, y_1), \ldots, …, (x_n, y_n)$. Loosely speaking, the general idea in polynomial interpolation now is to consider some polynomial $P(x)$ of some degree and change its coefficients in a way that fits the data at hand, i.e. $P(x_0) \approx y_0, P(x_1) \approx y_1, \ldots, P(x_n) \approx y_n$. One advantage now is that we are able to infer non-observed, but plausible input-output-pairs based on the data observed, for example, $P(\frac{x_0 + x_1}{2})$.

In Newton interpolation, we consider the following as the base polynomial whose coefficients are fitted to the data,

\begin{align} P(x) &= c_0 + c_1(x - x_0) + c_2(x - x_0)(x - x_1) + \ldots + c_{n}(x - x_0)(x - x_1)\cdots (x - x_{n - 1}) \\
&= c_0 + \sum\limits_{k = 0}^{n - 1} c_{k + 1}(x - x_0)\cdots(x - x_k) \end{align}

The Newton polynomial incorporates the observed data in its design. Rather than to approximate $y_0, \ldots, y_n$, it demands to change the coefficients $c_0, \ldots, c_n$ in such a way that $P(x_i) = y_i$ for all $i \in \{0, \ldots, n\}$. In other words, the resulting polynomial fits the observed data perfectly. The following proposition proves that for an arbitrary collection of input-output-pairs $(x_0, y_0), \ldots, (x_n, y_n)$ there are always unique coefficients $c_0, \ldots, c_{n}$ that fulfill that demand.

Proposition. There are unique coefficients $c_0, \ldots, c_{n}$ such that the above polynomial $P(x)$ has the property that $P(x_i) = y_i$ for all $i \in \{0, \ldots, n\}$.

Proof.

One conceptual idea to prove the statement is to rewrite it in terms of matrix algebra,

\begin{equation} \begin{pmatrix} P(x_0) \\ \vdots \\ P(x_{n}) \end{pmatrix} = \begin{pmatrix} c_0 \\ c_0 + c_1(x_1 - x_0) \\ \vdots \\ c_0 + \ldots + c_n(x_n - x_0)\cdots (x_n - x_{n - 1}) \end{pmatrix} = \mathbf{Pc} = \begin{pmatrix} y_0 \\ \vdots \\ y_n \end{pmatrix} \end{equation}

where

\begin{equation} \mathbf{P} = \begin{pmatrix} 1 & 0 & \ldots & 0 \\ 1 & (x_1 - x_0) & \ldots & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & (x_1 - x_0) & \ldots & (x_n - x_0)(x_n - x_1)\cdots (x_n - x_{n - 1}) \end{pmatrix} \end{equation}

and

\begin{equation} \mathbf{c} = \begin{pmatrix} c_0 \\ \vdots \\ c_n\end{pmatrix} \end{equation}

Since the $x_0, \ldots, x_n$ are distinct by assumption, the diagonal of $P$ doesn’t vanish anywhere. Since the determinant of a triangular matrix is the product of its diagonal elements and the matrix is invertible iff the determinant doesn’t vanish, it follows that $\mathbf{P}$ is invertible and hence there is unique $\mathbf{c}$ that solves the system of equations. $\blacksquare$

From Newton polynomials to integer-valued polynomials

Consider the polynomial $P(x)$ from the section before and set $x_i = i$ for $i \in \{0, \ldots, n\}$ as the input data. Then, we have that

\begin{align} P(x) &= c_0 + \sum\limits_{k = 0}^{n - 1} c_{k + 1}(x - x_0)\cdots (x - x_k) \\
&= c_0 + \sum\limits_{k = 0}^{n - 1} c_{k + 1}x(x - 1)\cdots (x - x_k) \\
&= c_0 + \sum\limits_{k = 1}^{n} c_{k}x(x - 1)\cdots (x - k + 1) \\
&= c_0 + \sum\limits_{k = 1}^{n} c_k\frac{x!}{(x - k)!}\frac{k!}{k!} \\\ &= c_0 + \sum\limits_{k = 1}^{n} c_kk!\binom{x}{k} \\
&= c_0 + \sum\limits_{k = 1}^{n} b_k\binom{x}{k} \\\ &= \sum\limits_{k = 0}^{n} b_k\binom{x}{k} \end{align}

where $b_k = c_kk!$. The representation that will be important in what follows is

\begin{equation} \boxed{P(x) = \sum\limits_{k = 0}^{n} b_k\binom{x}{k}} \qquad (1) \end{equation}

It is posed as an exercise in Konigsberger (1999, chapter 4, ex. 10) to show that some polynomial $S$ of degree $n$ is integer-valued if and only if it has the representation in (1) with $b_0, b_1, \ldots, b_n \in \mathbb{Z}$. Clearly, $\binom{x}{k}$ is for $x \in \mathbb{Z}$ and $k \in \mathbb{N}$ always an integer.

Proof. One direction is obvious as the sum of and products of integers are again integers. Now, suppose we have an polynomial $S$ from which we know that is integer-valued and has the given form. Then, also $P(0)$ is an integer. But clearly, $P(0) = b_0$. Therefore, $b_0$ has to be an integer. The other coefficients, $b_k$ for $k = 1, \ldots, n$ are then recursively determined since $b_1 = P(m) - b_0$ and so on, i.e. $b_m = P(m) - \sum\limits_{k = 0}^{m - 1} b_k\binom{z}{m}$. Since $b_0$ is an integer and $P(m)$ is an integer by assumption, we have the converse statement that $b_k \in \mathbb{Z}$ for all $k \in \{1, \ldots, n\}$.

The $n$-th harmonic number

After introducing Newton interpolation, let us go back to the original problem of showing that

\begin{equation} H_n = 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n} \end{equation}

is never an integer. In mathematics, $H_n$ is called the $n$-th harmonic number and credited to Euler who analyzed it extensively.

It is very easy to show that for $x \in \mathbb{R}$ and $n \in \mathbb{N}$ it holds that

\begin{equation} 1 + x + x^2 + \ldots + x^n = \frac{1 - x^{n + 1}}{1 - x} \end{equation}

By simple integration rules we have that \begin{align} 1 &= \int\limits_{0}^{1} 1 \;\text{d}x \\
\frac{1}{2} &= \int\limits_{0}^{1} x \;\text{d}x \\\ \frac{1}{3} &= \int\limits_{0}^{1} x^2 \;\text{d}x \\\ &\vdots
\end{align}

and therefore

\begin{equation} H_n = 1 + \frac{1}{2} + \ldots + \frac{1}{n} = \int\limits_{0}^{1} 1 + x + x^{2} + \ldots + x^{n - 1}\;\text{d}x = \int\limits_{0}^{1} \frac{1 - x^n}{1 - x} \;\text{d}x \end{equation}

Now set the substitution $x = 1 - u$. Then, we have that $\frac{\text{d}x}{\text{d}u} = -1$. Also, the upper limit transforms to $u(1) = 1 - 1 = 0$ and the lower limit to $u(0) = 1 - 0 = 1$. Therefore, we get

\begin{equation} H_n = \int\limits_{0}^{1} \frac{1 - x^n}{1 - x} \;\text{d}x = -\int\limits_{1}^{0} \frac{1 - (1 - u)^n}{u} \;\text{d}u = \int\limits_{0}^{1} \frac{1 - (1 - u)^n}{u} \;\text{d}u \end{equation}

The integrand can be expanded into a binomial since $(1 - u)^{n} = \sum\limits_{k = 0}^{n} \binom{n}{k}(-1)^{k}u^{k}$. Therefore, we arrive at

\begin{equation} \boxed{H_n = \int\limits_{0}^{1} \Big( \sum\limits_{k = 1}^{n} \binom{n}{k}(-1)^{k - 1}u^{k -1}\Big)\;\text{d}u = \sum\limits_{k = 1}^{n} (-1)^{k - 1}\frac{1}{k}\binom{n}{k}} \qquad (2) \end{equation}

Solution to the initial problem

Consider now $H_n$ in (2) as a polynomial of $n$, i.e. $H(n)$ and write $b_k = (-1)^{k - 1}\frac{1}{k}$ as in (1). Then we know from section 2 and 3 that this polynomial can only be integer-valued iff all the $b_k$ for $k \in \{1, \ldots, n\}$ are themselves integers. But clearly, they are not, for no integer $n$ greater 1. Hence, $H_n$ is never an integer, for no $n \in \mathbb{N}_{> 1}$. $\blacksquare$

References

[1] Alan Baker (1984): “A Concise Introduction to Number Theory”. Cambridge University Press.

[2] Konigsberger (1999): “Analysis I”. Springer Verlag.