Convexity and tangent lines

November 29, 2020 4-minute read

Math • Calculus • Convexity

Introduction

In 1991, the famous Russian mathematician V.I. Arnold published his essay “A mathematical trivium” which took issue with the culture of mathematical education at universities in the second half of the 19th century, criticizing the mode of examination that leads to “specialists [that] are not in a position to solve the the simplest problems, and do possess the rudiments of their trade” (p. 272). In order to reverse that trend, he proposes 100 model problems that every student of mathematics should be able to solve. These problems often deal in part with estimation, approximation and computation, seldolmy you would need to prove anything formally; the appropriate application of theorems takes precendence.

For example, problem no. 13 asks to evaluate the integral $\int\limits_{0}^{10} x^x\;\textnormal{d}x$ (for which the indefinite integral does not have any closed-form solution) with 5% relative error. One key idea to solve it is to use the convexity of the integrand $f(x) := x^x$ by considering the relationship of the function $f$ with the tangent lines of any of its points. More precisely, for $I = (0, 10)$ and arbitrary $a, x \in I$ , it holds that

$f (x) \geq f (a) + f^{'} (a) (x - a)$

Though why is this true? Why does the graph of a convex function always lie above the tangent lines of all of its points? In the following, I want to build some geometrical intuition as to why this relationship, at least in the one-dimensional case, is more or less obvious. Note that the statement is also true for higher dimensions. Then you just need to replace the slope $f’(a)$ with the gradient $\nabla f$ .

Basic Idea

The underlying idea about the proof is the link between convexity of the function $f$ and the behavior of its derivatives. When you know that $f$ is convex, you may prove that the derivative $f’$ is monotonically increasing. This is a standard textbook result. Now, the claim that the graph of a convex function always lies above the tangent line of any of its points, then translates to the statement that the graph curves away from the tangent line, positioning the graph of $f$ above it.

To make yourself more familiar with the idea, consider the following TikZ picture.

Proof

We have to show the claim for arbitrary $x \in (a, b)$ and $p \in (a, b)$ . Therefore, take some point $p$ as in the TikZ picture. First, we consider another point $x_2 \in (a, b)$ such that $x_2 > p$ . We know that $f$ is convex. So, because it holds that $f’(p) < f’(x_2)$ since $p < x_2$ and $f’$ monotonically increasing due to its convexity, we have that $f^{'} (x_{2}) \geq \frac{f (x_{2}) - f (p)}{x_{2} - p} \geq f^{'} (p)$ But this immediately implies that $\boxed{f(x_2) \geq f(p) + f’(p)(x_2 - p)}$ and so the claim holds for all $x \in (p, b)$ . Now, what if we consider points $x$ for which $x < p$ , i.e. $x \in (a, p)$ ? Consider now $x_1 \in (a, p)$ . Then, we similarly have that $f^{'} (p) \geq \frac{f (p) - f (x_{1})}{p - x_{1}} \geq f^{'} (x_{1})$ again due to the monotone increase of $f’$ which is due to its convexity. But then $f^{'} (p) \geq \frac{f (p) - f (x_{1})}{p - x_{1}} ⟺ f^{'} (p) \geq \frac{(- 1)}{(- 1)} \frac{f (x_{1}) - f (p)}{x_{1} - p} ⟺ f (p) + f^{'} (p) (x_{1} - p) \leq f (x_{1})$ and voilà, we’re done. Note that in the last step, the inequality sign changed because $x_1 - p < 0$ .

Application to Problem 13

Using the theorem we can infer the lower bound of the integral $\int\limits_{1}^{10} x^x \;\text{d}x$ in the form of

$\begin{aligned} f (x) \geq f (10) + f^{'} (10) (x - 10) & ⟺ x^{x} \geq 10^{10} + 10^{1} 0 (\log (10) + 1) (x - 10) \\ ⟹ \int_{1}^{10} x^{x} d x \geq \int_{1}^{10} 10^{10} + 10^{1} 0 (\log (10) + 1) (x - 10) d x \end{aligned}$

Together with a upper bound on the integral and other approximations, we can then proceed to solve the problem.