There’s a good chance that most (if not all) of the math you’ve learned so far has been analytical, meaning that solutions are found by manipulating symbols according to the many rules of algebra, calculus, geometry, etc. However, this is not the only way to do math. In numerical mathematics, every symbol is given an approximate value, and then the values are manipulated to give an approximate solution. Where analytical methods typically involve a small number of difficult steps that arrive at an exact solution, numerical methods typically involve a huge number of very easy steps that arrive at a good approximation. And, believe it or not, sometimes an approximation is more useful than an exact solution.¹
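To get a feel for what "a huge number of very easy steps" looks like, here’s a quick sketch of my own (not something from the post itself) that approximates $\sqrt{75}$ by bisection, repeatedly halving an interval that contains the answer:

```python
# Illustrative example: approximate sqrt(75) numerically by bisection.
# Each step is trivial (halve an interval), but it takes many of them.
lo, hi = 0.0, 75.0           # sqrt(75) lies somewhere in [0, 75]
for _ in range(50):          # 50 very easy steps
    mid = (lo + hi) / 2
    if mid * mid < 75:       # the answer is in the upper half of the interval
        lo = mid
    else:                    # the answer is in the lower half
        hi = mid
print(lo)  # ~8.660254..., accurate to many decimal places
```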
Numerical methods are often used to solve optimization problems, which usually don't have simple analytical solutions. If you've ever tried to be good at anything, you're familiar with the concept of optimization. Sometimes it's simple, like finding the optimal milk-dunking time for a cookie. Sometimes it's extremely complicated, like solving a Rubik's cube in the fewest possible moves. Either way, any optimization problem can be described as finding the set of inputs that minimizes or maximizes some output function, and numerical methods attack it through some variation of guess-and-check.
For example, say you’re trying to find the minimum of a function: $$\mathrm{Find}~k~\mathrm{such~that}~f(k) \leq f(x)~\mathrm{for~all~}x.$$ Unless you have some prior knowledge, your initial guess $k_0$ might as well be random. Then each subsequent guess is determined as follows: $$k_{n+1} = k_n - a\,f'(k_n).$$ This may not initially look like a guess-and-check method, but think about what's going on at each iteration. Each new input (guess) is made by examining something about the previous output (check), in this case by taking a step of size $a$ against the slope, i.e., in the downhill direction.
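To make the guess-and-check loop concrete, here’s a minimal Python sketch of that update rule (the example function, its derivative, the step size $a$, and the starting guess are all just illustrative choices, not anything prescribed by the formula):

```python
def minimize(f_prime, k0, a=0.1, iterations=100):
    """Repeatedly apply k_{n+1} = k_n - a * f'(k_n), starting from the guess k0."""
    k = k0
    for _ in range(iterations):
        k = k - a * f_prime(k)  # step of size a against the slope
    return k

# Example: f(x) = (x - 3)^2 has its minimum at x = 3, and f'(x) = 2(x - 3).
print(minimize(lambda k: 2 * (k - 3), k0=10.0))  # prints something very close to 3.0
```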
There are three possible outcomes of this algorithm. If the function is something like $f(x)=x^2$, you’ll sink straight to the bottom, no matter where you start. With each iteration, your guess will change less and less. This is called convergence. If the function is something like $f(x)=-x^2$, your updates will become larger and larger, as the algorithm races down an ever-steepening slope. This is called divergence. If your function is more complicated (which it almost always will be), you will probably end up converging to a local minimum rather than the global one. The best algorithms have systematic ways to avoid getting stuck in these local minima.
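Here’s a small illustration of the first two outcomes using the same update rule. (One caveat of mine: convergence on $f(x)=x^2$ also assumes the step size $a$ is small enough, which is a choice I’ve made here rather than something the update rule guarantees.)

```python
def trajectory(f_prime, k0, a=0.1, steps=30):
    """Return every guess produced by k_{n+1} = k_n - a * f'(k_n)."""
    ks = [k0]
    for _ in range(steps):
        ks.append(ks[-1] - a * f_prime(ks[-1]))
    return ks

# f(x) = x^2, f'(x) = 2x: each guess shrinks toward the minimum at 0 (convergence).
print(trajectory(lambda k: 2 * k, k0=5.0)[-1])    # tiny number near 0

# f(x) = -x^2, f'(x) = -2x: each guess grows without bound (divergence).
print(trajectory(lambda k: -2 * k, k0=5.0)[-1])   # large and still growing
```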
Numerical methods often require hundreds or even thousands of iterations before converging, and, unless you’re very lucky,² the result will never be the exact solution. So you might wonder, "Why would these methods even exist?" Here’s how I like to think about it. In the toolkit known as "math," analytical methods are the wrenches, screwdrivers, and Allen keys. Each one goes with a specific type of problem. Numerical methods exist for the same reason pliers exist: sometimes you just don’t have the right size wrench. Sure, it might only be an approximation, but that approximation can be very good. Sure, it might be absurdly tedious, but that’s literally what computers are for.³ And most importantly, it can solve problems that don’t have a known analytical solution.
There’s a good thread on Stack Exchange discussing the two methods. The Wikipedia articles on mathematical optimization and numerical analysis are also good sources.
Footnotes
1. To convince yourself of this, try cutting a piece of string that is $\sqrt{75}$ inches long.
2. We’re talking, like, open-the-dryer-and-all-your-clothes-are-folded lucky.
3. Alan Turing’s "Bombe," an early electromechanical codebreaking machine, was designed to cycle through the 151 trillion possible settings on the Nazi "Enigma" encryption device.