My Top 100* Theorems in Mathematics

(This is Yan Sheng.)

Exercise: What are the greatest results in math, ever? List 5 of them (or 10, or 20, or...) that you think should be on any reasonable compilation of all-time greatest theorems.

Bonus activity: Give yourself 1 point for each theorem in your list that's also in my list.

At some point I came across Paul and Jack Abad's 1999 list of The Hundred Greatest Theorems. Their list starts out being pretty reasonable, but becomes more and more questionable as it goes on (the law of cosines? the factor and remainder theorems? the divisibility by 3 rule??). Feeling triggered by that list, I ventured to compile my own list of greatest theorems; to keep it comparable to their list, I only included theorems proved before 2000. I wanted to make a "top 100" list, but I could only get just over 80 before I felt like I ran out of ideas (and energy) XD

Since I'm not the Fields Medal Committee (and even if I were), I make no claim to objectivity. Indeed, this list reveals much more about my math education and exposure than the 'intrinsic worth' of the theorems (whatever that means). My criteria for selection are roughly: historical significance of the result; connections to different areas of math; relevance to current research; but most importantly, whether I've heard of it before.

Perhaps unconventionally, dates generally refer to discovery, not publishing. "/" indicates independent discovery, while "," indicates building on top of previous work.


Antiquity

Thales's theorem (~600 BC)
"If $C$ is a point on a semicircle with diameter $AB$, then $\angle ACB$ is a right angle."

  • Earliest mathematical theorem attributed to a person
  • According to legend, Thales sacrificed an ox for the occasion of proving this result

Pythagoras's theorem (~500 BC)
"In a right triangle, the square of the hypotenuse is equal to the sum of the squares of the other sides."

  • Theorem with the largest number of known proofs
  • Known to many cultures in antiquity, eg. Mesopotamia, India, China
  • According to legend, Pythagoras sacrificed an ox for the occasion of proving this result

Irrationality of $\sqrt2$ (~500 BC)
"There does not exist two integers whose ratio has square equal to 2."

  • Throwing a wrench in the Pythagorean doctrine that 'all is [rational] number'
  • According to legend, Pythagoras sacrificed Hippasus for the occasion of proving this result

Classification of Platonic solids (Theaetetus ~400 BC)
"There are exactly 5 regular convex polyhedra."

  • Construction is non-trivial, eg. why does the dodecahedron close up properly?
  • The final book of Euclid's Elements was devoted to proving this

Fundamental theorem of arithmetic (Euclid ~300 BC, al-Farisi ~1300, Gauss 1801)
"Every integer greater than 1 can be uniquely written as a product of primes."

  • Key lemmas were present in Euclid's Elements
  • First clearly stated and proved in Gauss's Disquisitiones Arithmeticae

Area of a circle (Archimedes ~250 BC)
"The area of a circle is equal to the area of a right triangle with the radius as height and the circumference as base."

  • Archimedes's method of exhaustion anticipates infinitesimal calculus
    • "The area of a parabolic segment is 2/3 the area of its bounding parallelogram"
    • "$3+\frac{10}{71}<\pi<3+\frac17$," using regular 96-gons
    • "The volume of a sphere is 2/3 the volume of its circumscribing cylinder;" according to legend, the diagram was engraved on his tombstone.

Quadratic formula (various, eg. Brahmagupta 628, al-Khwarizmi ~825)
"Solving the equation $ax^2+bx+c=0$"

  • Completing the square was known to Babylonians (~1700 BC)
  • Brahmagupta gave the first formula in words, equivalent to: "The solution to $ax^2+bx=c$ is $x=\frac{\sqrt{b^2+4ac}-b}{2a}$"
  • al-Khwarizmi first considered all cases ($ax^2=bx$, $ax^2=c$, $bx=c$, $ax^2+bx=c$, $ax^2+c=bx$, $bx+c=ax^2$)
  • This was before the invention of zero and negative numbers
  • Modern form "$x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}$" first given by Heaton (1896)

16th Century

Solution of the cubic (Tartaglia 1530)
"Solving the equation $x^3+ax^2+bx+c=0$"

  • First derived for a cubic-solving contest (!)
  • Tartaglia divulged his method to Cardano in 1539, but died before he could publish it himself
  • Cardano's student Ferrari used this to solve the quartic equation in 1540
  • Led to the development of the theory of complex numbers (Cardano 1545, Bombelli 1572)

17th Century

Logarithms (Napier 1614)
"There is a function $\log:\mathbb R^+\to\mathbb R$ such that $\log(ab)=\log(a)+\log(b)$."

  • Reduces multiplication to addition in hand calculations (using tables of logarithms and antilogarithms)
  • Napier's first table took 20 years to compute (!)
  • The idea of logarithms as exponents of a fixed base came much later (Euler ~1730)
  • Slide rules operate on the same principle, and were widely used calculational aids into the 1970s

Fermat's right triangle theorem (<1665)
"The area of a right triangle with integer sides cannot be a square."

  • Proof discovered posthumously in his copy of Diophantus's Arithmetica (along with his Last Theorem)
  • Only known complete proof by Fermat (!)
  • Equivalent to "the difference of two fourth powers cannot be a positive square," which implies the $n=4$ case of Fermat's last theorem
  • Illustrates the method of infinite descent, which he extensively developed

Fundamental theorem of calculus (various, eg. Newton 1665-66, Leibniz 1675)
"Differentiation and integration are inverse to each other."

  • cf. Generalised Stokes's theorem

Taylor's theorem (Gregory 1671/Taylor 1715)
"If $f:\mathbb R\to\mathbb R$ is $n$-times differentiable at $x$, then $$f(x+h)=a_0+a_1h+\cdots+a_nh^n+g(h)h^n,$$where $a_k=f^{(k)}(x)/k!$ (here $f^{(k)}$ denotes the $k$th derivative of $f$), and $g(h)\to0$ as $x\to0$."

  • Generalises best linear approximation to $f(x+h)$ (namely $f(x)+f'(x)h$)
  • Previously known special cases:
    • trigonometric functions (Parameshvara ~1400)
    • logarithm (Mercator 1668)
    • binomial series (Newton 1670)
  • Remainder term only obtained later (eg. Lagrange 1797, Cauchy 1825)

Brachistochrone problem (Bernoulli/Bernoulli/Leibniz/Newton/de l'Hôpital 1697)
"The curve of fastest descent between two points is the cycloid."

  • Posed by Galileo in 1638, where he (incorrectly) derived the answer as the arc of a circle; posed again by Johann Bernoulli in 1696
  • Solved by Newton in one night; he published this anonymously, but Bernoulli 'recognised a lion from his claw mark'
  • The ensuing dispute between brothers Johann and Jakob Bernoulli sowed the seeds for the calculus of variations
  • Led to the development of the Euler–Lagrange equations (Euler 1744, Lagrange 1760)

18th Century

Königsberg bridges problem (Euler 1735)
"A graph contains a path passing through each edge exactly once if and only if it has exactly 0 or 2 vertices of odd degree."

  • Pioneering result in graph theory; first use of the concept of a graph
  • Proof of 'if' direction first given by Hierholzer (<1871)

Euler product for the zeta function (Euler 1737)
"The sum of $n^{-s}$ over the positive integers is equal to the product $1/(1-p^{-s})$ over the primes."

  • Corollary: "The sum of $1/p$ over the primes is divergent."
  • Precursor to analytic number theory
  • cf. Dirichlet's theorem on arithmetic progressions, Prime number theorem

Bayes's theorem (Bayes 1763/Laplace 1774)
"$\mathbb P(A|B)=\frac{\mathbb P(B|A)\mathbb P(A)}{\mathbb P(B)}$."

  • Introduced concept of conditional probability
  • Used in Bayesian inference: update probability of hypothesis given state of knowledge
  • Laplace also developed the Bayesian interpretation of probability (What is probability? Baby don't hurt me)

Quadratic reciprocity (Gauss 1796)
"Let $p$ and $q$ be distinct odd primes, and consider the equations$$x^2\equiv p\pmod q,\quad y^2\equiv q\pmod p.$$If at least one of $p,q$ is $\equiv1\pmod4$, then either both equations have solutions or both do not. If $p\equiv q\equiv3\pmod4$, then exactly one of the two equations has a solution."

  • Theorem with second largest number of known proofs, after Pythagoras's theorem (Gauss himself had 8)
  • Attempts to generalise this result motivated developments in algebraic number theory

19th Century

Fundamental theorem of algebra (Argand 1806)
"Every non-constant polynomial with complex coefficients has at least one complex root."

  • Corollary: "Every non-constant polynomial of degree $n$ with complex coefficients has exactly n complex roots (counted with multiplicity)."
  • Gauss's 1799 thesis contains a proof attempt, but assumes a topological result

Fourier series (1807)
"Any $2\pi$-periodic function $f:\mathbb R\to\mathbb R$ can be represented as$$f(x)\sim\frac{a_0}2+\sum_{n=1}^\infty\bigg[a_n\cos(nx)+b_n\sin(nx)\bigg],$$where $a_n$ is twice the average value of $f(x)\cos(nx)$, and $b_n$ is twice the average value of $f(x)\sin(nx)$."

  • ie. Periodic functions can be decomposed as a sum of sines and cosines
  • Fourier used this to solve the boundary value problem for the heat equation
  • Analogue for continuous frequency gives the Fourier transform
  • Equivalently: "f can be expressed as $$f(x)\sim\cdots+f_{-1}e^{-ix}+f_0+f_1e^{ix}+\cdots,$$ where $f_n$ is the average value of $f(x)e^{-inx}$."
  • Developments in analysis guided by search for sufficient conditions for the convergence of Fourier series:
    • Riesz–Fischer (1907): If $f\in L^2$, then the Fourier series converges in $L^2$ norm.
    • Carleson (1966), Hunt (1968): If $f\in L^p$, then the Fourier series converges pointwise almost everywhere.
  • Modern viewpoint: eigenfunction decomposition for the Laplace operator

Insolubility of the quintic (Ruffini 1799, Abel 1824)
"There is no general solution in radicals to polynomial equations of degree at least 5."

  • The end of an era in algebra, which was concerned with solving polynomial equations (hence the name 'Fundamental theorem of algebra')
  • cf. Fundamental theorem of Galois theory

Cauchy integral theorem (Cauchy 1825, Goursat 1900)
"For any holomorphic function defined on a simply-connected region in $\mathbb C$, the contour integral along any closed rectifiable curve is equal to 0."

  • Holomorphic: complex differentiable
  • Corollaries:
    • "Every holomorphic function is infinitely differentiable."
    • "Every holomorphic function is complex analytic, ie. is equal to its power series about an interior point of its domain."
    • Cauchy integral formula (in particular, "The values of a holomorphic function inside a disk is determined by its values on the boundary.")
    • Cauchy residue theorem
  • Fundamental result in complex analysis

Theorema Egregium (Gauss 1827)
"The Gaussian curvature of a surface is invariant under local isometries."

  • Latin for 'Remarkable Theorem'
  • Gaussian curvature: product of maximum and minimum curvature at a point on a surface
  • ie. Gaussian curvature is an intrinsic property of a surface, independent of its embedding into the ambient space
  • Corollary: "There is no map projection preserving all distances."
  • Fundamental result in differential geometry

Fundamental theorem of Galois theory (1830)
"The intermediate fields of a finite Galois field extension correspond to subgroups of its Galois group."

  • Abel–Ruffini left open the question of characterising equations which admit solutions by radicals
  • That was solved in spectacular fashion by Galois, pioneering methods in group theory
  • Galois then promptly died from a duel, at age 20, for a girl who didn't even like him (What is love? Baby don't hurt me)
  • Modern reformulation, in terms of field extensions instead of roots of polynomials, due to Weber (1895) and Artin (1942)
  • cf. Insolubility of the quintic

Independence of the parallel postulate (Gauss ~1817/Bolyai 1823/Lobachevsky 1826, Beltrami 1868)
"The parallel postulate cannot be proven from the other axioms of Euclidean geometry."

  • Parallel postulate: given a line and a point not on it, there exists exactly one line through the given point parallel to the given line; equivalent to Euclid's fifth axiom
  • Thought to be less self-evident than Euclid's other axioms, so many attempts were made to prove it
  • Saccheri (1733) proved many theorems assuming that the parallel postulate is false, but thought they were 'repugnant to the nature of straight lines'
  • The idea of a consistent non-Euclidean geometry (ie. without the parallel axiom) was first conceived by Gauss, but never published
  • First published independently by Lobachevsky (1829) and Bolyai (1832)
  • Beltrami constructed an Euclidean model of hyperbolic geometry, showing that Euclidean and non-Euclidean geometry have equal logical validity
  • Effect of this paradigm shift was felt in philosophy, physics, and theology, in re-evaluating the status of Euclidean geometry as absolute truth

Dirichlet's theorem on arithmetic progressions (1837)
"If $a$ and $d$ are coprime integers, then there are infinitely many primes congruent to $a$ (mod $d$)."

  • Proof by investigating properties of Dirichlet L-functions, a generalisation of the zeta function
  • Pioneering result in analytic number theory: using methods of analysis to solve number theory problems
  • cf. Euler product for the zeta function

Riemann–Roch theorem (Riemann 1857, Roch 1865)
"Calculating the dimension of the space of meromorphic functions on a Riemann surface with prescribed zeros and poles"

  • Spawned many generalisations, eg. to algebraic curves (Schmidt 1931), algebraic varieties (Hirzebruch 1954), coherent cohomology (Grothendieck 1957), differential topology (Atiyah–Singer 1963)
  • cf. Atiyah–Singer index theorem

Unique factorisation of ideals (Kummer 1847, Dedekind 1871)
"Every nonzero proper ideal in the ring of integers of a number field admits a unique factorisation into a product of nonzero prime ideals."

  • Motivated by Fermat's last theorem, Kummer introduced 'ideal numbers' to account for failure of unique factorisation in rings of integers, and proved the result for cyclotomic fields
  • cf. Fermat's last theorem

Weierstrass function (Weierstrass 1872)
"The Weierstrass function is continuous but nowhere differentiable."

  • The concept of a function had shifted in the past century, from 'analytic expression' to 'continuous function' to the modern definition
  • This result contradicted the widely held belief that continuous functions were differentiable at almost all points
  • Weierstrass is widely regarded as the father of modern analysis, placing it on rigorous foundations

Lindemann–Weierstrass theorem (Lindemann 1882, Weierstrass 1885)
"If $a_1,\ldots,a_n$ are algebraic numbers which are linearly independent over $\mathbb Q$, then $e^{a_1},\ldots,e^{a_n}$ are algebraically independent over $\mathbb Q$."

  • Corollary: "If $a$ is algebraic, then $e^a$ is transcendental."
  • Corollary: "$\pi$ is transcendental."
  • Many hard problems in transcendental number theory, eg. Is $\pi+e$ irrational?

Jordan curve theorem (1887)
"Any simple closed curve in the plane divides it into exactly two connected regions."

  • Dramatic example of an intuitively obvious statement which is hard to prove
  • Main difficulty lies in pathological examples, eg. nowhere-differentiable curves (Weierstrass function, 1872; Koch curve, 1904), curves with positive area (Osgood curves, 1903)
  • Demonstrates the rise of rigour in late 19th century development of analysis

Cantor's theorem (1891)
"There is no surjection from any set to its power set."

  • ie. there are 'different sizes' of infinity
  • Corollary: "The set of real numbers is uncountable" (Cantor 1874)
  • Short proof using the now-famous diagonal argument
  • Leads to Cantor's paradox (~1896): the collection of all sets is not a set; cf. Russell's paradox

Arzelà–Ascoli theorem (Ascoli 1883-84, Arzelà 1895)
"A family of continuous real-valued functions on $[a,b]$ is uniformly bounded and equicontinuous if and only if every sequence has a uniformly convergent subsequence."

  • Introduced notion of equicontinuity
  • A compactness result for function spaces, anticipating the development of functional analysis

Hilbert basis theorem (1888)
"If $R$ is a Noetherian ring, then the polynomial ring $R[X]$ is also Noetherian."

  • Invariant theory: studies invariants of algebraic expressions under linear change of variables
    • eg. "The discriminant $b^2-4ac$ of the binary quadratic form $ax^2+bxy+cy^2$ is invariant under a linear change of coordinates for $(x,y)$ with determinant 1. In fact, every such invariant is a polynomial in the discriminant."
  • Hilbert used this result to show that every space of invariant polynomials is finitely generated
  • On the non-constructive nature of Hilbert's proof, Gordan famously remarked: 'This is not mathematics, it is Theology!'
  • (Who would win? 20 years of calculations or 3 pages of abstraction)

Minkowski's convex body theorem (1889)
"Every convex body in $\mathbb R^n$ which is symmetric with respect to the origin and has volume greater than $2^n$ contains a nonzero point with integer coordinates."

  • Foundational result for Minkowski's geometry of numbers, studying rings of algebraic integers

Generalised Stokes's theorem (Volterra 1889)
"The integral of a differential form over the boundary of an orientable manifold is equal to the integral of its exterior derivative over the manifold."

  • Special cases: fundamental theorem of calculus, divergence theorem, Green's theorem, Stokes's theorem
  • Differential form: something that can be integrated over a manifold (eg. line, surface)
  • Introduced by Cartan (1899), differential forms are fundamental in the study of smooth manifolds
  • cf. Fundamental theorem of calculus

Hilbert's Nullstellensatz (1893)
"There is a one-to-one correspondence between algebraic varieties in $\mathbb C^n$ and radical ideals in $\mathbb C[X_1,\ldots,X_n]$."

  • German for 'zero-locus theorem'
  • (Affine) algebraic variety: common zero locus of a set of polynomials in $n$ variables
  • Corollary: "If $f,f_1,\ldots,f_k$ are $n$-variable polynomials over $\mathbb C$, such that $f$ vanishes identically on the common zero locus of $f_1,\ldots,f_k$, then $$f^r=g_1f_1+\cdots+g_kf_k$$ for some integer $r$ and polynomials $g_1,\ldots,g_k$."
  • Corollary: "Affine algebraic varieties are determined up to isomorphism by their coordinate rings (polynomials restricted to the variety)."
  • Fundamental result in algebraic geometry, linking algebraic varieties (geometry) with coordinate rings (algebra)

Classification of semisimple Lie algebras (Killing 1888-90, Cartan 1894)
"Every simple Lie algebra over $\mathbb C$ either lies in one of the infinite families $A_n$, $B_n$, $C_n$, $D_n$, or is one of $E_6$, $E_7$, $E_8$, $F_4$, or $G_2$."

  • Lie group: smooth manifold with compatible group structure
  • Lie algebra: corresponds to tangent space at identity; 'infinitesimal transformations'
  • Fundamental result in Lie theory; leads to classification of semisimple Lie groups
  • Can be stated in simpler terms with Dynkin diagrams (1947)

Picard–Lindelöf theorem (1894)
"The initial-value problem$$y'(t)=f(t,y(t)),\quad y(t_0)=y_0$$has a unique solution around $t_0$ if $f$ is uniformly Lipschitz in $y$ and continuous in $t$."

  • aka. the existence and uniqueness theorem for first-order ODEs

Prime number theorem (Hadamard 1896/de la Vallee Poussin 1896)
"The number of prime numbers up to $N$ is asymptotic to $\frac N{\log N}$."

  • Carrying out Riemann's programme from his seminal 1859 paper: properties of the analytic extension of the zeta function yield prime asymptotics
  • The crowning achievement of 19th century analytic number theory
  • Later proofs (Erdős 1948/Selberg 1948) circumvent the use of complex analytic methods
  • cf. Euler product for the zeta function

Early 20th Century

Central limit theorem (various, eg. Cauchy 1853, Chebyshev 1887, Markov 1898, Lyapunov 1901)
"The sum of $n$ independent random variables with mean 0 and variance 1, divided by $\sqrt n$, is distributed as the standard normal in the limit (under certain conditions)."

  • Independent of the distribution of the given random variables; explains the universality of the bell curve in statistics
  • Successive weakening of hypotheses, eg. finiteness of support, finiteness of moments
  • Different versions track the development of classical and modern probability theory

Russell's paradox (1901)
"The collection of sets which do not contain themselves cannot be a set."

  • Exposed inconsistencies in naive set theory, where any definable collection is a set
  • Part of the foundational crisis in mathematics (What is a set? Baby don't hurt me)
  • Led to the development of axiomatic set theory (eg. Zermelo 1908, Fraenkel 1922/Zermelo 1922, von Neumann 1925)
  • cf. Cantor's theorem

Classification of compact surfaces (various, eg. Dehn–Heegaard 1907)
"Compact surfaces are determined (up to homeomorphism) by orientability and Euler-Poincare characteristic."

  • Topology: the study of properties of geometrical objects preserved under continuous deformation (homeomorphism)
  • Several proof attempts were made previously (eg. Mobius 1863, Jordan 1866, von Dyck 1888); rigour was difficult to achieve, since the modern concepts of 'surface' and 'homeomorphism' weren't definitively reached until 1930
  • Classification of families of spaces is a central goal of topology; the classification of higher dimensional manifolds is considerably harder

Uniformisation theorem (Poincare 1907/Koebe 1907)
"Every simply connected Riemann surface is biholomorphic to the plane, the open disc, or the sphere."

  • Riemann surface: complex manifold of (complex) dimension 1
  • Motivated by the problems of parametrising algebraic curves over $\mathbb C$ (Klein), and discrete subgroups of $\operatorname{PSL}(2,\mathbb R)$ (Poincare)
  • Corollary: "Every Riemann surface admits a conformal metric of constant curvature."
  • Can be seen as geometrisation in 2 dimensions, cf. Thurston geometrisation

Lebesgue dominated convergence theorem (1908)
"Let $(f_n)$ be a sequence of real-valued measurable functions which converge pointwise to $f$. If there exists an integrable function $g$ such that $|f_n|\leq g$ for all $n$, then the Lebesgue integral of $|f_n-f|$ tends to 0."

  • In his 1902 thesis, Lebesgue introduces measure theory and a new notion of integration
  • Lebesgue integration is more general and has better properties than Riemann integration (such as this result)

Noether's theorem (1915)
"Every continuous family of symmetries of a physical system gives rise to a conservation law."

  • Examples:
    • translational invariance → conservation of momentum
    • rotational invariance → conservation of angular momentum
    • time invariance → conservation of energy
  • Illustrates the fundamental importance of symmetry in physics
  • Gauge theories are based on 'gauge symmetries' built into the theory, and are among the most successful physical theories (eg. the Standard Model)

Artin reciprocity law (1927)
"Let $L/K$ be an extension of number fields with discriminant $\Delta$. There is a group homomorphism from the ray class group $Cl_\Delta$ to $\operatorname{Gal}(L/K)$, which maps any prime ideal $\mathfrak p$ in $K$ coprime to $\Delta$ to $\operatorname{Frob}_{\mathfrak p}$."

  • Corollary ($K=\mathbb Q$): "There is a group homomorphism $(\mathbb Z/\Delta\mathbb Z)^*\to\operatorname{Gal}(L/\mathbb Q)$, which maps any rational prime $p$ not dividing $\Delta$ to $\operatorname{Frob}_p$."
  • Central theorem in class field theory, the study of abelian extensions of number fields (among others)
  • Generalises Quadratic reciprocity, cubic reciprocity (Jacobi 1837/Eisenstein 1844), biquadratic reciprocity (Eisenstein 1844), …

Krull's principal ideal theorem (1928)
"If $R$ is a Noetherian ring and $I$ is a proper principal ideal of $R$, then every minimal prime ideal over $I$ has height at most 1."

  • Basis for Krull's dimension theory, which gives an algebraic definition for the dimension of an algebraic variety
  • Geometric analogue: the dimension of an algebraic variety can decrease by at most 1 when intersected with a hypersurface

Ramsey's theorem (1928)
"For any $k$, every colouring of edges of a sufficiently large complete graph in two colours contains a monochromatic clique of size $k$."

  • eg. "In any group of 6 people, either 3 of them are mutual friends or 3 of them are mutual strangers."
  • First result in Ramsey theory, the study of existence of given substructures

Spectral theorem for self-adjoint operators (von Neumann 1929/Stone 1932)
"Every self-adjoint operator $T$ defined densely on a Hilbert space $H$ can be decomposed as an integral$$T=\int_{\mathbb R}\lambda\,dE_\lambda,$$where $E_\lambda$ is a resolution of the identity."

  • Spectral theory: generalising eigenspace decomposition for finite-dimensional matrices
  • Resolution of the identity: increasing, right-continuous family of projections, with limits 0 at $-\infty$ and $\operatorname{Id}$ at $+\infty$
  • Hilbert (1906): "If $T$ is a compact operator, then $H$ has an orthonormal basis of eigenvectors of $T$."
  • Von Neumann provided rigorous foundations for and unified the two formulations of quantum mechanics, Schrodinger's wave mechanics and Heisenberg's matrix mechanics

Birkhoff's ergodic theorem (1931)
"Let $X$ be a finite measure space, and $T:X\to X$ be an ergodic map. Then for any integrable $f:X\to\mathbb R$, the average value of $f$ on almost every orbit is equal to average value of $f$ on $X$."

  • ie. time average = space average
  • Ergodic map: measure-preserving, and any invariant subset is of zero measure or full measure; ie. T is 'mixing' on X
  • Central result in ergodic theory: study of statistical properties of dynamical systems; has roots in statistical mechanics (Boltzmann)
  • Generalises to many different settings

Gödel's incompleteness theorems (Gödel 1931)
"If a consistent axiomatic system is strong enough to express elementary arithmetic, then there are statements it can neither prove nor disprove; in particular, it cannot prove its own consistency."

  • Arithmetisation of logic: statements and proofs can be represented as numbers, and thus can be discussed within the system itself
  • Dooming Hilbert's program of formalising all of mathematics in a complete, consistent and decidable way
  • cf. Halting problem

Halting problem (Turing 1936)
"There is no general algorithm to decide whether a Turing machine halts on a given input in finite time."

  • One of the first problems proven to be undecidable; shows the limits of computability
  • Especially significant in the context of the Church–Turing thesis: "everything that is intuitively computable is computable with a Turing machine" (What is computability? Baby don't hurt me)
  • cf. Gödel's incompleteness theorems

Whitney embedding theorem (1936, 1944)
"Any smooth $n$-dimensional manifold can be smoothly embedded into $\mathbb R^{2n}$."

  • Smooth embedding: smooth injective map with injective differential
  • ie. intrinsic and extrinsic definitions of smooth manifolds are equivalent
  • Introduced Whitney's trick, useful for geometric topology in more than 4 dimensions

Sobolev embedding theorem (1938)
"Conditions for embedding one Sobolev space into another"

  • Sobolev space: (completion of) the space of functions with bounds on derivatives up to some order
  • Sobolev inequalities: trade smoothness for integrability
  • Important in the theory of weak solutions of PDEs

Itô's lemma (1944)
"If $X_t$ is a Wiener process and $F(x,t)$ is a twice-differentiable function, then $dF(X_t,t)=F_t\,dt+F_x\,dX+\frac12F_{xx}\,dt$."

  • ie. Chain rule for stochastic calculus
  • Stochastic calculus: extension of calculus to random processes
  • Wiener process: Brownian motion, the continuous-time analogue of a random walk
  • First two terms are same as ordinary chain rule; quadratic correction term arises from $\operatorname{Var}(X_t)=t$
  • Applications in mathematical finance, eg. Black–Scholes equation for option pricing (1970)

Shannon's noisy channel coding theorem (1944)
"Any noisy communications channel has a maximum rate of information transmission, below which there exist error-correcting codes such that the probability of error at the receiver can be made arbitrarily small, and above which no such codes exist."

  • Introduced the notion of entropy from statistical mechanics to quantify information
  • Shannon's 1948 paper single-handedly created the field of information theory
  • Direct applications to coding theory, including data compression and error detection/correction

Erdős's lower bound for Ramsey numbers (1947)
"There exists a 2-coloring of the edges of any complete graph on $Ck2^{k/2}$ vertices with no monochromatic $k$-clique."

  • Proof sketch: almost every random colouring works
  • Early example of Erdős's powerful probabilistic method
  • The current best known lower bound is only a factor of 2 bigger than this! (Compare to the current best known upper bound, which is like $3.99^k$)

Existence of Nash equilibria (von Neumann–Morgenstern 1944, Nash 1950)
"Any $n$-player game with a finite set of actions has a mixed-strategy Nash equilibrium."

  • Foundational result in game theory
  • von Neumann–Morgenstern's 1944 text created the field of game theory, and proved this for 2-player zero-sum games
  • Nash proves this generalisation in his one-page long (!!) 1950 paper
  • Applications in diverse fields, such as evolutionary biology and economics

Late 20th Century

Cartan's theorems A and B (1951)
"A: A coherent sheaf $F$ over a Stein manifold $X$ is locally finitely generated by its local sections.
B: The sheaf cohomology $H^p(X,F)$ is zero in dimensions $p>0$."

  • Foundational in the theory of functions in several complex variables, and in sheaf cohomology
  • Immediately solved the two Cousin problems (1895) for several complex variables
  • Partial progress by the German complex analysts using technical arguments were rendered obsolete: 'the French have tanks and we have bows and arrows!'
  • cf. GAGA theorem

Lax–Milgram theorem (1954)
"Let $V$ be a Hilbert space and $A$ a bilinear form on $V$ such that$$A(v,w)\leq C|v||w|\quad\text{and}\quad A(v,v)\geq c|v|^2\text{ for all }v,w\in V.$$Then for every linear functional $L$ on $V$, there exists a unique $u$ such that $A(u,v)=L(v)$ for all $v$."

  • Weak solution to a PDE: a distribution which satisfies an integral formulation of the PDE, against all test functions
  • Easy to show existence of weak solutions in nice function spaces, like $L^p$ or Sobolev spaces
  • This result is often used to show existence of strong solutions '$u$' with desired regularity from weak solutions '$L$'

Yoneda lemma (1954)
"For any functor $F$ from a locally small category $\mathcal C$ to $\operatorname{Set}$, and any object $A$ in $C$, the natural transformations $\operatorname{Nat}(\operatorname{Hom}(A,-),F)$ are in bijection with the elements of $F(A)$, and this bijection is natural in $A$ and $F$."

  • 'The hardest triviality in mathematics'
  • Corollary: "The maps out of an object uniquely determine the object (up to unique isomorphism)," ie. 'Tell me who your friends are, and I will tell you who you are.'
  • Category theory: abstracts mathematical structures (eg. sets, groups, topological spaces) to categories, containing objects and morphisms
  • Categorical concepts and language are now indispensable in mathematics

Roth's theorem (1955)
"For any irrational algebraic number $x$ and any $c>0$, there are only finitely many integer solutions to $\left|x-\frac pq\right|<\frac1{q^{2+c}}$."

  • ie. algebraic numbers cannot be too well-approximated by rationals

Long exact sequence of a derived functor (Cartan-Eilenberg 1956/Grothendieck 1957)
"Given a left-exact functor $F$, any short exact sequence$$0\to A\to B\to C\to0$$gives rise to a long exact sequence$$0\to F(A)\to F(B)\to F(C)\to R^1F(A)\to R^1F(B)\to R^1F(C)\to R^2F(A)\to\cdots$$"

  • $R^iF$: $i$th derived functor of $F$
  • Unified various examples where a short exact sequence gives rise to a long exact sequence, eg. in homology
  • The Cartan–Eilenberg text Homological Algebra unified previous homology theories and revolutionised the field
  • Grothendieck's 'Tohoku paper' introduced the concept of abelian categories, uniting Cartan-Eilenberg functors on modules and sheaf cohomology

GAGA theorem (Serre 1956)
"The category of coherent algebraic sheaves on a complex projective variety is equivalent to the category of coherent analytic sheaves on the corresponding analytic space."

  • Corollary: (Chow 1949) "Analytic subvarieties of complex projective $n$-space are algebraic varieties."

Existence of exotic spheres (Milnor 1956)
"There is a smooth manifold which is homeomorphic to, but not diffeomorphic to, the standard 7-sphere."

  • Spawned the field of differential topology: the study of smooth manifolds and their invariants under diffeomorphisms
  • Open problem (Smooth Poincare conjecture): Is there an exotic 4-sphere?

Feit–Thompson theorem (1963)
"Every finite group of odd order is solvable."

  • Proof was 255 pages long, which was unprecedented in group theory
  • Provided a plausible opening to attack the full classification of finite simple groups
  • cf. Classification of finite simple groups

Independence of the continuum hypothesis (Gödel 1940, Cohen 1963)
"The continuum hypothesis can neither be proven nor disproven from the axioms of Zermelo–Fraenkel set theory (assuming these are consistent)."

  • Gödel showed that CH holds in the constructible universe, a model of the ZF axioms, so it is impossible to disprove CH from ZF
  • Cohen introduced the method of forcing, used to construct models of axioms with specific properties
  • Forcing remains a fundamental technique in set theory and logic

Atiyah–Singer index theorem (1963)
"For an elliptic differential operator on a compact manifold, the difference between the dimensions of the kernel and cokernel is a topological invariant."

  • ie. "The analytical index is equal to the topological index"
  • Subsumes many major theorems, eg. Chern–Gauss–Bonnet (1944), Hirzebruch–Riemann–Roch (1954), Hirzebruch signature theorem (1954), …
  • Cross-fertilisation with physics, eg. string theory (Witten), instantons and 4-manifolds (Donaldson)

KAM theorem (Kolmogorov 1954, Moser 1962, Arnold 1963)
"Small perturbations of an integrable Hamiltonian system preserve most invariant tori."

  • An almost-periodic orbit can be destroyed by small perturbations, if they are in resonance
  • If the frequencies are 'sufficiently irrational,' the orbit remains quasi-periodic
  • Applications to stability of $n$-body problem (eg. the solar system) and the onset of chaos in dynamical systems

Hironaka's resolution of singularities (1964)
"Every algebraic variety over a field of characteristic 0 admits a proper birational map from a non-singular variety."

  • ie. The solutions of a system of polynomial equations can be parametrised by a smooth manifold
  • Introduced new techniques in algebraic geometry and commutative algebra

Mostow rigidity theorem (Mostow 1968, Prasad 1971)
"Complete finite-volume hyperbolic manifolds of dimension at least 3 are completely determined up to isometry by their fundamental group."

  • For high-dimensional hyperbolic manifolds, topology determines geometry
  • Contrast with 2-dimensional case: the hyperbolic surfaces with genus $g>1$ form a $(6g-6)$-parameter family
  • Applications to hyperbolic geometry and low-dimensional topology

Cook–Levin theorem (1971)
"The Boolean satisfiability problem (SAT) is NP-complete."

  • P: decision problems that can be solved in polynomial time by a Turing machine
  • NP: decision problems with proofs verifiable in polynomial time by a Turing machine
  • NP-complete: in NP, and every NP problem can be reduced to it in polynomial time by a Turing machine
  • Corollary: "If SAT can be solved in polynomial time, so can every problem in NP, ie. P=NP."
  • Karp (1972) lists 21 NP-complete problems
  • P $\overset?=$ NP remains one of the most important open problems in theoretical computer science

Weil conjectures (Dwork 1960, Grothendieck 1965, Deligne 1974)
"The local zeta function of a variety over a finite field is rational, has a functional equation, satisfies the Riemann hypothesis, and is related to the complex Betti numbers."

  • Surprising connection to algebraic topology: Weil noted that the conjectures follow from constructing a 'cohomology theory,' with appropriate analogues for the Lefschetz fixed point theorem and Poincare duality
  • Grothendieck developed the theory of etale cohomology (Grothendieck–Artin 1960) to prove 3 of the 4 conjectures, and envisioned an approach with his 'standard conjectures'
  • Deligne used an analytic estimate to prove the Riemann hypothesis analogue, much to the disappointment of Grothendieck

Szemeredi's theorem (1975)
"Any subset of the positive integers with positive upper density contains arbitrarily long arithmetic progressions."

  • First proven with intricate combinatorial arguments
  • Later proofs link to diverse areas such as ergodic theory (Furstenberg 1977) and Fourier analysis (Gowers 2001)

Four colour theorem (Appel–Haken 1976)
"The vertices of any planar graph can be coloured with at most 4 colours such that no two adjacent vertices are the same colour."

  • First major theorem proven with computer assistance (What is proof? Baby don't hurt me)

Classification of finite simple groups (various, 1983)
"Every finite simple group is either a cyclic group, an alternating group, a group of Lie type, or one of 26 sporadic groups."

  • aka. 'The Enormous Theorem,' or 'The Thirty Years' War' (Gorenstein)
  • Finite simple groups are the 'building blocks' of all finite groups (Jordan–Hölder theorem)
  • Gorenstein announced (prematurely) the completion of the proof in 1983; the quasithin case was only resolved in 2004
  • Longest proof of a single theorem: 10,000+ pages, 500+ papers, 100+ authors
  • cf. Feit–Thompson theorem

Jones polynomial (1984)
"The Jones polynomial of a knot is a knot invariant."

  • Derived from operator algebras
  • First knot invariant which detects that the trefoil knot is chiral, ie. it is not isotopic to its mirror image
  • Can be computed recursively from simple skein relations; similar relations were found by Conway (1969) for the Alexander polynomial
  • Led to the discovery of the HOMFLY-PT polynomial (1985)

Gromov non-squeezing theorem (1985)
"There is no symplectomorphism in $\mathbb R^{2n}$ mapping a ball to a cylinder of smaller radius."

  • ie. 'it is impossible to squeeze a symplectic camel into the eye of a needle'
  • An example of a symplectomorphism: time evolution in phase space
  • Corollary: "if the initial states of a physical system in phase space are spread out over the unit ball in $\mathbb R^{2n}$, then these cannot be squeezed into a state where the coordinates of the first particle $(q_1,p_1)$ spreads out less than initially."
  • First restriction on symplectomorphisms (other than being volume-preserving)
  • Introduced concept of pseudoholomorphic curves, which led to symplectic invariants such as Floer homology (1988)

Fermat's last theorem (Wiles 1993, Taylor–Wiles 1994)
"For $n\geq3$, no two positive $n$th powers sum to an $n$th power."

  • First stated by Fermat in an infamous note in Diophantus's Arithmetica: 'I have discovered a truly remarkable proof which this margin is too small to contain'
  • A major impetus for algebraic number theory in the 19th and 20th century, cf. Unique factorisation of ideals
  • Proved an important case of the modularity theorem (Breuil–Conrad–Diamond–Taylor 2001), relating elliptic curves to modular forms; part of the Langlands programme

Personal picks

Hartman–Grobman theorem (Grobman 1959/Hartmann 1959)
"Let $f:\mathbb R^n\to\mathbb R^n$ be a smooth function, and let $x_0$ be a point such that $f(x_0)=0$, and $A=Df(x_0)$ has no eigenvalue with real part 0. Then the differential equation $\frac{dx}{dt}=f(x)$ is topologically conjugate to its linearisation $\frac{dX}{dt}=AX$ near $x_0$."

  • ie. ODEs can be linearised around hyperbolic fixed points

Takens's theorem (1981)
"If $x$ is a generic observable function of a dynamical system, then for $k$ large enough, $(x(t),x(t+1),\ldots,x(t+k-1))$ is an embedding of the attractor into $\mathbb R^k$."

  • ie. "The attractor of a generic dynamical system can be reconstructed from the time-delay plot of a single observable."
  • Applications to time-series analysis, eg. computing the fractal dimension of attractors

Connectedness of the Mandelbrot set (Douady–Hubbard 1985)
"The Mandelbrot set is connected."

  • Mandelbrot set: set of complex numbers c such that the orbit of 0 under $z\mapsto z^2+c$ stays bounded
  • Mandelbrot first conjectured the set was disconnected, using early computer printouts which missed intricate filaments
  • Advent of personal computing made generating fractals accessible to the public
  • The local connectedness of the Mandelbrot set is a central open problem in complex dynamics

Honourable mentions

Here is a list of theorems for which I can't decide if they belong to a greatest theorems list (or I just haven't finished researching them before writing this post 😅); give yourself 0.5 points for each of these, I guess:

Chinese remainder theorem (Sunzi ~450/Aryabhata ~510, Qin 1247)

Principle of inclusion-exclusion (Moivre 1718)

Basel problem (Euler 1734)
"The sum of reciprocals of squares $1+1/4+1/9+\cdots$ is equal to $\pi^2/6$."

  • Euler also evaluated the sums of reciprocals of even powers up to 26
  • Justification of Euler's original argument had to wait a century

Fermat's two-squares theorem (Euler 1749)
"An odd prime is the sum of two squares if and only if it is congruent to 1 mod 4."

  • Stated by Girard in 1625 and Fermat in 1640, without proof

Elliptic integrals (Fagnano/Euler ~1750)

Lagrange's four-squares theorem (1770)
"Every positive integer is the sum of four squares."

Least squares regression (Gauss 1795/Legendre 1805)

Cauchy–Schwarz inequality (Cauchy 1821)

  • Integral form due to Bunyakovsky (1859) and Schwarz (1885)

Intermediate value theorem (Bolzano 1817/Cauchy 1821)
"A continuous real-valued function $f$ on $[a,b]$ attains all values between $f(a)$ and $f(b)$."

Pigeonhole principle (Dirichlet 1834)
"If there are more pigeons than holes, then some hole has more than one pigeon."

  • Or, "If there are more holes than pigeons, then some pigeon has more than one hole in it."

Matrix-tree theorem (Borchardt 1860)
"The number of spanning trees in a graph is equal to any cofactor of its Laplacian matrix."

Cayley–Hamilton theorem (Frobenius 1878)

Runge–Kutta method (Runge 1895, Kutta 1901)

Fredholm alternative (1903)

Perron–Frobenius theorem (Perron 1907/Frobenius 1912)

Brouwer's fixed point theorem (Bohl 1904/Brouwer 1909, Hadamard/Brouwer 1910)

Banach fixed point theorem (1922)

Banach–Tarski paradox (1924)

Radon–Nikodym theorem (Radon 1913, Nikodym 1930)

Classification of finitely generated modules over a PID

Pontryagin duality (1934)

Measurable Riemann mapping theorem (Morrey 1938)

Sard's theorem (Morse 1939, Sard 1942)
"The set of critical values of a smooth function between manifolds has Lebesgue measure zero."

Polya enumeration theorem (Redfield 1927/Polya 1937)

Nakayama lemma (Krull 1951, Azumaya 1951/Nakayama 1951)

Selberg trace formula (1956)

Max-flow min-cut theorem (Ford–Fulkerson 1956)

Hoeffding's inequality (1963)

Baker's theorem (1966)

Carleson's theorem (1966)

CHSH inequality (1969)

Hilbert's tenth problem (Matiyasevich 1970)
"Every computably enumerable set is Diophantine."

No-cloning theorem (Park 1970/Wootters–Zurek 1982/Dieks 1982)
"You cannot make a copy an arbitrary unknown quantum state."

Lovász local lemma (Lovász–Erdős 1973, Lovász 1977)


Final remarks

The maximum possible score is 99; how many did you get? What are some results that I missed which you think deserved to be on this list (and why)?

On the idea of a "top 100" list for mathematics: How many of the mathematicians on this list do you think understood the importance of their own results? What makes a mathematical result important/significant? Or, who makes a result important? If you were aiming for a Fields Medal, could you try to tailor your next theorem to be viewed as important by the people who matter? (Cf. Cédric Villani, Birth of a Theorem.) How much of the same reasoning applies if you were a graduate student trying to get your research published?

Comments

Popular posts from this blog

SMO Open 2024 ??% Speedrun

Musings about the IMO