Extremal rays in families of inequalities (I)
(This is Yan Sheng.)
What does it mean to "properly understand" some mathematical result? For me, I find it most satisfying when I can answer the following two questions:
- What is the minimal set of special cases that I need to verify to prove it?
- How could I have come up with it myself?
In this and the next blog post, I'll describe two different situations in olympiad inequalities that I've tried to understand better recently, by answering the two questions above.
Theorem (Popoviciu 1965): Let $f:I\to\mathbb R$ be a convex function. Then for any $x,y,z\in I$, we have
$$f(x)+f(y)+f(z)+3f\left(\frac{x+y+z}3\right)\geq 2f\left(\frac{x+y}2\right)+2f\left(\frac{y+z}2\right)+2f\left(\frac{z+x}2\right).$$
What an interesting statement! It's not immediately clear how to prove it with Jensen's inequality, and it makes me wonder what other similar inequalities hold for convex functions. Let's try proving it by doing the least amount of work possible: is there a finite set of special cases we can check in order to verify the inequality? (The interested reader is invited to pause here to think about this.)
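(Before we go looking for special cases, here's a quick numerical sanity check in Python; the function name and the choice of test functions are my own, and random sampling is of course no substitute for a proof.)

```python
# A quick numerical sanity check of Popoviciu's inequality (not a proof!):
# sample random x, y, z and a few convex functions f, and verify that
#   f(x) + f(y) + f(z) + 3 f((x+y+z)/3)
#       >= 2 f((x+y)/2) + 2 f((y+z)/2) + 2 f((z+x)/2).
import math
import random

def popoviciu_gap(f, x, y, z):
    """Left-hand side minus right-hand side; should be >= 0 for convex f."""
    lhs = f(x) + f(y) + f(z) + 3 * f((x + y + z) / 3)
    rhs = 2 * (f((x + y) / 2) + f((y + z) / 2) + f((z + x) / 2))
    return lhs - rhs

random.seed(0)
for f in (math.exp, abs, lambda u: u * u):  # all convex on the reals
    for _ in range(10_000):
        x, y, z = (random.uniform(-5, 5) for _ in range(3))
        assert popoviciu_gap(f, x, y, z) >= -1e-9  # tolerance for float error
print("no violations found")
```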
The key idea here is that instead of varying $x,y,z$, what we should be doing is to fix $x,y,z$, and to vary $f$ instead. In general, given $a_1,\ldots,a_n,\lambda_1,\ldots,\lambda_n\in\mathbb R$, we want a minimal set of conditions which would verify that the inequality
$$\sum_{i=1}^n\lambda_i f(a_i)\geq0\qquad(*)$$
holds for every convex function $f$.
Geometrically, the family $\mathcal C$ of all convex functions is an example of (what is confusingly called) a convex cone, because any nonnegative linear combination of convex functions is also convex; in other words, if $f,g$ are convex, then for any $\lambda,\mu\geq0$ we have that $\lambda f+\mu g$ is also convex. Hence if there is some "basis set" $\mathcal B\subseteq\mathcal C$ whose nonnegative linear combinations generate $\mathcal C$, then it suffices to check $(*)$ for $f\in\mathcal B$.
This suggests that we should be looking at the convex functions $f$ which aren't nonnegative linear combinations of other convex functions (except trivially, in terms of multiples of $f$ itself). Geometrically, for any nonzero $f\in\mathcal C$, we write $R_f$ for the ray $\{\lambda f:\lambda\geq0\}$; a ray $R\subseteq\mathcal C$ is called an extremal ray if any line segment in $\mathcal C$ whose interior intersects $R$ actually lies entirely inside $R$.
Now we want to take $\mathcal B$ to be one representative from each extremal ray of $\mathcal C$. What are they?
One property that we know about convex functions is that (essentially) $f''\geq0$; notice that if $f''(x_0)=0$ at some point $x_0$, and $f$ is a nonnegative linear combination of convex functions $g_1,\ldots,g_k$, then $g_i''(x_0)=0$ for all $i$. Hence the extremal convex functions should have $f''=0$ at as many points as possible. If $f''$ is identically 0 then $f$ is an affine linear function; otherwise if $f''$ is nonzero at a single point $t$ then $f$ is piecewise linear in two pieces, for example
$$f(x)=|x-t|.$$
It turns out that these indeed give us enough extremal functions to form $\mathcal B$:
Proposition: If $f:[a,b]\to\mathbb R$ is a convex function, then
$$f(x)=\frac12\int_a^b f''(t)\,|x-t|\,dt+(\text{affine linear in }x).$$
Proof sketch: Write the integral as
$$\int_a^b f''(t)\,\frac{|x-t|}2\,dt=\int_a^x f''(t)\,\frac{x-t}2\,dt+\int_x^b f''(t)\,\frac{t-x}2\,dt,$$
and integrate each piece by parts twice. Hence every convex function is the "nonnegative linear combination" of functions of the form $|x-t|$ and an affine linear function. (The quotation marks are there because we're summing infinitely many of them together, as an integral.)
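(Here's a quick numerical check of the proposition in Python, for $f=e^x$ on $[-1,2]$; the explicit affine part in the comments is what falls out of the integration by parts, and the names are my own.)

```python
# Numerical check of the integral representation above, for f = exp on
# [a, b] = [-1, 2]: integrating by parts twice identifies the affine part as
#   A(x) = (f(a) + f(b))/2 + f'(a)(x - a)/2 - f'(b)(b - x)/2,
# so f(x) should equal (1/2) * int_a^b f''(t) |x - t| dt + A(x).
import math

def reconstruct(f, df, d2f, a, b, x, n=200_000):
    """Midpoint-rule approximation of (1/2)*int f''(t)|x-t| dt, plus A(x)."""
    h = (b - a) / n
    integral = sum(d2f(a + (i + 0.5) * h) * abs(x - (a + (i + 0.5) * h))
                   for i in range(n)) * h
    affine = (f(a) + f(b)) / 2 + df(a) * (x - a) / 2 - df(b) * (b - x) / 2
    return integral / 2 + affine

a, b = -1.0, 2.0
for x in (-0.7, 0.0, 1.3):
    print(x, reconstruct(math.exp, math.exp, math.exp, a, b, x), math.exp(x))
    # the last two printed values should agree to many decimal places
```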
Back to the inequality $(*)$, we notice that affine linear functions are both convex and concave, and that every affine linear function can be written in terms of $1$ and $x$. Hence we have the minimal set of conditions as follows:
Theorem: Let $a_1,\ldots,a_n,\lambda_1,\ldots,\lambda_n\in\mathbb R$. Then the inequality
$$\sum_{i=1}^n\lambda_i f(a_i)\geq0$$
holds for every convex function $f$ if and only if:
- $\sum_{i=1}^n\lambda_i=\sum_{i=1}^n\lambda_i a_i=0$; and
- $\sum_{i=1}^n\lambda_i\,|a_i-t|\geq0$ for all $t\in\mathbb R$.
Proof sketch: For any convex function $f$, we have
$$f(x)=\frac12\int_a^b f''(t)\,|x-t|\,dt+\alpha x+\beta$$
for some $\alpha,\beta\in\mathbb R$. Now substitute into $\sum_{i=1}^n\lambda_i f(a_i)$, and note that every term is nonnegative by conditions 1 and 2.
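(Here's a small Python checker for the two conditions. It uses the observation, which is the first exercise below, that $F(t)=\sum_i\lambda_i|a_i-t|$ is piecewise linear with kinks only at the $a_i$, and vanishes for large $|t|$ once condition 1 holds, so it suffices to test $F$ at the kinks. The names are my own.)

```python
# A small checker for the two conditions of the theorem, given the
# points a_i and weights lambda_i.  Since F(t) = sum_i lambda_i |a_i - t|
# is piecewise linear with kinks only at the a_i, and is 0 outside
# [min a_i, max a_i] once condition 1 holds, testing F at the kinks
# suffices.
def verify(a, lam, tol=1e-9):
    assert len(a) == len(lam)
    cond1 = (abs(sum(lam)) < tol
             and abs(sum(l * x for l, x in zip(lam, a))) < tol)
    F = lambda t: sum(l * abs(x - t) for l, x in zip(lam, a))
    cond2 = all(F(t) >= -tol for t in a)
    return cond1 and cond2

# Popoviciu's inequality for (x, y, z) = (0, 1, 5):
x, y, z = 0.0, 1.0, 5.0
a = [x, y, z, (x + y + z) / 3, (x + y) / 2, (y + z) / 2, (z + x) / 2]
lam = [1, 1, 1, 3, -2, -2, -2]
print(verify(a, lam))  # expect True
```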
The astute reader will realise that I've told many lies and half-truths in the previous few paragraphs. Here's a list:
- In fact, $|x-t|$ is not actually an extremal convex function, since you can always add an affine linear function to it; we should really have been working modulo addition by affine linear functions.
- The fact that convex cones are generated by their extremal rays is false in general; see this math.SE answer. But it worked out in this case, so I guess all's well that ends well...?
- $f''$ doesn't really exist everywhere, but if you say the magic words "distributional derivative" and wave the magic wand of integration by parts twice (i.e., $\int f''(t)\,g(t)\,dt=\int f(t)\,g''(t)\,dt$ for nice enough $g$) then you're good to go.
Back to Popoviciu's inequality: here we take
$$(a_i)=\left(x,y,z,\frac{x+y+z}3,\frac{x+y}2,\frac{y+z}2,\frac{z+x}2\right),\qquad(\lambda_i)=(1,1,1,3,-2,-2,-2).$$
The first condition of our theorem is easy to check. The second condition, when we absorb $t$ into the variables (say $a=x-t$, $b=y-t$, $c=z-t$), states that
$$|a|+|b|+|c|+|a+b+c|\geq|a+b|+|b+c|+|c+a|,$$
which is known as Hlawka's inequality. (Here are some proofs from Cut The Knot.)
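(One more numerical spot check in Python, this time of Hlawka's inequality; again a sanity check, not a substitute for the proofs linked above.)

```python
# Numerical spot check of Hlawka's inequality
#   |a| + |b| + |c| + |a+b+c| >= |a+b| + |b+c| + |c+a|
# over random triples; the minimum gap found should be nonnegative.
import random

def hlawka_gap(a, b, c):
    return (abs(a) + abs(b) + abs(c) + abs(a + b + c)
            - abs(a + b) - abs(b + c) - abs(c + a))

random.seed(1)
print(min(hlawka_gap(*(random.uniform(-10, 10) for _ in range(3)))
          for _ in range(100_000)))
```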
So there is a very real sense in which Popoviciu's inequality holds because of Hlawka's inequality, which is just a special case of Popoviciu (take $f(x)=|x|$). More generally, our theorem says roughly that every inequality about absolute values of linear functions in real variables implies an inequality about convex functions.
Exercises:
- Prove that in the second condition in the theorem, it suffices to take $t\in\{a_1,\ldots,a_n\}$. (This fulfils the promise of "finitely many checks" at the start of this post, in a sense.)
- Let's prove Hlawka's inequality by checking finitely many cases. We may assume that $a\leq b\leq c$; putting the variable $t$ back into the inequality, consider the function
$$g(t)=|a-t|+|b-t|+|c-t|+|a+b+c-3t|-|a+b-2t|-|b+c-2t|-|c+a-2t|.$$
Which values of $t$ do we need to check to ensure that $g(t)\geq0$ for all $t$ (and in particular at $t=0$)? Hint: $g$ is piecewise linear.
- Using Hlawka's inequality, prove the weighted version of Popoviciu's inequality: if $p,q,r>0$ and $p+q+r=1$, then for all convex $f$ we have
$$pf(x)+qf(y)+rf(z)+f(px+qy+rz)\geq(p+q)f\left(\frac{px+qy}{p+q}\right)+(q+r)f\left(\frac{qy+rz}{q+r}\right)+(r+p)f\left(\frac{rz+px}{r+p}\right).$$
- What does the triangle inequality $|a|+|b|\geq|a+b|$ translate to for convex functions? (Also derive the weighted version, like in the previous exercise.)
- You might expect that Hlawka's inequality generalises to more variables, where the terms are like inclusion-exclusion:
$$\sum_{\emptyset\neq S\subseteq\{1,\ldots,n\}}(-1)^{|S|+1}\left|\sum_{i\in S}a_i\right|\geq0.$$
Unfortunately, this doesn't work; find a counterexample for $n=4$. (See the random-search sketch after this list.)
- Generalise Hlawka's inequality to $n$ variables (in a different way from the previous exercise, which doesn't work), and deduce the corresponding inequality for convex functions. Check against Popoviciu's original paper.
- Given $a_1,\ldots,a_n,\lambda_1,\ldots,\lambda_n\in\mathbb R$, derive a minimal set of conditions which would verify that the inequality
$$\sum_{i=1}^n\lambda_i f(a_i)\geq0$$
holds for every increasing function $f$. Hence prove Schur's inequality: for all $x,y,z\geq0$ and $t>0$, we have
$$x^t(x-y)(x-z)+y^t(y-z)(y-x)+z^t(z-x)(z-y)\geq0.$$
- In the spirit of proving inequalities from special cases: Suppose that $f_1,\ldots,f_m$ and $g_1,\ldots,g_k$ are linear polynomials in $u_1,\ldots,u_r$ with constant coefficient 0, such that for all $u_1,\ldots,u_r\in\mathbb R$ we have
$$|f_1|+\cdots+|f_m|\geq|g_1|+\cdots+|g_k|.$$
Prove that the same inequality holds for all $u_1,\ldots,u_r\in\mathbb R^d$, with $|\cdot|$ the Euclidean norm. (For example, Hlawka's inequality holds in any finite dimensional inner product space.)
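For the inclusion-exclusion exercise above, here's the promised random-search sketch in Python; the search strategy and all names are my own, and it deliberately stops short of printing a hand-picked answer, so it won't spoil the exercise unless you run it.

```python
# A random-search sketch for the inclusion-exclusion exercise above:
# it hunts for a_1, ..., a_4 violating
#   sum over nonempty S of (-1)^(|S|+1) * |sum_{i in S} a_i|  >=  0.
import random
from itertools import combinations

def incl_excl_sum(a):
    """The alternating sum over all nonempty subsets of indices."""
    return sum((-1) ** (k + 1) * abs(sum(a[i] for i in S))
               for k in range(1, len(a) + 1)
               for S in combinations(range(len(a)), k))

random.seed(2)
for _ in range(100_000):
    a = [random.uniform(-3, 3) for _ in range(4)]
    if incl_excl_sum(a) < -1e-9:
        print("counterexample:", [round(v, 3) for v in a])
        break
else:
    print("no counterexample in this sample")
```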