Extremal rays in families of inequalities (II)

(This is Yan Sheng.)

Last time we studied a family of inequalities involving absolute values on linear polynomials, and asked the question of how lazy we can be to prove inequalities of that form. This week, we will be applying the same approach to a different family: 3-variable homogeneous symmetric inequalities of low degree.

A polynomial P(x,y,z)P(x,y,z) in 3 variables is called homogeneous of degree dd if each of its terms is of degree dd, and symmetric if P(x,y,z)=P(x,y,z)P(x,y,z)=P(x',y',z') for all permutations x,y,zx',y',z' of x,y,zx,y,z. Inequalities involving homogeneous symmetric polynomials include the AM-GM inequality(x+y+z3)3xyz,\left(\frac{x+y+z}3\right)^3\ge xyz,and the Schur inequalityxr(xy)(xz)+yr(yx)(yz)+zr(zx)(zy)0x^r(x-y)(x-z)+y^r(y-x)(y-z)+z^r(z-x)(z-y)\ge0for integers r0r\ge0.

For the rest of this post, write Pd+\mathcal P^+_d for the family of all homogeneous symmetric polynomials P(x,y,z)P(x,y,z) of degree dd such that P(x,y,z)0P(x,y,z)\ge0 holds for all x,y,z0x,y,z\ge0. Our Main Problem is to describe Pd+\mathcal P^+_d.

Exercise: Determine Pd+\mathcal P^+_d for d2d\le2. (The most interesting inequality here is x2+y2+z2xy+yz+zxx^2+y^2+z^2\ge xy+yz+zx.)

Warmup: degree 3

From here on, we will use the symmetric sum notation[a,b,c]=16(xaybzc+),[a,b,c]=\frac16(x^ay^bz^c+\cdots),where there are 6 terms in the sum, one for each permutation of x,y,zx,y,z. For instance, we have[3,0,0]=13(x3+y3+z3),[2,1,0]=16(x2y+y2z+z2x+xy2+yz2+zx2),[1,1,1]=xyz.\begin{align*}[3,0,0]&=\frac13(x^3+y^3+z^3),\\ [2,1,0]&=\frac16(x^2y+y^2z+z^2x+xy^2+yz^2+zx^2),\\ [1,1,1]&=xyz.\end{align*}By AM-GM (or Muirhead, iykyk), we have [3,0,0][2,1,0][1,1,1][3,0,0]\ge[2,1,0]\ge[1,1,1].

Let's start by considering our Main Problem in degree 3. It should be clear that every homogeneous symmetric polynomial of degree 3 can be written asP(x,y,z)=K[3,0,0]+A[2,1,0]+B[1,1,1],P(x,y,z)=K[3,0,0]+A[2,1,0]+B[1,1,1],for some constants K,A,BRK,A,B\in\mathbb R, so the Main Problem becomes: "for which triples (K,A,B)(K,A,B) do we have P0P\ge0 for all x,y,z0x,y,z\ge0?"

It seems sensible as a first step to plug in specific points (x,y,z)(x,y,z) to get some conditions that K,A,BK,A,B have to satisfy: (x,y,z)3[3,0,0]3[2,1,0]3[1,1,1](1,0,0)100(1,1,0)210(1,1,1)333\begin{matrix} (x,y,z)&3[3,0,0]&3[2,1,0]&3[1,1,1]\\\hline (1,0,0)&1&0&0\\ (1,1,0)&2&1&0\\ (1,1,1)&3&3&3 \end{matrix} Hence if P0P\ge0 at these points, then we must haveK0,2K+A0,K+A+B0.\begin{align*}K&\ge0,\\2K+A&\ge0,\\K+A+B&\ge0.\end{align*}Geometrically, this is a region in R3\mathbb R^3 cut out by 3 planes; the pairwise intersections of the planes give us 3 vectors in the directions of the edges, whose nonnegative linear combinations generate the whole region. Hence we have(KAB)=c1(121)+c2(011)+c3(001)\begin{pmatrix}K\\A\\B\end{pmatrix}=c_1\begin{pmatrix}1\\-2\\1\end{pmatrix}+c_2\begin{pmatrix}0\\1\\-1\end{pmatrix}+c_3\begin{pmatrix}0\\0\\1\end{pmatrix}for some c1,c2,c30c_1,c_2,c_3\ge0.

But now note that[3,0,0]2[2,1,0]+[1,1,1]0,(Schur)[2,1,0][1,1,1]0,(AM-GM)[1,1,1]0,(trivial)\begin{align*}[3,0,0]-2[2,1,0]+[1,1,1]&\ge0,&&(\text{Schur})\\ [2,1,0]-[1,1,1]&\ge0,&&(\text{AM-GM})\\ [1,1,1]&\ge0,&&(\text{trivial})\end{align*}so every (K,A,B)(K,A,B) in this region does in fact yield a nonnegative PP! Hence we have proved:

Theorem: Let P(x,y,z)P(x,y,z) be a homogeneous symmetric polynomial of degree 3. Then the following are equivalent:

  1. P(x,y,z)0P(x,y,z)\ge0 holds for all x,y,z0x,y,z\ge0 (i.e., PP3+P\in\mathcal P^+_3);
  2. P(x,y,z)0P(x,y,z)\ge0 holds at (x,y,z)=(1,0,0),(1,1,0),(1,1,1)(x,y,z)=(1,0,0),(1,1,0),(1,1,1); and
  3. PP is a nonnegative linear combination of the three polynomials:
    • [3,0,0]2[2,1,0]+[1,1,1][3,0,0]-2[2,1,0]+[1,1,1];
    • [2,1,0][1,1,1][2,1,0]-[1,1,1]; and
    • [1,1,1][1,1,1]. \square

The slogan here is "every 3-variable homogeneous symmetric inequality of degree 3 can be proved by checking 3 points/can be proved with AM-GM and Schur". I first noticed and posted about this fact in 2011, but it has been known since at least Stolarsky (1971) and van Albada (1971), in the context of triangle inequalities.

I'd say that this theorem a pretty satisfactory, complete solution of our Main Problem in degree 3. In fact we see that there are actually two different aspects of the Main Problem:

  1. The Decision Problem: Find a minimal set of points XdX_d such that P(x,y,z)0P(x,y,z)\ge0 for all (x,y,z)Xd(x,y,z)\in X_d implies that P(x,y,z)0P(x,y,z)\ge0 for all x,y,z0x,y,z\ge0.
  2. The Generation Problem: Find a minimal set Sd\mathcal S_d of homogeneous symmetric polynomials of degree dd, whose nonnegative linear combinations form exactly Pd+\mathcal P^+_d.

By the discussion above and the preceding exercise, we may take for exampleS1={x+y+z},S2={x2+y2+z2xyyzzx,={xy+yz+zx},S3={[3,0,0]2[2,1,0]+[1,1,1],={[2,1,0][1,1,1],={[1,1,1]},\begin{align*}\mathcal S_1&=\{x+y+z\},\\\mathcal S_2&=\{x^2+y^2+z^2-xy-yz-zx,\\&\phantom{{}=\{}xy+yz+zx\},\\\mathcal S_3&=\{[3,0,0]-2[2,1,0]+[1,1,1],\\&\phantom{{}=\{}[2,1,0]-[1,1,1],\\&\phantom{{}=\{}[1,1,1]\},\end{align*}andX1={any nonzero point},X2={(1,0,0),(1,1,1)},X3={(1,0,0),(1,1,0),(1,1,1)}.\begin{align*}X_1&=\{\text{any nonzero point}\},\\X_2&=\{(1,0,0),(1,1,1)\},\\X_3&=\{(1,0,0),(1,1,0),(1,1,1)\}.\end{align*}

It turns out that the decision problem is easier, thanks to a powerful theory developed in the late-2000's.

The pqr/uvw method

Recall that every symmetric polynomial P(x,y,z)P(x,y,z) can be written in terms of the elementary symmetric polynomials, given by p=x+y+z,q=xy+yz+zx,r=xyz.\begin{align*}p&=x+y+z,\\q&=xy+yz+zx,\\r&=xyz.\end{align*}Some people like to write p,q,rp,q,r as u,v2,w3u,v^2,w^3 instead, presumably to track degrees, but of course this is purely an aesthetic choice.

Much of the power of rewriting symmetric inequalities in terms of p,q,rp,q,r lies in the following result, which gives simple conditions on the range of one of these variables in terms of the other two:

"Tejs's Theorem": Suppose that x,y,z0x,y,z\ge0.

  1. (rr-lemma) For fixed p,qp,q, the value of rr attains its maximum only when two of x,y,zx,y,z are equal, and its minimum only when two of x,y,zx,y,z are equal or one of x,y,zx,y,z is 0.
  2. (qq-lemma) For fixed p,rp,r, the value of qq attains its maximum or minimum only when two of x,y,zx,y,z are equal.
  3. (pp-lemma) For fixed q,rq,r, the value of pp attains its maximum or minimum only when two of x,y,zx,y,z are equal.

Proof sketch: Consider the cubic polynomial f(t)=t3pt2+qtrf(t)=t^3-pt^2+qt-r, which has three nonnegative real roots. If we fix p,qp,q, the question of "what is the maximum possible value of rr?" is equivalent to "how far can we push the graph of ff down such that it still has 3 nonnegative real roots?"; the answer is clearly "until we reach the local maximum of ff", at which point two of the roots become equal:

On the other hand, "what is the minimum possible value of rr?" is equivalent to "how far can we push the graph of ff up such that it still has 3 nonnegative real roots?", and the answer is "until either (a) we reach the local minimum of ff (at which point two of the roots become equal), or (b) one of the roots becomes 0":

This proves the rr-lemma.

For the qq-lemma and the pp-lemma, if r=0r=0 then one of x,y,zx,y,z is 0, and it's easy to handle this case separately. Otherwise, we apply the same argument to f(t)=t3pt2+qtrtf(t)=\dfrac{t^3-pt^2+qt-r}t and f(t)=t3pt2+qtrt2f(t)=\dfrac{t^3-pt^2+qt-r}{t^2}, respectively: now we have r>0r>0 implies f(0)=f(0)=-\infty, so case (b) doesn't happen, and we are done. \square

Exercise: Make the above argument rigorous. Hint: if a cubic polynomial ff has 3 real roots, then its derivative has two real roots αβ\alpha\le\beta by Rolle's theorem; hence ff is increasing on (,α)(-\infty,\alpha), decreasing on (α,β)(\alpha,\beta), and increasing on (β,)(\beta,\infty). Now consider f1f^{-1} on each of these intervals.

For sufficiently low degree, this restricts where homogeneous symmetric polynomials can attain their extrema:

Corollary: Let P(x,y,z)P(x,y,z) be a homogeneous symmetric polynomial of degree 5\le5. Then P(x,y,z)0P(x,y,z)\ge0 holds for all x,y,z0x,y,z\ge0 if and only if it holds at (x,y,z)=(t,1,1)(x,y,z)=(t,1,1) and (t,1,0)(t,1,0) for all t0t\ge0.

Proof: By the rr-lemma, for fixed values of p,qp,q the extrema of rr is attained at points of the form (x,y,y)(x,y,y) or (x,y,0)(x,y,0). But by the degree condition, we see that PP written as a polynomial in p,q,rp,q,r must be linear in rr, so the same is true for the extrema of PP; hence to check that PP is nonnegative it suffices to check points of those two forms. By homogeneity we can divide these points by yy, so it suffices to check points of the form (t,1,1)(t,1,1) or (t,1,0)(t,1,0), as desired. \square

This corollary solves the decision problem for homogeneous symmetric inequalities in 3 nonnegative real variables of degree up to 5: any such inequality can be proven by checking two inequalities in a single variable t0t\ge0, which is easy with calculus. (This is why you don't see such problems in olympiads any more.)

Aside: History of pqr/uvw

The uvw method has probably been independently rediscovered multiple times.

Convex cones in finite dimensions

We now turn to the generation problem, which asks for a minimal set of polynomials in Pd+\mathcal P^+_d which generates all the other ones by nonnegative linear combinations. First, we recall some helpful results from convex analysis.

A closed set CRnC\subseteq\mathbb R^n is called a convex cone if any nonnegative linear combination of elements of CC is also in CC; in other words, for all viCv_i\in C and ai0a_i\ge0 we have iaiviC\sum_ia_iv_i\in C. Examples in R3\mathbb R^3 include the circular cone {x2+y2z2}\{x^2+y^2\le z^2\} and the first octant {x,y,z0}\{x,y,z\ge0\}.

Exercise: Verify that Pd+\mathcal P^+_d is a convex cone inside some finite-dimensional real vector space.

Given a convex cone CC, we're interested in its "outermost" points, namely those that are not nonnegative linear combinations of other points in CC (except trivially in terms of multiples of itself). For any nonzero vRnv\in\mathbb R^n, we write v\langle v\rangle for the ray {λv:λ0}\{\lambda v\,:\,\lambda\ge0\}; a ray vC\langle v\rangle\subseteq C is called an extremal ray if any line segment in CC which intersects v\langle v\rangle actually lies entirely inside v\langle v\rangle. For example, every ray (cosθ,sinθ,1)\langle(\cos\theta,\sin\theta,1)\rangle contained in the boundary of the circular cone is an extremal ray; on the other hand, the first octant only has 3 extremal rays, namely the nonnegative xx-, yy- and zz-axes.

It is almost a true statement that convex cones are the convex hulls of their extremal rays. (The key idea we used in the previous post was morally equivalent to this statement, for the cone of convex functions.)

Exercises:

  1. Verify that the circular cone and the first octant are each equal to the convex hull of their respective extremal rays.
  2. Determine the set of extremal rays of the following sets in R3\mathbb R^3: (a) the half-space {x0}\{x\ge0\}; (b) the quadrant {x,y0}\{x,y\ge0\}. Verify that neither set is the convex hull of their extremal rays.

Actually, we only need one more mild condition for the statement to be true. We only briefly sketch the idea of the proof here; the interested reader can find details in Rockafellar's Convex Analysis text.

Theorem (Rockafellar, Corollary 18.5.2): Let C{0}C\ne\{0\} be a convex cone in Rn\mathbb R^n which contains no lines. Then CC is the convex hull of its extremal rays.

Proof sketch: If dimC1\dim C\le1 then CC is a ray, and the result is obvious, so assume that dimC2\dim C\ge2. Then CC contains no lines implies that CC is not all of a subspace of Rn\mathbb R^n, so its relative boundary DD is nonempty. (Here "relative" means "relative to the minimal affine space containing CC".) Similarly, CC is not a half-space of a subspace, which implies that DD is not convex. Thus there is some line \ell which intersects CC in a compact interval, which implies that every line parallel to \ell intersects CC in a compact (possibly empty) interval. Hence every point in the relative interior of CC is a convex combination of two points in DD, and we finish by induction on dimension. \square

Exercise: Verify that Pd+\mathcal P^+_d contains no lines.

Hence for the generation problem, it suffices to determine the extremal rays of the convex cone Pd+\mathcal P^+_d, and take Sd\mathcal S_d to consist of one nonzero polynomial taken from each extremal ray. Which polynomials lie on the extremal rays of this cone?

Characterising extremal rays of Pd+\mathcal P^+_d

Recall that P\langle P\rangle is an extremal ray of Pd+\mathcal P^+_d if there are no nontrivial line segments in Pd+\mathcal P^+_d which contain PP in its interior (the trivial line segments being the ones contained in P\langle P\rangle). As such, we should really try to understand line segments in Pd+\mathcal P^+_d: given polynomials P0,P1Pd+P_0,P_1\in\mathcal P^+_d which are not multiples of each other, what can we say about the polynomials in the family Pλ=(1λ)P0+λP1P_\lambda=(1-\lambda)P_0+\lambda P_1 for λ(0,1)\lambda\in(0,1)?

Well, the most natural thing to do is to fix a point (x,y,z)(x,y,z) and look at the values Pλ(x,y,z)=(1λ)a+λbP_\lambda(x,y,z)=(1-\lambda)a+\lambda b, where a=P0(x,y,z)a=P_0(x,y,z), b=P1(x,y,z)b=P_1(x,y,z), with a,b0a,b\ge0. There's not much we can say about this value, except if a=b=0a=b=0 then Pλ(x,y,z)=0P_\lambda(x,y,z)=0 for all λ\lambda. In fact, the converse also holds: if (1λ)a+λb=0(1-\lambda)a+\lambda b=0 for some λ(0,1)\lambda\in(0,1), then a=b=0a=b=0. So all polynomials in the interior of a line segment have the same set of roots, which is the intersection of the set of roots of the polynomials at the two endpoints.

Thus if P\langle P\rangle is an extremal ray of Pd+\mathcal P^+_d, what that means is something like "the only polynomials in Pd+\mathcal P^+_d with all the roots of PP are multiples of PP", i.e., the set of roots of PP is maximal among nonzero polynomials in Pd+\mathcal P^+_d.

There is a technical issue with formalising the above intuitive argument: we need to keep track of the multiplicities of the roots, and for polynomials of more than one variable I'd have to say words I don't understand, like "divisors" and "projective". Fortunately, recall that for d5d\le5 we can restrict to polynomials of one variable to determine membership in Pd+\mathcal P^+_d, and those are much easier to deal with.

Lemma: Let P0,P1:[a,b]RP_0,P_1:[a,b]\to\mathbb R be polynomials such that P0,P10P_0,P_1\ge0 on [a,b][a,b], and write Pλ=(1λ)P0+λP1P_\lambda=(1-\lambda)P_0+\lambda P_1. Let Z()Z(\cdot) denote the set of roots of a polynomial on [a,b][a,b], counting multiplicity. Then:

  1. Z(Pλ)=Z(P0)Z(P1)Z(P_\lambda)=Z(P_0)\cap Z(P_1) for all 0<λ<10<\lambda<1; and
  2. If Z(P0)Z(P1)Z(P_0)\subseteq Z(P_1), then there exists ε>0\varepsilon>0 such that Pε0P_{-\varepsilon}\ge0 on [a,b][a,b].

Exercise: Prove the above lemma. Hint: all roots in (a,b)(a,b) must appear with even multiplicity; hence R=gcd(P0,P1)R=\gcd(P_0,P_1) doesn't change sign in [a,b][a,b]. Now we may divide P0P_0 and P1P_1 by RR, and assume that Z(P0)Z(P1)=Z(P_0)\cap Z(P_1)=\emptyset; the rest is easy. \square

For d5d\le5, we know by pqr/uvw that any homogeneous symmetric PP of degree dd lies in Pd+\mathcal P^+_d if and only if the two polynomialsf(t)3P(t,1,1),g(t)3P(t,1,0),\begin{align*}f(t)&\coloneqq3P(t,1,1),\\g(t)&\coloneqq3P(t,1,0),\end{align*}are nonnegative on [0,)[0,\infty). (The factor of 3 is just to get rid of some annoying constants later.) Now we can almost give a characterisation of the extremal rays of Pd+\mathcal P^+_d in terms of the roots of ff and gg by using the previous lemma, but [0,)[0,\infty) is not a compact interval. This isn't a big issue, sincef(t)=0    P(1,t1,t1)=0,g(t)=0    P(t1,1,0)=0.\begin{align*}f(t)=0&\iff P(1,t^{-1},t^{-1})=0,\\g(t)=0&\iff P(t^{-1},1,0)=0.\end{align*}So just for the sake of the next result, defineL={(t,1,1):0t1}{(1,t,t):0t1}{(t,1,0):0t1}.\begin{align*}\mathcal L=\{(t,1,1)\,&:\,0\le t\le1\}\\{}\cup\{(1,t,t)\,&:\,0\le t\le1\}\\{}\cup\{(t,1,0)\,&:\,0\le t\le1\}.\end{align*}Then PPd+P\in\mathcal P^+_d if and only if P0P\ge0 on L\mathcal L. Let ZL(P)Z_{\mathcal L}(P) be the set of roots of PP viewed as a 1-variable polynomial when restricted to each of the three line segments in L\mathcal L, counted with multiplicity.

Proposition: Let d5d\le5. For any PPd+{0}P\in\mathcal P^+_d\setminus\{0\}, we have that P\langle P\rangle is an extremal ray of Pd+\mathcal P^+_d if and only if ZL(P)Z_{\mathcal L}(P) is maximal among polynomials in Pd+{0}\mathcal P^+_d\setminus\{0\}.

Proof: ()(\Leftarrow) If P\langle P\rangle is not an extremal ray, then PP is contained in some line segment [Q,R]Pd+[Q,R]\subseteq\mathcal P^+_d which is not contained in P\langle P\rangle; in particular, Q0Q\ne0. Now since Pd+\mathcal P^+_d contains no lines, this line segment when extended must leave Pd+\mathcal P^+_d; without loss of generality we may assume that this happens at QQ, i.e., there does not exist ε>0\varepsilon>0 such that (1+ε)QεRPd+(1+\varepsilon)Q-\varepsilon R\in\mathcal P^+_d.

By (1) of the previous lemma, we have ZL(P)ZL(Q)Z_{\mathcal L}(P)\subseteq Z_{\mathcal L}(Q); by (2) of the lemma, we have ZL(P)ZL(Q)Z_{\mathcal L}(P)\ne Z_{\mathcal L}(Q). Hence ZL(P)Z_{\mathcal L}(P) is not maximal in Pd+{0}\mathcal P^+_d\setminus\{0\}.

()(\Rightarrow) If ZL(P)Z_{\mathcal L}(P) is not maximal in Pd+{0}\mathcal P^+_d\setminus\{0\}, then there exists QPd+{0}Q\in\mathcal P^+_d\setminus\{0\} such that ZL(P)ZL(Q)Z_{\mathcal L}(P)\varsubsetneq Z_{\mathcal L}(Q); in particular, QQ is not a multiple of PP. Now by (2) of the lemma, there exists ε>0\varepsilon>0 such that R=(1+ε)PεQ0R=(1+\varepsilon)P-\varepsilon Q\ge0 on [a,b][a,b]. Hence PP lies on the line segment [Q,R][Q,R] which is not contained in P\langle P\rangle, so P\langle P\rangle is not an extremal ray. \square

Finally, let's translate ZL(P)Z_{\mathcal L}(P) back in terms of the roots of ff and gg. For example, since P(1,t,t)=tdP(t1,1,1)=tdf(t1)P(1,t,t)=t^dP(t^{-1},1,1)=t^df(t^{-1}), we have:

  • For 0<ρ<10<\rho<1, the multiplicity of ρ\rho as a root of P(1,t,t)P(1,t,t) is equal to the multiplicity of ρ1\rho^{-1} as a root of ff;
  • For ρ=0\rho=0, the multiplicity of 0 as a root of P(1,t,t)P(1,t,t) is equal to ddegfd-\deg f. In other words, the "degree deficit" of ff can be viewed as "roots of ff at \infty".

Hence we will define Zf=Zf,PZ_f=Z_{f,P} to be the set of roots of ff counted with multiplicity, together with \infty with multiplicity ddegfd-\deg f; define ZgZ_g similarly. Now we have the characterisation of the extremal rays of Pd+\mathcal P^+_d in terms of ff and gg like we wanted:

Theorem: For d5d\le5 and PPd+{0}P\in\mathcal P^+_d\setminus\{0\}, we have P\langle P\rangle is an extremal ray of Pd+\mathcal P^+_d if and only if (Zf,P,Zg,P)(Z_{f,P},Z_{g,P}) is inclusion-maximal among polynomials in Pd+{0}\mathcal P^+_d\setminus\{0\}. \square

In other words, for any QPd+{0}Q\in\mathcal P^+_d\setminus\{0\}, if (Zf,P,Zg,P)(Zf,Q,Zg,Q)(Z_{f,P},Z_{g,P})\subseteq(Z_{f,Q},Z_{g,Q}) then (Zf,P,Zg,P)=(Zf,Q,Zg,Q)(Z_{f,P},Z_{g,P})=(Z_{f,Q},Z_{g,Q}). (Here we're abusing notation and writing (A,B)(C,D)(A,B)\subseteq(C,D) to mean the pair of inclusions ACA\subseteq C and BDB\subseteq D.)

Remark: It's possible a priori that f0f\equiv0 even if P≢0P\not\equiv0; in this case, we should consider every point in [0,][0,\infty] as a root of ff with infinite multiplicity, and we write Zf=🙏Z_f=🙏 in this case; similarly for gg.

Almost maximal subsets of roots

We can think of the above theorem as saying: if we restrict Pd+\mathcal P^+_d by specifying some roots, until we're left with a 1-dimensional subset, then what we're left with is an extremal ray of Pd+\mathcal P^+_d, and the set of roots that we have specified is maximal.

What happens if the set of roots is almost maximal, and we're left with a 2-dimensional subset instead? Geometrically, any convex cone in 2 dimensions which contains no lines must be a sector in the plane, generated by nonnegative linear combinations of the two rays which bound it; so we should expect that these are precisely the two extremal rays of Pd+\mathcal P^+_d in this subset.

As such, we will conclude this section with the following result, which will help us in our quest to do as little work as possible when we go back to actually solving the generation problem.

Theorem: For d5d\le5, let S,TS,T be multisets with elements in [0,][0,\infty], and letΩ={PPd+:(S,T)(Zf,P,Zg,P)}.\Omega=\{P\in\mathcal P^+_d\,:\,(S,T)\subseteq(Z_{f,P},Z_{g,P})\}.

  1. If dimΩ=1\dim\Omega=1, then Ω=P\Omega=\langle P\rangle is an extremal ray of Pd+\mathcal P^+_d.
  2. If dimΩ=2\dim\Omega=2, and P,QΩP,Q\in\Omega are given such that neither of (Zf,P,Zg,P)(Z_{f,P},Z_{g,P}) and (Zf,Q,Zg,Q)(Z_{f,Q},Z_{g,Q}) contains the other, then P\langle P\rangle and Q\langle Q\rangle are exactly the extremal rays of Pd+\mathcal P^+_d which are contained in Ω\Omega.

Proof: (1): Clear from the definition of extremal rays, and the fact that Pd+\mathcal P^+_d contains no lines.

(2): Note that P,QP,Q are not multiples of each other, so every element of Ω\Omega is some linear combination λP+μQ\lambda P+\mu Q. By the given condition, there exists (x,y,z)(x,y,z) which is a root of PP but not of QQ, so (λP+μQ)(x,y,z)0(\lambda P+\mu Q)(x,y,z)\ge0 implies that μ0\mu\ge0. Similarly, we have λ0\lambda\ge0. HenceΩ={λP+μQ:λ,μ0}.\Omega=\{\lambda P+\mu Q\,:\,\lambda,\mu\ge0\}.Now any element of Ω\Omega not in either P\langle P\rangle or Q\langle Q\rangle must have λ,μ>0\lambda,\mu>0, so that(Zf,λP+μQ,Zg,λP+μQ)=(Zf,P,Zg,P)(Zf,Q,Zg,Q),(Z_{f,\lambda P+\mu Q},Z_{g,\lambda P+\mu Q})=(Z_{f,P},Z_{g,P})\cap(Z_{f,Q},Z_{g,Q}),which is strictly smaller than either (Zf,P,Zg,P)(Z_{f,P},Z_{g,P}) or (Zf,Q,Zg,Q)(Z_{f,Q},Z_{g,Q}). Hence both PP and QQ have maximal zero sets, so P\langle P\rangle and Q\langle Q\rangle are both extremal rays of Pd+\mathcal P^+_d. \square

Determining extremal rays of Pd+\mathcal P^+_d

To summarise: Pd+\mathcal P^+_d is generated by nonnegative linear combinations of its extremal rays, which are in turn the polynomials with maximal zero sets (Zf,Zg)(Z_f,Z_g). The game plan now is:

  1. Specify enough roots until the space of possible polynomials has dimension 2\le2;
  2. Construct examples in that space with maximal zero sets;
  3. Repeat until we exhaust all possibilities.

First, let's check that we get the same answer as before in the degree 3 case.

Degree 3

Recall that that every homogeneous symmetric polynomial of degree 3 can be written asP(x,y,z)=K[3,0,0]+A[2,1,0]+B[1,1,1].P(x,y,z)=K[3,0,0]+A[2,1,0]+B[1,1,1].Now we compute f(t)=3P(t,1,1)f(t)=3P(t,1,1) and g(t)=3P(t,1,0)g(t)=3P(t,1,0):f(t)=K(t3+2)+A(t2+t+1)+B(3t),g(t)=K(t3+1)+At2+t2.\begin{align*}f(t)&=K(t^3+2)+A(t^2+t+1)+B(3t),\\g(t)&=K(t^3+1)+A\frac{t^2+t}2.\end{align*} Exercise:

  • Show that f0    K=A=B=0f\equiv0\iff K=A=B=0 (i.e., P0P\equiv0). (Hence ff completely determines K,A,BK,A,B, and therefore PP.)
  • Show that g0    K=A=0g\equiv0\iff K=A=0 (i.e., P=[1,1,1]\langle P\rangle=\langle[1,1,1]\rangle).

To prove facts about ZfZ_f and ZgZ_g, we should first think about characterising what cubic polynomials can appear as ff or gg. Characterising gg is easier: we haveg(t)=P(t,1,0)=tdP(1,t1,0)=tdg(t1),g(t)=P(t,1,0)=t^dP(1,t^{-1},0)=t^dg(t^{-1}),so gg is a reciprocal polynomial.

To characterise ff, first notice that ff can't be an arbitrary cubic: the space of cubics is 4-dimensional, while the space of possible ff's is 3-dimensional. Hence we should find a constraint that ff must satisfy; after staring for a bit (for example by plotting the graphs of t3+2t^3+2, t2+t+1t^2+t+1 and 3t3t) we come up with the following.

Exercise: Show that f(1)=f(1)f'(1)=f(1). (We could verify this explicitly on t3+2t^3+2, t2+t+1t^2+t+1 and 3t3t, or observe that this is a consequence of Euler's theorem on homogeneous functions.)

Hence we can collect some basic properties of ZfZ_f and ZgZ_g:

Proposition:

  1. For 0<ρ<0<\rho<\infty, we have that ρ\rho has even multiplicity in ZfZ_f and ZgZ_g;
  2. 0Zf    {1,1}Zg0\in Z_f\iff\{1,1\}\subseteq Z_g;
  3. Zf    {0,}Zg\infty\in Z_f\iff\{0,\infty\}\subseteq Z_g;
  4. For 0ρ0\le\rho\le\infty, we have that ρ\rho and ρ1\rho^{-1} have the same multiplicity in ZgZ_g;
  5. If f≢0f\not\equiv0, then #Zfd\#Z_f\le d; similarly for gg.

Proof:

  1. This follows from the fact that ff and gg do not change sign at ρ\rho.
  2. Both 0Zf0\in Z_f and 1Zg1\in Z_g are equivalent to P(1,1,0)=0P(1,1,0)=0; now 1 must appear with even multiplicity in ZgZ_g, so 1Zg    {1,1}Zg1\in Z_g\iff\{1,1\}\subseteq Z_g.
  3. Both conditions are equivalent to P(1,0,0)=0P(1,0,0)=0.
  4. This follows from g(t)=tdg(t1)g(t)=t^dg(t^{-1}).
  5. Note that ff has degf\le\deg f roots on [0,)[0,\infty), and ddegfd-\deg f roots at \infty; similarly for gg. \square

Now there are many different ways of finding maximal zero sets with the above facts (I have typed out 4 different arguments while drafting this post), but let's stick to the game plan. Suppose that P\langle P\rangle is an extremal ray of P3+\mathcal P^+_3, with corresponding zero sets (Zf,Zg)(Z_f,Z_g). We split into the following cases, calculating the dimension of the set Ω\Omega of remaining polynomials for each case.

  • Case 1: ρZg\rho\in Z_g, where ρ0,1,\rho\ne0,1,\infty. Note that this implies {ρ,ρ,ρ1,ρ1}Zg\{\rho,\rho,\rho^{-1},\rho^{-1}\}\subseteq Z_g, which contradicts #Zgd=3\#Z_g\le d=3 unless g0g\equiv0. Hence dimΩ=1\dim\Omega=1, and Ω=P=[1,1,1]\Omega=\langle P\rangle=\langle[1,1,1]\rangle is an extremal ray.
  • Case 2: 0Zg    Zg    Zf0\in Z_g\iff\infty\in Z_g\iff\infty\in Z_f. From the formula for gg we get K=0K=0, so dimΩ=2\dim\Omega=2.
  • Case 3: 1Zg    0Zf1\in Z_g\iff0\in Z_f. From the formula for gg we get 2K+A=02K+A=0, so dimΩ=2\dim\Omega=2.
  • Case 4: ρZf\rho\in Z_f, where ρ0,1,\rho\ne0,1,\infty. Now P≢0P\not\equiv0 implies that f≢0f\not\equiv0, so we may assume that ff is monic. Note that ρ\rho must appear with even multiplicity in ZfZ_f, so ff has one of the following forms:f(t)=(tρ)2(tσ),orf(t)=(tρ)2.\begin{align*}f(t)&=(t-\rho)^2(t-\sigma),\quad\text{or}\\f(t)&=(t-\rho)^2.\end{align*}In the second case, f(1)=f(1)f'(1)=f(1) implies ρ=1\rho=1, contradiction. In the first case, we must have σ0\sigma\le0 (since ff changes sign at σ\sigma), and now taking the logarithmic derivative at 1 gives21ρ+11σ=f(1)f(1)=1.\frac2{1-\rho}+\frac1{1-\sigma}=\frac{f'(1)}{f(1)}=1.Hence 21ρ[0,1)\dfrac2{1-\rho}\in[0,1), so ρ<1\rho<-1, contradiction. Hence dimΩ=0\dim\Omega=0, i.e., Ω={0}\Omega=\{0\}.
  • Case 5: 1Zf1\in Z_f. From the formula of ff we get K+A+B=0K+A+B=0, so dimΩ=2\dim\Omega=2.
  • Case 6: None of the above. Then (Zf,Zg)=(,)(Z_f,Z_g)=(\emptyset,\emptyset), which any of our constructions later will show is not maximal.

Now to find all extremal rays of P3+\mathcal P^+_3, it suffices to construct two elements with neither zero set containing the other in each of Cases 2, 3 and 5! (Also notice that the significance of the points (1,1,1)(1,1,1), (1,1,0)(1,1,0) and (1,0,0)(1,0,0) naturally falls out of this case analysis.)

To finish, we list some members of P3+\mathcal P^+_3 that we already know:

  • S3[3,0,0]2[2,1,0]+[1,1,1]S_3\coloneqq[3,0,0]-2[2,1,0]+[1,1,1], by Schur;
  • [3,0,0][2,1,0][3,0,0]-[2,1,0], by AM-GM;
  • [2,1,0][1,1,1][2,1,0]-[1,1,1], by AM-GM;
  • [1,1,1][1,1,1] (trivial).

We compute ZfZ_f and ZgZ_g for each of these:PfgZfZgS3t(t1)2(t1)2(t+1){0,1,1}{1,1}[3,0,0][2,1,0](t1)2(t+1)(t212t+1)(t+1){1,1}[2,1,0][1,1,1](t1)212t(t+1){1,1,}{0,}[1,1,1]3t0{0,,}🙏\begin{matrix} P&f&g&Z_f&Z_g\\\hline S_3&t(t-1)^2&(t-1)^2(t+1)&\{0,1,1\}&\{1,1\}\\ [3,0,0]-[2,1,0]&(t-1)^2(t+1)&(t^2-\frac12t+1)(t+1)&\{1,1\}&\emptyset\\ [2,1,0]-[1,1,1]&(t-1)^2&\frac12t(t+1)&\{1,1,\infty\}&\{0,\infty\}\\ [1,1,1]&3t&0&\{0,\infty,\infty\}&🙏 \end{matrix}Clearly the second example doesn't have maximal zero set, so we can safely remove it from consideration (even if we hadn't noticed before that it's the sum of the first and third rows).

Now the relevant constructions for each case is:

  • Case 2: [2,1,0][1,1,1][2,1,0]-[1,1,1] and [1,1,1][1,1,1];
  • Case 3: S3S_3 and [1,1,1][1,1,1];
  • Case 5: S3S_3 and [2,1,0][1,1,1][2,1,0]-[1,1,1].

Thus our three examples are exactly the extremal rays in P3+\mathcal P^+_3.

Degree 4

We're now ready to march bravely onward to tackle the problem in degree 4. The general homogeneous symmetric polynomial of degree 4 isP(x,y,z)=K[4,0,0]+A[3,1,0]+B[2,2,0]+C[2,1,1],P(x,y,z)=K[4,0,0]+A[3,1,0]+B[2,2,0]+C[2,1,1],where again we have [4,0,0][3,1,0][2,2,0][2,1,1][4,0,0]\ge[3,1,0]\ge[2,2,0]\ge[2,1,1] by AM-GM. Now we evaluate ff and gg:f(t)=K(t4+2)+A(t3+t+1)+B(2t2+1)+C(t2+2t),g(t)=K(t4+1)+At3+t2+Bt2.\begin{align*}f(t)&=K(t^4+2)+A(t^3+t+1)+B(2t^2+1)+C(t^2+2t),\\g(t)&=K(t^4+1)+A\frac{t^3+t}2+Bt^2.\end{align*} Exercise:

  • Show that f0    K=A=B=C=0f\equiv0\iff K=A=B=C=0 (i.e., P0P\equiv0). (Hence ff completely determines K,A,B,CK,A,B,C, and therefore PP.)
  • Prove that f(1)=43f(1)f'(1)=\frac43f(1).
  • Show that g0    K=A=B=0g\equiv0\iff K=A=B=0 (i.e., P=[2,1,1]\langle P\rangle=\langle[2,1,1]\rangle).

As before, suppose that P\langle P\rangle is an extremal ray of P4+\mathcal P^+_4, with corresponding zero sets (Zf,Zg)(Z_f,Z_g). We split into the following cases, where Ω\Omega in each case denotes the set of remaining polynomials. (This time, we start with the cases avoiding the special points (1,1,1)(1,1,1), (1,1,0)(1,1,0) and (1,0,0)(1,0,0).)

  • Case 1: ρZg\rho\in Z_g, where ρ0,1,\rho\ne0,1,\infty. Note that this implies {ρ,ρ,ρ1,ρ1}Zg\{\rho,\rho,\rho^{-1},\rho^{-1}\}\subseteq Z_g, so g(t)g(t) is a multiple of (tρ)2(tρ1)2(t-\rho)^2(t-\rho^{-1})^2. Since the map from PP to gg has 1-dimensional kernel, and the space of possible gg is 1-dimensional, we have dimΩ=1+1=2\dim\Omega=1+1=2.
  • Case 2: ρZf\rho\in Z_f, where ρ0,1,\rho\ne0,1,\infty. We expect that the space of quartic polynomials satisfying the constraints f(1)=43f(1)f'(1)=\frac43f(1) and f(ρ)=f(ρ)=0f(\rho)=f'(\rho)=0 has dimension 53=25-3=2; indeed, writing f(t)=(tρ)2f~(t)f(t)=(t-\rho)^2\tilde f(t), we see that f~\tilde f is a polynomial of degree 2\le2 satisfyingf(1)f(1)=21ρ+f~(1)f~(1),\frac{f'(1)}{f(1)}=\frac2{1-\rho}+\frac{\tilde f'(1)}{\tilde f(1)},i.e., f~(1)=(4321ρ)f~(1)\tilde f'(1)=\left(\frac43-\frac2{1-\rho}\right)\tilde f(1). Clearly the space of such f~\tilde f, and hence the space of ff, has dimension 2; since the map from PP to ff is invertible, we also have dimΩ=2\dim\Omega=2.

Now it turns out that the remaining conditions, namely

  • 0Zg    Zg    Zf0\in Z_g\iff\infty\in Z_g\iff\infty\in Z_f;
  • 1Zg    0Zf1\in Z_g\iff0\in Z_f; and
  • 1Zf1\in Z_f,

each yield only one nontrivial condition on PP, so that dimΩ=3\dim\Omega=3 in each case. To overcome this, we have to impose them together pairwise, or with higher multiplicity:

  • Case 3: {0,1}Zf\{0,1\}\subseteq Z_f. We have f(t)=t(t1)2f~(t)f(t)=t(t-1)^2\tilde f(t) for some f~\tilde f with degree 1\le1; hence dimΩ=2\dim\Omega=2.
  • Case 4: {0,}Zf\{0,\infty\}\subseteq Z_f. We have f(t)=tf~(t)f(t)=t\tilde f(t) for some f~\tilde f with degree 2\le2, andf(1)f(1)=11+f~(1)f~(1)\frac{f'(1)}{f(1)}=\frac11+\frac{\tilde f'(1)}{\tilde f(1)}implies f~(1)=13f~(1)\tilde f'(1)=\frac13\tilde f(1); hence dimΩ=2\dim\Omega=2.
  • Case 5: {1,}Zf\{1,\infty\}\subseteq Z_f. We have f(t)=(t1)2f~(t)f(t)=(t-1)^2\tilde f(t) for some f~\tilde f with degree 1\le1; hence dimΩ=2\dim\Omega=2.
  • Case 6: {0,0}Zf\{0,0\}\subseteq Z_f. We have f(t)=t2f~(t)f(t)=t^2\tilde f(t) for some f~\tilde f with degree 2\le2, andf(1)f(1)=21+f~(1)f~(1)\frac{f'(1)}{f(1)}=\frac21+\frac{\tilde f'(1)}{\tilde f(1)}implies f~(1)=23f~(1)\tilde f'(1)=-\frac23\tilde f(1); hence dimΩ=2\dim\Omega=2.
  • Case 7: {1,1,1,1}Zf\{1,1,1,1\}\subseteq Z_f. Then f(t)f(t) is a multiple of (t1)4(t-1)^4, so dimΩ=1\dim\Omega=1.
  • Case 8: {,}Zf\{\infty,\infty\}\subseteq Z_f. Then ff has degree 2\le2 and satisfies f(1)=43f(1)f'(1)=\frac43f(1), so dimΩ=2\dim\Omega=2.
  • Case 9: None of the above. Then (Zf,Zg)(Z_f,Z_g) is one of ({0},{1})(\{0\},\{1\}), ({1,1},)(\{1,1\},\emptyset), ({},{0,})(\{\infty\},\{0,\infty\}), or (,)(\emptyset,\emptyset), each of which our constructions later will show is not maximal.

Okay, that's a lot of construction to do. We list down some members of P4+\mathcal P^+_4 that we already know; just like in the degree 3 case, one of the AM-GM ones is redundant (and is omitted from this table): PfgZfZgS4t2(t1)2(t1)2(t2+t+1){0,0,1,1}{1,1}[3,1,0][2,2,0]t(t1)212t(t1)2{0,1,1,}{0,1,1,}[2,2,0][2,1,1](t1)2t2{1,1,,}{0,0,,}[2,1,1]t(t+2)0{0,,}🙏\begin{matrix} P&f&g&Z_f&Z_g\\\hline S_4&t^2(t-1)^2&(t-1)^2(t^2+t+1)&\{0,0,1,1\}&\{1,1\}\\ [3,1,0]-[2,2,0]&t(t-1)^2&\frac12t(t-1)^2&\{0,1,1,\infty\}&\{0,1,1,\infty\}\\ [2,2,0]-[2,1,1]&(t-1)^2&t^2&\{1,1,\infty,\infty\}&\{0,0,\infty,\infty\}\\ [2,1,1]&t(t+2)&0&\{0,\infty,\infty\}&🙏 \end{matrix}

Let's keep track of which cases these constructions work for:CaseConditionConstructions1ρZg,ρ0,1,[2,1,1],?2ρZf,ρ0,1,?,?3{0,1}ZfS4,[3,1,0][2,2,0]4{0,}Zf[3,1,0][2,2,0],[2,1,1]5{1,}Zf[3,1,0][2,2,0],[2,2,0][2,1,1]6{0,0}ZfS4,?7{1,1,1,1}Zf?8{,}Zf[2,2,0][2,1,1],[2,1,1]\begin{matrix} \text{Case}&\text{Condition}&\text{Constructions}\\\hline 1&\rho\in Z_g,\,\rho\ne0,1,\infty&[2,1,1],\,?\\ 2&\rho\in Z_f,\,\rho\ne0,1,\infty&?,\,?\\ 3&\{0,1\}\subseteq Z_f&S_4,\,[3,1,0]-[2,2,0]\\ 4&\{0,\infty\}\subseteq Z_f&[3,1,0]-[2,2,0],\,[2,1,1]\\ 5&\{1,\infty\}\subseteq Z_f&[3,1,0]-[2,2,0],\,[2,2,0]-[2,1,1]\\ 6&\{0,0\}\subseteq Z_f&S_4,\,?\\ 7&\{1,1,1,1\}\subseteq Z_f&?\\ 8&\{\infty,\infty\}\subseteq Z_f&[2,2,0]-[2,1,1],\,[2,1,1] \end{matrix}

Looks like we'll need some families of polynomials in P4+\mathcal P^+_4 such that ff and gg can have general roots ρ0,1,\rho\ne0,1,\infty. Are there any other easy constructions for polynomials in P4+\mathcal P^+_4?

One idea is to just square some quadratic polynomial, sayPλ4A(x,y,z)13(x2+y2+z2λ(xy+yz+zx))2.P^{4A}_\lambda(x,y,z)\coloneqq\frac13(x^2+y^2+z^2-\lambda(xy+yz+zx))^2.For this polynomial, we havef(t)=(t2+2λ(2t+1))2,g(t)=(t2+1λt)2.\begin{align*}f(t)&=(t^2+2-\lambda (2t+1))^2,\\g(t)&=(t^2+1-\lambda t)^2.\end{align*}Can ff and gg have any root ρ0,1,\rho\ne0,1,\infty for an appropriate choice of λ\lambda? Sure: just take λ=ρ2+22ρ+1\lambda=\dfrac{\rho^2+2}{2\rho+1} and λ=ρ2+1ρ\lambda=\dfrac{\rho^2+1}\rho, respectively.

Sketching these two curves, we see that Pλ4AP^{4A}_\lambda has the following zero sets for various values of λ\lambda:ZfZgλ=1{1,1,1,1}1<λ<2{ρ,ρ,ρ,ρ}(0<ρ<1<ρ<4)λ=2{0,0,4,4}{1,1,1,1}λ>2{ρ,ρ}(ρ>4){ρ,ρ,ρ1,ρ1}(ρ>1)\begin{matrix} &Z_f&Z_g\\\hline \lambda=1&\{1,1,1,1\}&\emptyset\\ 1<\lambda<2&\begin{gathered}\{\rho,\rho,\rho',\rho'\}\\[-0.5em]{\small(0<\rho'<1<\rho<4)}\end{gathered}&\emptyset\\ \lambda=2&\{0,0,4,4\}&\{1,1,1,1\}\\ \lambda>2&\begin{gathered}\{\rho,\rho\}\\[-0.5em]{\small(\rho>4)}\end{gathered}&\begin{gathered}\{\rho,\rho,\rho^{-1},\rho^{-1}\}\\[-0.5em]{\small(\rho>1)}\end{gathered} \end{matrix}So these constructions are good for many of our incomplete cases:CaseConditionConstructions1ρZg,ρ0,1,[2,1,1],Pλ4A (λ>2)2ρZf,ρ0,1,Pλ4A (λ>1),?6{0,0}ZfS4,P24A7{1,1,1,1}ZfP14A\begin{matrix} \text{Case}&\text{Condition}&\text{Constructions}\\\hline 1&\rho\in Z_g,\,\rho\ne0,1,\infty&[2,1,1],\,P^{4A}_\lambda\ {\small(\lambda>2)}\\ 2&\rho\in Z_f,\,\rho\ne0,1,\infty&P^{4A}_\lambda\ {\small(\lambda>1)},\,?\\ 6&\{0,0\}\subseteq Z_f&S_4,\,P^{4A}_2\\ 7&\{1,1,1,1\}\subseteq Z_f&P^{4A}_1 \end{matrix}

We're now left to construct a polynomial in P4+\mathcal P^+_4 which has roots at the point (ρ,1,1)(\rho,1,1) (and thus also at (1,ρ,1)(1,\rho,1) and (1,1,ρ)(1,1,\rho), by symmetry), and also at some point that isn't a root of the corresponding Pλ4AP^{4A}_\lambda; let's choose (1,1,1)(1,1,1) here. (Of course, the correct thing to do is a dimension count to make sure that there is indeed a nonzero solution.) Again, maybe we want to construct this as the square of a quadratic to ensure positivity; what quadratic polynomial has roots at those 4 points?

This is easy if we take the quadratic as the product of two linear factors: note that the 4 points lie on the union of y=zy=z and y+z=(ρ+1)xy+z=(\rho+1)x; this suggests the polynomial (yz)2(y+z(ρ+1)x)2(y-z)^2(y+z-(\rho+1)x)^2, which is homogeneous of degree 4 but not symmetric. No worries, we can just take the cyclic sum: definePμ4B(x,y,z)13cyc(yz)2(y+zμx)2,P^{4B}_\mu(x,y,z)\coloneqq\frac13\sum_{\text{cyc}}(y-z)^2(y+z-\mu x)^2,which will have the 4 roots as desired. Finally it is easy to check that f(t)=(t1)2(t(μ1))2f(t)=(t-1)^2(t-(\mu-1))^2 in this case, so that Zf={1,1,μ1,μ1}Z_f=\{1,1,\mu-1,\mu-1\}, which completes our list of required constructions:CaseConditionConstructions2ρZf,ρ0,1,Pλ4A (λ>1),Pρ+14B\begin{matrix} \text{Case}&\text{Condition}&\text{Constructions}\\\hline 2&\rho\in Z_f,\,\rho\ne0,1,\infty&P^{4A}_\lambda\ {\small(\lambda>1)},\,P^{4B}_{\rho+1} \end{matrix}

Remark: Note that taking ρ=0,1\rho=0,1 (i.e., μ=1,2\mu=1,2) will overlap with some of our previous constructions; by comparing root sets and leading coefficients, it is easy to check that P14B=S4P^{4B}_1=S_4 and P24B=P14AP^{4B}_2=P^{4A}_1. Also, it's possible to make sense of ρ=\rho=\infty (i.e., μ=\mu=\infty) if we normalise correctly, sayP4Blimμμ2Pμ4B,P^{4B}_\infty\coloneqq\lim_{\mu\to\infty}\mu^{-2}P^{4B}_\mu,and by considering the root set and leading coefficient in this case we see that in fact P4B=[2,2,0][2,1,1]P^{4B}_\infty=[2,2,0]-[2,1,1].

Hence we have our solution to the generation problem for degree 4:

Theorem: The extremal rays of P4+\mathcal P^+_4 are:Pλ4A (1<λ<),Pμ4B (1<μ<),[3,1,0][2,2,0],[2,2,0][2,1,1],[2,1,1].\begin{matrix} &\langle P^{4A}_\lambda\rangle\ {\small(1<\lambda<\infty)},\\ &\langle P^{4B}_\mu\rangle\ {\small(1<\mu<\infty)},\\ &\langle[3,1,0]-[2,2,0]\rangle,\\ &\langle[2,2,0]-[2,1,1]\rangle,\\ &\langle[2,1,1]\rangle. \end{matrix}Any PP4+P\in\mathcal P^+_4 is a nonnegative linear combination of the polynomials listed above. \square

The only place in the literature where I could find this statement is in Ando (2022), Proposition 1.3, where it is deduced as a consequence of the (much harder) classification of extremal rays for nonnegative homogeneous cyclic polynomials of degree 4. (Coincidentally, Tetsuya Ando is the chair of the Problem Selection Committee for IMO 2023.) As far as I know, the proof in this post is original.

Exercises

Because there are so many more things to say that these deserve their own section.

Like, an actual inequality

  1. Solve in 30 seconds: Find the largest value of M>0M>0 such that the inequality(x+y+z)3M(x+yz)(y+zx)(z+xy)(x+y+z)^3\ge M(x+y-z)(y+z-x)(z+x-y)holds for all x,y,z0x,y,z\ge0.

Extremal rays of Pd+\mathcal P^+_d

  1. For each extremal ray P\langle P\rangle of P3+\mathcal P^+_3, express (x+y+z)P(x+y+z)P as a nonnegative linear combination of extremal rays of P4+\mathcal P^+_4.
  2. Consider the following attempted construction of an extremal ray in P4+\mathcal P^+_4: we check that for ρ>4\rho>4, the polynomial f(t)=t(tρ)2(tσ)f(t)=t(t-\rho)^2(t-\sigma) with 21ρ+11σ=13\frac2{1-\rho}+\frac1{1-\sigma}=\frac13satisfies σ<0\sigma<0 (so that f0f\ge0 on [0,)[0,\infty)) and f(1)=43f(1)f'(1)=\frac43f(1). Recovering PP from ff, we must have ({0,ρ,ρ},{1,1})(Zf,Zg)(\{0,\rho,\rho\},\{1,1\})\subseteq(Z_f,Z_g). So this is a third construction for Case 2, where we should have dimΩ=2\dim\Omega=2. What went wrong?
  3. Prove that if PPd+P\in\mathcal P^+_d and g(t)=P(t,1,0)0g(t)=P(t,1,0)\equiv0, then xyzPxyz\mid P.
    (Hint: we can assume that PP is a linear combination of polynomials of the form [k,dk,0][k,d-k,0].)
    Hence show that the following are extremal rays of Pd+\mathcal P^+_d:
    1. [k,k,k]\langle[k,k,k]\rangle, if d=3kd=3k;
    2. [k+1,k,k]\langle[k+1,k,k]\rangle, if d=3k+1d=3k+1;
    3. [k+1,k+1,k]\langle[k+1,k+1,k]\rangle and [k+2,k,k][k+1,k+1,k]\langle[k+2,k,k]-[k+1,k+1,k]\rangle, if d=3k+2d=3k+2.
  4. Let S6S_6 be the degree 6 Schur polynomial,S613x4(xy)(xz)=[6,0,0]2[5,1,0]+[4,1,1].\begin{align*}S_6&\coloneqq\frac13\sum x^4(x-y)(x-z)\\&=[6,0,0]-2[5,1,0]+[4,1,1].\end{align*}Show that S6\langle S_6\rangle is not an extremal ray of P6+\mathcal P^+_6.
    (Hint: we have fS6(t)=S6(t,1,1)=t4(t1)2f_{S_6}(t)=S_6(t,1,1)=t^4(t-1)^2; show that set of polynomials in P6+\mathcal P^+_6 satisfying the constraint {0,0,0,0,1,1}Zf\{0,0,0,0,1,1\}\subseteq Z_f has dimension 2, and find the two rays bounding this region. Note that since we no longer have d5d\le5, we can't immediately conclude that these rays are extremal rays of P6+\mathcal P^+_6.)

Root diagrams

It turns out that we can actually do even less work in degree 4, by drawing the correct diagrams.

  1. Show that the roots of any PP4+{0}P\in\mathcal P^+_4\setminus\{0\} can be represented by a symmetric subset of the following region:

    (Hint: take the normalisation x+y+z=1x+y+z=1; this is basically the same as the soil triangle.)

Note that:

  • The points (t,1,1)(t,1,1) are projected down to the y=zy=z line, which is a median/altitude/angle bisector of the triangle, with t=0,1,t=0,1,\infty at the foot, centre, and vertex, respectively;
  • The points (t,1,0)(t,1,0) are projected down to the z=0z=0 line, which is a side of the triangle, with t=0,1,t=0,1,\infty at the vertex, midpoint, and other vertex, respectively.

  1. Recall that Pλ4A(x,y,z)13(x2+y2+z2λ(xy+yz+zx))2P^{4A}_\lambda(x,y,z)\coloneqq\frac13(x^2+y^2+z^2-\lambda(xy+yz+zx))^2. Without using coordinate geometry, show that the roots of Pλ4AP^{4A}_\lambda are given by the following diagrams.

    (Hint: which conics have the same symmetries as the equilateral triangle?)
    Hence recover the table of the zero sets of Pλ4AP^{4A}_\lambda in each case (reproduced here):ZfZgλ=1{1,1,1,1}1<λ<2{ρ,ρ,ρ,ρ}(0<ρ<1<ρ<4)λ=2{0,0,4,4}{1,1,1,1}λ>2{ρ,ρ}(ρ>4){ρ,ρ,ρ1,ρ1}(ρ>1)\begin{matrix} &Z_f&Z_g\\\hline \lambda=1&\{1,1,1,1\}&\emptyset\\ 1<\lambda<2&\begin{gathered}\{\rho,\rho,\rho',\rho'\}\\[-0.5em]{\small(0<\rho'<1<\rho<4)}\end{gathered}&\emptyset\\ \lambda=2&\{0,0,4,4\}&\{1,1,1,1\}\\ \lambda>2&\begin{gathered}\{\rho,\rho\}\\[-0.5em]{\small(\rho>4)}\end{gathered}&\begin{gathered}\{\rho,\rho,\rho^{-1},\rho^{-1}\}\\[-0.5em]{\small(\rho>1)}\end{gathered} \end{matrix}

  2. Explain how the following diagram gives the construction for Pλ4BP^{4B}_\lambda:

Test sets

The following result is due to Timofte (2003); notice that this exactly recovers our previous results for n=3n=3 and d=3,4,5d=3,4,5.

Theorem (Half-degree principle): Let F(x1,,xn)F(x_1,\ldots,x_n) be a homogeneous symmetric polynomial of degree d2d\ge2 in nn variables. Then F0F\ge0 for all xi0x_i\ge0 if and only if F0F\ge0 on all points (x1,,xn)(x_1,\ldots,x_n) with xi0x_i\ge0 and at most d/2\lfloor d/2\rfloor distinct nonzero coordinates.

Here we present the simplified approach by Reiner (2012).

  1. Let φ(T)\varphi(T) be a degree nn polynomial with nn distinct positive real roots. Show that there exists ε>0\varepsilon>0 such that for all c[ε,εc\in[-\varepsilon,\varepsilon], the polynomial φ(T)+c\varphi(T)+c also has nn distinct nonnegative real roots.
  2. Let φ(T)\varphi(T) be a degree nn polynomial with nn nonnegative real roots, of which there are mm distinct positive roots. Show that there exists ε>0\varepsilon>0 and a polynomial ψ\psi of degree nmn-m such that for all c[ε,εc\in[-\varepsilon,\varepsilon], the polynomial φ(T)+cψ(T)\varphi(T)+c\psi(T) also has nn nonnegative real roots, of which there are >m>m distinct positive roots.
    (Hint: apply the previous result to φ/ψ\varphi/\psi for an appropriately chosen factor ψ\psi of φ\varphi.)
  3. Let ek(x1,,xn)e_k(x_1,\ldots,x_n) denote the elementary symmetric polynomial of degree kk. Recall that FF can be uniquely written as a polynomial in e1,,ene_1,\ldots,e_n; show that this polynomial is linear in the variables eke_k (d/2<kn\lfloor d/2\rfloor<k\le n) jointly.
  4. Show that the locus Ω\Omega of points (x1,,xn)(x_1,\ldots,x_n) with xi0x_i\ge0 determined by fixing the values of e1,,ed/2e_1,\ldots,e_{\lfloor d/2\rfloor} is compact.
  5. Suppose that the point (x1,,xn)(x_1,\ldots,x_n) with xi0x_i\ge0 is chosen in Ω\Omega minimising FF, with the maximal number mm of distinct positive components. By considering the polynomialφ(T)=i(Txi)=k=0n(1)kekTnk\varphi(T)=\prod_i(T-x_i)=\sum_{k=0}^n(-1)^ke_kT^{n-k}and applying (2), show that md/2m\le\lfloor d/2\rfloor. Conclude.

The decision problem

Recall that the decision problem requires us to find a minimal set XdX_d such that P0P\ge0 on XdX_d implies that P0P\ge0 for all x,y,z0x,y,z\ge0. This is not quite the right definition: for example, for d=4,5d=4,5, we can't actually take XdX_d to be L\mathcal L, since by continuity we only need to check points in L\mathcal L with rational coordinates.

Let's fix the definition: setΣ={(x,y,z):xyz0,x+y+z=1}.\Sigma=\{(x,y,z)\,:\,x\ge y\ge z\ge0,\,x+y+z=1\}.We now aim to find a minimal closed set XdΣ\mathcal X_d\subseteq\Sigma such that all homogeneous symmetric polynomials PP of degree dd which are nonnegative on Xd\mathcal X_d must also be nonnegative on all of Σ\Sigma (and hence for all x,y,z0x,y,z\ge0).

  1. Suppose that PPd+{0}P\in\mathcal P^+_d\setminus\{0\} has a unique root (x0,y0,z0)Σ(x_0,y_0,z_0)\in\Sigma. Show that any choice of Xd\mathcal X_d must contain (x0,y0,z0)(x_0,y_0,z_0).
    (Hint: Take Q=[d,0,0]Q=[d,0,0], so that Q>0Q>0 on Σ\Sigma; for any neighbourhood UΣU\subseteq\Sigma of (x0,y0,z0)(x_0,y_0,z_0), show that there exists ε>0\varepsilon>0 such that PεQP-\varepsilon Q is only negative inside of UU.)
  2. For each point (x0,y0,z0)(x_0,y_0,z_0) on the relative boundary of Σ\Sigma, construct PP4+{0}P\in\mathcal P^+_4\setminus\{0\} such that P=0P=0 only on multiples of (x0,y0,z0)(x_0,y_0,z_0). (This shows that X4\mathcal X_4 is essentially L\mathcal L; taking (x+y+z)P(x+y+z)P shows the same for X5\mathcal X_5.)
  3. For each point (x0,y0,z0)(x_0,y_0,z_0) in the relative interior of Σ\Sigma, construct PP6+{0}P\in\mathcal P^+_6\setminus\{0\} such that P=0P=0 only on multiples of (x0,y0,z0)(x_0,y_0,z_0).
    (Hint: consider the polynomial(x+yx0+y0zz0)(x+yy0+z0zx0)(x+yz0+x0zy0).)\left(\frac{x+y}{x_0+y_0}-\frac z{z_0}\right)\left(\frac{x+y}{y_0+z_0}-\frac z{x_0}\right)\left(\frac{x+y}{z_0+x_0}-\frac z{y_0}\right).)This shows that Xd\mathcal X_d has to be all of Σ\Sigma for d6d\ge6.

Degree 5

The approach described in this post can also be applied to classify the extremal rays of P5+\mathcal P^+_5, with enough determination. Here we outline the most important cases.

The general homogeneous symmetric polynomial of degree 5 isP(x,y,z)=K[5,0,0]+A[4,1,0]+B[3,2,0]+C[3,1,1]+D[2,2,1].P(x,y,z)=K[5,0,0]+A[4,1,0]+B[3,2,0]+C[3,1,1]+D[2,2,1].

  1. Prove the usual preliminary results about ff and gg:
    • the map from PP to ff is injective;
    • the map from PP to gg has kernel of dimension 2 (generated by [3,1,1][3,1,1] and [2,2,1][2,2,1]);
    • f(1)=53f(1)f'(1)=\frac53f(1).
  2. For distinct ρ,ρ0,1,\rho,\rho'\ne0,1,\infty, show that the set Ω\Omega of polynomials in P5+\mathcal P^+_5 with {ρ,ρ}Zf\{\rho,\rho'\}\subseteq Z_f has dimensiondimΩ={1,ρ[max(0,52ρρ+2),7ρ5ρ+1],0,otherwise.\dim\Omega=\begin{cases}1,&\rho'\in\left[\max\left(0,\frac{5-2\rho}{\rho+2}\right),\frac{7-\rho}{5\rho+1}\right],\\0,&\text{otherwise}.\end{cases}(Hint: f(t)=(tρ)2(tρ)2f~(t)f(t)=(t-\rho)^2(t-\rho')^2\tilde f(t) for some f~\tilde f of degree 1\le1; take the logarithmic derivative at 1 to restrict the possible values of 11ρ+11ρ\dfrac1{1-\rho}+\dfrac1{1-\rho'}.)
  3. For ρ0,1,\rho\ne0,1,\infty, show that the set of polynomials in P5+\mathcal P^+_5 with ρZf\rho\in Z_f has dimension 3, parametrised by (K,A,B)(K,A,B). Determine the set of these triples such that σZg\sigma\in Z_g for some 1<σ<1<\sigma<\infty. Hence determine the pairs (ρ,σ)(\rho,\sigma) such that the set Ω\Omega of polynomials in P5+\mathcal P^+_5 with (ρ,σ)(Zf,Zg)(\rho,\sigma)\in(Z_f,Z_g) has dimension 1.
    (Hint: show that g(t)=(t+1)t2h(t+t1)g(t)=(t+1)t^2h(t+t^{-1}) for some polynomial hh of degree 2\le2, and express the coefficients of hh as linear functions of K,A,BK,A,B.)
  4. For ρ0,1,\rho\ne0,1,\infty, show that the set Ω\Omega of polynomials in P5+\mathcal P^+_5 under each of the following constraints has dimension 1:
    • {ρ,0,1}Zf\{\rho,0,1\}\subseteq Z_f;
    • {ρ,0,}Zf\{\rho,0,\infty\}\subseteq Z_f;
    • {ρ,1,}Zf\{\rho,1,\infty\}\subseteq Z_f.

In each of the above cases, we can get some expression for PP by recovering it from ff, though it would be less interpretable than our constructions in degree 4.

  1. Finish the classification of the extremal rays of P5+\mathcal P^+_5. Check against Ando (2022), Theorem 1.4.

The dual cone

This was the original approach that I was going to write about when I first started on this post.

Degree 3

Recall the expressions P(x,y,z)=K[3,0,0]+A[2,1,0]+B[1,1,1],f(t)=K(t3+2)+A(t2+t+1)+B(3t),g(t)=K(t3+1)+At2+t2.\begin{align*}P(x,y,z)&=K[3,0,0]+A[2,1,0]+B[1,1,1],\\f(t)&=K(t^3+2)+A(t^2+t+1)+B(3t),\\g(t)&=K(t^3+1)+A\frac{t^2+t}2.\end{align*}Instead of analysing the cone of possible triples (K,A,B)(K,A,B) (which we identify with P3+\mathcal P^+_3 by abuse of notation), we can instead look at the dual cone P3\mathcal P^*_3 of triples (κ,α,β)(\kappa,\alpha,\beta) for whichκK+αA+βB0\kappa K+\alpha A+\beta B\ge0for all (K,A,B)P3+(K,A,B)\in\mathcal P^+_3. Geometrically, every point in P3\mathcal P^*_3 represents a supporting plane to P3+\mathcal P^+_3.

Since the only restrictions defining P3+\mathcal P^+_3 are given by f(t)0f(t)\ge0 and g(t)0g(t)\ge0 for all t0t\ge0, it should be clear that P3\mathcal P^*_3 is simply the cone generated by nonnegative linear combinations of the points (t3+2,t2+t+1,3t)(t^3+2,t^2+t+1,3t) and (t3+1,12(t2+t),0)(t^3+1,\frac12(t^2+t),0) for t0t\ge0.

  1. Show that the κ=1\kappa=1 cross-section of P3\mathcal P^*_3 is given by the convex hull C3\mathcal C_3 of the union of the two parametric curvesφ(t)(t2+t+1t3+2,3tt3+2),ψ(t)(t2+t2(t3+1),0),\begin{align*}\varphi(t)&\coloneqq\left(\frac{t^2+t+1}{t^3+2},\frac{3t}{t^3+2}\right),\\\psi(t)&\coloneqq\left(\frac{t^2+t}{2(t^3+1)},0\right),\end{align*}for t0t\ge0.
  2. Show that C3\mathcal C_3 is a triangle with vertices at (12,0)(\frac12,0), (1,1)(1,1), (0,0)(0,0), corresponding to the points (x,y,z)=(1,1,0)(x,y,z)=(1,1,0), (1,1,1)(1,1,1), (1,0,0)(1,0,0) respectively.

Now we have to figure out what extremal rays correspond to in the picture for the dual cone P3\mathcal P^*_3.

  1. Show that each point (K,A,B)P3+(K,A,B)\in\mathcal P^+_3 represents a supporting plane to P3\mathcal P^*_3; in the κ=1\kappa=1 cross-section, these are supporting lines to C3\mathcal C_3.
  2. Show that each line segment in P3+\mathcal P^+_3 corresponds to a family of planes in P3\mathcal P^*_3 passing through a fixed line; in the κ=1\kappa=1 cross-section, these are families of supporting lines to C3\mathcal C_3 passing through a common point.
  3. Hence show that each extremal ray of P3+\mathcal P^+_3 corresponds to a supporting line \ell of P3+\mathcal P^+_3 which must intersect C3\mathcal C_3 if you wiggle it while pivoting about any point on \ell.
  4. Show that the only such \ell are the three sides of C3\mathcal C_3.

Degree 4

We set up the dual cone P4\mathcal P^*_4 and its κ=1\kappa=1 cross-section C4\mathcal C_4 analogously.f(t)=K(t4+2)+A(t3+t+1)+B(2t2+1)+C(t2+2t),g(t)=K(t4+1)+At3+t2+Bt2.\begin{align*}f(t)&=K(t^4+2)+A(t^3+t+1)+B(2t^2+1)+C(t^2+2t),\\g(t)&=K(t^4+1)+A\frac{t^3+t}2+Bt^2.\end{align*}

  1. Show that the κ=1\kappa=1 cross-section of P4\mathcal P^*_4 is given by the convex hull C4\mathcal C_4 of the union of the two parametric curvesφ(t)(t3+t+1t4+2,2t2+1t4+2,t2+2tt4+2),ψ(t)(t3+t2(t4+1),t2t4+1,0),\begin{align*}\varphi(t)&\coloneqq\left(\frac{t^3+t+1}{t^4+2},\frac{2t^2+1}{t^4+2},\frac{t^2+2t}{t^4+2}\right),\\\psi(t)&\coloneqq\left(\frac{t^3+t}{2(t^4+1)},\frac{t^2}{t^4+1},0\right),\end{align*}for t0t\ge0.
  2. Show that the boundary of C4\mathcal C_4 consists of finitely many ruled surfaces.
  3. Show that extremal rays of P4+\mathcal P^+_4 correspond to segments lying on the boundary of C4\mathcal C_4, joining two points on the parametric curves.
  4. Complete the classification of extremal rays of P4+\mathcal P^+_4.

Open questions

  1. Using the method in this post, determine the extremal rays in the following families of polynomials which are nonnegative on nonnegative arguments:
    • Homogeneous cyclic polynomials of degree 3 in 3 variables (check against);
    • Homogeneous symmetric polynomials of degree 3 in 4 variables;
    • Any other family.
    Check against results in the literature, in particular the work of Tetsuya Ando, Vasile Cirtoaje, Mariyan and Nedelcho Milev, ...
  2. Determine the extremal rays of P6+\mathcal P^+_6.

Comments

Popular posts from this blog

SMO Open 2024 ??% Speedrun

Musings about the IMO