Dynamics with USATST '25 P2

(USATST '25 P2) Let $a_1,a_2,\dots,$ and $b_1,b_2,\dots,$ be sequences of real numbers for which $a_1 > b_1$ and $a_{n+1} = a_n^2-2b_n, b_{n+1}=b_n^2-2a_n$ for all positive integers $n$. Prove that $a_1,a_2,\dots,$ is eventually increasing (i.e. there exists a positive integer $N$ for which $a_k < a_{k+1}$ for all $k>N$).

(Dylan here.) I will be writing this post as I am trying to solve the above problem. Skip to section 6 for the tldr. 

0. What type of problem is this?

This is an algebra problem concerning dynamics of a function on the plane. 

Unraveling what this means: the function $(a_n,b_n)\mapsto (a_{n+1},b_{n+1})$ is a function $f:\mathbb R^2\to \mathbb R^2$ acting as $(x,y)\mapsto (x^2-2y,y^2-2x)$. Dynamics means we start at some initial point $P_0 = (x_0,y_0)$, and study the behaviour of the sequence of points $P_0,f(P_0), f(f(P_0)), \dots$: Does it go to infinity? Or converge to some point? Or eventually fall into a periodic cycle? The problem statement asks us to show that given some initial condition, the sequence will eventually "keep moving right". 

1. Initial observations and strategies

With dynamics problems, the easiest way to get a sense of what is happening is to plug a ton of initial values $P_0$ in and see how the sequence behaves. Of course, that is completely infeasible with pen and paper in a timed contest, so as a student, it's going to test your ability to get a sense of what is happening early (without having to actually compute many numbers). This means pulling things from existing intuition/past experience.

Let me first describe three broad strategies in dynamics that come to mind, then try to apply it to this specific problem.
  1. Typically, there is an asymptotic (i.e. "large") regime where things are well-behaved. For example, in the one-variable dynamics $g: \mathbb R\to \mathbb R, x\mapsto x^2-2x$, we know that $x>10 \implies g(x) > 10$. In fact, we know $x > 10 \implies g(x) > x$. So dynamics in the region $x>10$ is well-understood. 
  2. Fixed points are also special, and behaviour near them is well-understood. For example, in the one-variable dynamics $g: x\mapsto x^2-2x$, The fixed points $g(x) = x$ are $x=0,3$. So beginning at $3$, we have the constant sequence $3,3,3,3\dots$. What happens if we begin at $3.001$? Note $3+\varepsilon \mapsto (3+\varepsilon)^2-2(3+\varepsilon) \approx 3+4\varepsilon$, so the sequence is $3.001, \approx 3.004, \approx 3.016, \dots$ and will soon move to a regime where we can no longer treat $\varepsilon = x-3$ to be small. This is an example of divergent behaviour in a neighbourhood of the fixed point (but in other dynamical systems, we could see convergent/periodic behaviour too). 
  3. Symmetry, if present, is useful to observe, and helps in reducing the number of cases.
Now, time to brainstorm broad strategies to tackle this specific problem. 
  • The symmetry $f(x,y) = (x',y') \implies f(y,x) = (y',x')$ means that the dynamics is symmetric about interchanging $x\leftrightarrow y$. This also means that if I ever land on the diagonal, I'm staying on the diagonal, since $f(x,x) = (x^2-2x,x^2-2x)$. So it might be useful to solve the one-dimensional dynamics $g: x\mapsto x^2-2x$ first. 
  • There may also be fixed points outside the diagonal. Let's find them. (Actually, the problem statement guarantees that there can't be any; but we should try to solve for them anyway, as the computation might give us insight into why the problem statement is true.)
  • Let's also establish a region where the statement is true. Maybe when both $x,y$ are large. If possible, let's reduce our work to a certain finite region of the plane. This means we also need to figure out what happens when $x,y$ are negative.
  • Let's also try to investigate monovariants/invariants of various quantities. Maybe $a_n-b_n>0$ for all $n$? Maybe $a_{n+1}-a_n$ has a nice expression in itself?
  • Sometimes, $f(x,y)$ is not too clean, but $f(f(x,y))$ has better properties. Let's try to iterate more than once and see if it is useful.
[I will also be sloppy and use $>$ when sometimes I mean $\geq$.]

2. Dirty Groundwork

Let's tick off stuff I listed above, one by one. 

Diagonal dynamics:
  • $g: x\mapsto x^2-2x$ has fixed points $x=0,3$. And $x > 3\implies g(x)>x$, so the statement is true if $a_n = b_n>3$ for some $n$. 
  • Also, $x< -1\implies g(x)>3$, so it is also true if $a_n = b_n < - 1$ for some $n$. 
  • Before working out dynamics in the range $x\in (-1,3)$, let's see if we can even reach the diagonal from initial conditions $a_1 > b_1$.
    • Note $a_{n+1}-b_{n+1} = (a_n-b_n)(a_n+b_n+2)$, so in order to reach the diagonal from outside the diagonal, we must have had $a_n+b_n = -2$ for some $n$. 
    • But then $a_{n+1} = a_n^2-2b_n$, and $x^2-2y = x^2+2x-2(x+y) = (x+1)^2-1-2(x+y)$, so $x+y = -2\implies x^2-2y \geq 3$, equality iff $x=-1,y=-1$ (which is already on the diagonal). 
    • So, if we reached the diagonal from outside the diagonal, then we would be on $(x,x)$ for $x>3$, and then the statement is true. 
  • So for the purposes of the problem, we have settled diagonal dynamics. (Though if we run out of viable strategies later, we might return later to investigate diagonal dynamics in $x\in (-1,3)$, which should tell us about near-diagonal dynamics.)
Studying $a_n-b_n$:
  • We've seen $a_{n+1}-b_{n+1} = (a_n-b_n)(a_n+b_n+2)$, so maybe it'd be useful to know what $a_n+b_n$ is too. 
  • $a_n+b_n = a_{n-1}^2-2a_{n-1} + b_{n-1}^2-2b_{n-1}$. That's great: so $a_{n+1}-b_{n+1} = (a_n-b_n)((a_{n-1}-1)^2+(b_{n-1}-1)^2)$, so the difference stays the same sign once $n>2$!
  • Note this could still be $<0$, for instance if $a_1+b_1 < -2$. 
  • Maybe we'll split into the two cases: by Case I: $a_2>b_2$ and Case II: $a_2<b_2$ (plus the case where we eventually land on the diagonal, but we've resolved that already). So in the first case, we can freely assume $a_n > b_n$ for all $n$; and we can settle the second case later.
Large positive $(x,y)$:
  • It's not true that $x,y > N \implies x^2-2y,y^2-2x > N$. 
  • But, if we know $x>y$, then $x^2-2y > x^2-2x > x$ if $x > 3$. 
  • So if $a_2 > b_2$ and $a_2 > 3$, then $a_3 > b_3$ and $a_3 > a_2 > 3$. So we are done for Case I and $a_2>3$. 
Ok, so where we are now is: we are good if we hit the $\{x>y, x>3\}$ regime. And we know $a_2+b_2>-2$. Here's a small picture. 

3. Trying more things

Let's try to get an expression for $a_{n+1}-a_n$:
$$a_{n+1}-a_n = a_n^2 - a_{n-1}^2 - 2(b_n-b_{n-1})$$
Perhaps this is not so useful after all. 

Let's try to iterate: $f(f(x,y)) = ((x^2-2y)^2-2(y^2-2x),\dots) = (x^4-4x^2y+2y^2+4x,\dots)$ Hmm... doesn't help either. 

Let's try to reduce our problem to a bounded region of the $(a_2,b_2)$-plane. This means looking at Case II. So we begin with $a_1>b_1$ but $a_1+b_1<-2$, and iterate once. Now note $(a_2,b_2)\in \{x,y>3\}$, but $a_2<b_2$. What happens next?

Well, we want to eliminate the possibility of something like $(4,1000)$. This should come from the $a_1+b_1<-2$ bound. Let's see: [at this point I tried to compute $f(-1-t,-1+t) = ((-1-t)^2-2(-1+t),\dots) = (-1+t^2,-1+t^2)$. Then I realised the inequality doesn't come from this boundary. But anyway, it gave me an idea to express $(a_n,b_n)$ in terms of sum and difference $(s_n,d_n)$ instead.]

Let's express sum and difference: $s_{n+1} = \frac{s_n^2 + d_n^2}{2} - 2s_n$, $d_{n+1} = d_n(s_n+2)$. How about sum-plus-2? $t_{n+1} = \frac{(t_n-4)^2+d_n^2}{2}$, $d_{n+1} = d_nt_n$. 

Let's experiment: 
  • $(1,-N) \mapsto (2N+1, N^2-2) \mapsto (2N^2+4N+5, N^4+\dots) \mapsto \dots$
  • $(-1,-N) \mapsto (2N+1, N^2+2) \mapsto (2N^2+4N-3, N^4 + 4N^2-4N+2) \mapsto \dots$
It seems like $(2\sqrt z, z)\mapsto \approx (2z,z^2)$. Maybe we have an inductive inequality $4y < x^2$? Let's see: $(x^2-2y)^2 -4(y^2-2x)= x^2(x^2-4y)+8x.$ So it's all good if $x>0$. So we should show everything by induction together: 
  • Starting with Case II: $a_1+b_1<-2$, we have $3<a_2<b_2$, and $a_2^2-4b_2 = a_1^2(a_1^2 - 4b_1) + 8a_1 > a_1^2(a_1^2+4a_1 + 8)+8a_1 = a_1(a_1+2)(a_1^2+2a_1+4)$. So if $a_1 > 0$ or $a_1 < -2$, we have $a_2^2-4b_2 > 0$. 
  • Then $3<a_2<b_2, 4b_2 < a_2^2 \implies 3<a_3<b_3, $4b_3 < a_3^2$ (since $a_3^2-4b_3 = a_2^2(a_2^2 - 4b_2) + 8a_2 > 0$). 
  • By induction, $3<a_n<b_n < a_n^2/4$ for $n\geq 2$.
  • So $a_{n+1} > a_n$, since $a_3 = a_2^2-2b_2 > a_2^2/2 > a_2$. 
So we have settled Case I for $a_1>0$ or $a_1<-2$. And in $-2<a_1<0$, we are also done if $b_1<\frac{a_1^3+8}{4a_1}$. That settles Case II up to some region. 

Let's transport our Case I progress to the $(a_1,b_1)$-plane. Note the induction is the same, so we still have a win in the same region $\{x>y, x>3\}$. 

Summary of progress, this time drawn on the $(a_1,b_1)$-plane:
That's great: we're now left with a bounded triangle region (yellow) and a strip of red region to settle. The end is near.

4. The Yellow Region

We know $(-1,-1)\mapsto (3,3)$. And $(3,3)$ is a fixed point, but the 1-variable dynamics worked out early above suggests it is divergent there (which is good: that means things near the top-right corner should get pushed to the centre of the triangle, which then gets a chance to leave). 

Let's iterate once. The condition for $f(x,y)$ to lie in the good region is $x^2-2y > 3$. So actually, the region underneath the parabola is also good!
Maybe we can iterate to inductively exhaust the yellow region this way.

Let's analyse dynamics close to $(3,3)$: 
$$(3-\varepsilon, 3-\eta) \mapsto \approx (3 - 6\varepsilon + 2\eta, 3 - 6\eta + 2\varepsilon)$$
Note this is not just a dilation, but a "push away from the diagonal". Concretely, $\frac{6\eta - 2\varepsilon}{6\varepsilon- 2\eta} > \frac{\eta}{\varepsilon} > 1$. 
[to see this: relative to $(3,3)$, the map sends $(-\varepsilon,-\varepsilon)\mapsto (-4\varepsilon,-4\varepsilon)$ and $(0,-\eta) \mapsto (2\eta, -6\eta)$. So it has a dilation effect, plus an "opening of the fan". 

We are looking for a monovariant that captures this. Let's try something related to the gradient that respects our existing inequalities: $t = \frac{x-y}{3-x} = \frac{\eta - \varepsilon}{\varepsilon} > 0$. (If $t<0$, we hit the green region and are done anyway.)
$$t\mapsto t' = \frac{(x-y)(x+y+2)}{3-x^2+2y}$$
To show $t'>t$, it suffices to show $(3-x)(x+y+2) > 3-x^2+2y$, or equivalently, $x+y+3-xy > 0$. This is true, since $LHS = 4 - (x-1)(y-1) > 4-|x+y-2|^2/4 > 0$. 

So the value of $t$ keeps increasing. But we wish for it to increase strictly, and by a significant amount! How much? Note $t' \approx \frac{8(\eta-\varepsilon)}{6\varepsilon-2\eta} > \frac{8(\eta-\varepsilon)}{4\varepsilon} = 2t$. So maybe we can try to prove $t' > (1+\alpha)t$ for some $0<\alpha\leq 1$. 

Unwinding the inequality, we wish to show $3+x+y-xy > \alpha(3-x^2+2y)$. Let's try to prove it for $\alpha=1$. Then it is equivalent to $x^2+x-y-xy > 0$, or $(x+1)(x-y) > 0$. We know that is true for the yellow region!

So $t$ increases by at least a factor of $2$ each time, thus gets arbitrarily large. But now we are done: all we need to do is to beat the gradient of $x^2-2y = 3$ at $(3,3)$, so it lies below the parabola, and the next step will send it into the $x>3$ region. This is when $t>2$. 

That's the yellow region settled! 

5. The Red Region

Let's return to Case II, the red region.

Since $(-1,-1)\mapsto (3,3)$, we need something different from $4y<x^2$ (this is false for $x=y=3$). Maybe some inequality like $4y < (x+c)^2+d$? Let's try: $(x'+c)^2 - 4y' = (x^2-2y+c)^2 - 4(y^2-2x) = x^4 - 4x^2y + 2cx^2 - 4yc + c^2 + 8x = x^2(x^2-4y) + \dots$ Well, whatever inequality it is, it must be satisfied by the $(-1,-N)$-chain above. Let's do the computation relative to $(-1,-1)$:
  • $(-1,-1-X) \mapsto (2X+3, X^2+2X+3) \mapsto (2X^2+8X+3, X^4 + 4X^3+10X^2+8X+3) \mapsto \dots$
Let's try a different strategy. $a_3>a_2$ is equivalent to $a_2 + 2b_2 < a_2^2$, so let's try to prove that directly. Note $(x^2-2y) + 2(y^2-2x) < (x^2-2y)^2\iff  x^4 - 4x^2y + 2y^2 + 4x + 2y - x^2 > 0$. To prove this, we note equality at $(-1,-1)$ and $(3,3)$, so our proof must preserve these as equality cases. 
  • The leading terms are captured by $(x^2-2y)^2$. We want an asymptotic $(x^2-4y)$-term, but also wish for equality at $(x,y) = (-1,-1)$ and $(3,3)$. So we try the term $(x^2-4y+2x-3)$
  • Now, $\text{expn} = (x^2-4y+2x-3)^2 -4(y-x)(x^2-4y+2x-3)+4x^2y+9x^2+4x+16x-14y^2-22y-9$.
New strategy:
  • The expression is quadratic in $y$, so it is true if $y > x^2 - 1 + \frac{1}{2}\sqrt{2x^4-2x^2-8x+1}$ by the quadratic formula. 
[at this point, I tried a bunch of more ways to obtain the bound by guessing inductive inequalities, but ran out of time.]

6. Summary of Progress

Summary of problem solving process:
  • Identified dynamics nature of problem. 
  • Understood the diagonal behaviour ($x\mapsto x^2-2x$). 
  • Observed (by iterating twice) that $a_n-b_n$ sign fixed after $n\geq 2$, related to the sign of $a_1+b_1+2$. (motivates the splitting of cases: we expect different behaviour for different regimes.)
  • Understood the $\{x>y, x>3\}$ regime (an infinite region of Case I), similar to the diagonal dynamics. 
  • By testing on $(\text{small}, N)$ for large $N$, motivated an inductive $4b_n < a_n^2$ bound for some of Case II. Reduced Case II to a red strip region. 
  • For Case I, reduced the problem to the yellow triangular region. 
  • Taking one step then clears the blue region below the parabola. This suggests taking more steps will slowly exhaust the yellow region; the focus is now to find an appropriate parameter of exhaustion.
  • By testing on $(3-\varepsilon, 3-\eta)$, motivated the definition of $t_n$ and the inductive $t_{n+1}>2t_n$. 
Summary of formal proof (up to the red region):
  • Lemma 1: $a_n-b_n$ same sign for $n\geq 2$ (and matches sign of $a_1+b_1+2$).  
  • Case 0: $a_N=b_N$ for some $N$. (and we get here if $a_1+b_1 = -2$.)
    • Then $a_N=b_N>3$, and $x\mapsto x^2-2x > x$ for $x>3$. So $a_n$ is thereafter strictly increasing. 
  • Case I: $a_1 + b_1 > -2$.
    • Induct that if $a_N>3$ for some $N$ (possibly $N=1$), then $3 < a_n < a_{n+1}$ for all $n\geq N$. [and, $a_n=3\implies a_{n+1}>3$.]
    • Define $t_n = \frac{a_n-b_n}{3-a_n}$, and induct that $t_{n+1}\geq 2t_n > 0$ for all $n$ (or we end up with Case 0, or with $a_n\geq 3$ for some $n$). 
    • So $t_n>2$ for some $n$; then show $a_{n+1}>3$.
  • Case II: $a_1 + b_1 < -2$. 
    • If $(a_1,b_1) \notin R = \{-2<x<0, \frac{x^3+8}{4x}<y<x\}$,
      • Induct that $3<a_n<b_n$ and $4b_n < a_n^2$ for all $n\geq 2$.
      • Then induct that $a_n < a_{n+1}$ for all $n\geq 2$. 
    • Region $R$ is yet to be resolved.

6. Reading the AoPS solutions

Wow. 

The intended solution was to observe that $(x+y+z)^2 - 2(xy+xz+yz) = x^2+y^2+z^2$, so if we have $a_1 = x+y+z$, $b_1= x^{-1}+y^{-1}+z^{-1}$, and $xyz=1$, then by induction, $a_n = x^{2^{n-1}}+y^{2^{n-1}}+z^{2^{n-1}}$ for all $n$. 

The $x,y,z$ would then be the roots of $P(s) = s^3 - a_1s^2+b_1s - 1$. If all roots are real, we are good after $n\geq 2$; otherwise, if $x$ is real while $y,z$ are complex conjugates, then $a_1 > b_1 \implies P(1) < 0\implies x > 1$, so $|y|,|z|<1<x$, so the behaviour of $a_n$ is eventually dominated by $x^{2^{n-1}}$, which is good. 

I'm not entirely sure how one may motivate this solution, but it is certainly an entirely different approach to the dynamics. Instead of studying behaviour in various regions, this approach seems to go straight for an explicit expression for the $(a_n,b_n)$'s, which is possible since the maps are algebraic (and extremely nice). I'm not sure how one could think up this expression though. (If someone can suggest reasonable motivation, I'd love to include it below in a future edit.)

One thing it does confirm is that Case II only happens when the roots of the cubic are all real (since then $P(-1)>0$ while $P(0)<0$), so both $a_n$ and $b_n$ will indeed monotonically increase after $n\geq 2$.

7. Unrelated Philosophical Ramblings

Recently, I had a conversation with a friend about how maths Olympiad differs from "regular maths". I brought up that there's the comparison of content, and of style of instruction.

Let's illustrate the difference with two different 1-hour sessions with similar content (say, questions like "how many ways are there to pick a committee of 3 and their leader from a group of 10 people?" and "how many ways are there to arrange the letters of "APPLE" such that the vowels aren't next to each other?") but different instruction styles.

Session 1: The expressions for $nPr$ and $nCr$ are introduced, and it is proven/argued why they represent certain counts. It is illustrated how to use them to answer numerous questions of increasing difficulty. Students then attempt problems similar to those presented.

At the end of the session, students have learnt how to recognise when questions are asking for an application of $nPr$ and/or $nCr$, how to identify what $n$ and $r$ are in each scenario, and how to compute the counts.

Session 2: Students start working through problems. Without any guidance, they might begin by listing all possibilities; but listing in itself is not easy unless they've found a systematic way (e.g. alphabetical order). When the numbers start getting big, they might start to look for patterns, clear entire cases with shortcuts, and search for faster ways. The trainer facilitates student sharing of strategies midway or at the end.

At the end of the session, students might have discovered lexicographic listing/depth/breadth-first search, developed the skill of systematic accounting of cases (avoiding double-counting or missing cases), and some might have even discovered induction/iterative choice/multiplicative counting, complementary counting, or double counting techniques. These may be advanced topics, but they may arise naturally in self-exploration. 

What are the concrete differences? 
  • Session 1's scope is clear and determined, with all problems intentionally chosen to highlight the same concept. In contrast, Session 2 has no uniform takeaway; the questions are of maximum diversity and breadth within the broad area of counting. 
  • Session 1 prioritises learning by reinforcement: students learn the idea and recognise when to use it. Session 2 prioritises learning by exposure and self-discovery: students get the chance to experience why the idea is powerful and useful "in the wilderness". [In possibly oversimplified terms: the sessions train pattern recognition at different levels.]
  • While the variance of learning for Session 2 is high (and there's a chance that some students end up learning nothing at all), whatever ideas picked up by the students have a higher chance of being retained, and eventually translatable to different contexts. [Because of this, the role of the educator is much more important.]
Introspecting, I'm now worried I might have lost a lot while learning undergraduate maths (group theory, linear algebra, calculus, ...) in the traditional way. There are most definitely things that I learnt and think I know, but don't understand deeply (in fact, I'm probably blind to my remaining gaps of understanding). 

8. Conclusion

Here's maths in practice, with all its natural messiness. I illustrated finding monovariants of geometric quantities (the definition of $t$ capturing the gradient relative to $(3,3)$), making inductive hypotheses, and some multivariable calculus (the $\epsilon,\eta$ computations near $(3,3)$). Unfortunately, my attempt was unsuccessful; some day, I'll finish up the proof above, or learn the motivation for the algebraic solution. 




Comments

Popular posts from this blog

SMO Open 2024 ??% Speedrun

Musings about the IMO