How To Understand A Math Olympiad Solution
Hi, this is Choo Ray.
I have been conducting trainings quite a bit recently, so I have forced myself to get better at understanding the key points of solutions in a manner which I can convey to others. The difficulty is that math olympiad takes time. I am lucky if I get to try all the problems in a problem set I create, much less solve them. However, I still need to have a good enough understanding of the problem to gather the important ideas and teach them to others.
Looking back on my years as a contestant, I realised that I frequently forget questions and solutions. I slightly envy those who can recall questions with ease ("this is 1994 ISL G5" comes to mind). The problem is not with forgetting the solution, but with not being able to come up with the solution even after seeing it before. It reflects that even if I previously understood and verified the solution to be correct, I did not understand how someone might have arrived at it.
About memory
We know from cognitive science and linguistic studies that our memory is astonishingly bad with large amounts of random things. One technique we use on a daily basis to aid memory is to find patterns and groupings to help our brain process things more quickly. For example, "gthuysnc" is harder to remember than "thugsync", because our brain can group the latter into parts that we are already familiar with.
Math olympiad solutions are, in silo, a large chunk of text and mathematical expressions, therefore being antagonistic to memory. We are lucky if we get remarks from the author offering insights, but oftentimes we are on our own in parsing and digesting the solutions. Thus we need to develop a systematic approach to understanding solutions.
Identifying key ideas is not enough
One way people try to understand solutions is by picking out the key idea. Oftentimes this is something like "invariant on sum mod n" or "apply Cauchy-Schwarz".
The problem is that having a list of key ideas and trying to throw them at any remotely relevant problems is still terribly inefficient. There are just too many things to try when you do not have an effective way of narrowing down the most promising approaches. I see this first-hand in my lessons. People will be struggling with a problem for quite some time exploring all sorts of unfruitful ideas, but after I say "hey, here's a list of basic things related to this topic", suddenly they can pick out the right approach, because I have narrowed the list for them.
Another problem with just key ideas is that if you are like me, if the solution is not clear from start to finish, the solution is not clear at all. Identifying key ideas can sometimes lead to neglecting certain details or technical tricks that are still important in producing a complete solution.
How to approach a solution
The most important thing is to remember that solutions do not exist alone but in relation to a problem. Using an analogy from computer science, we want to improve our internal problem solving algorithm by feeding ourselves training data, with problems as inputs and solutions as outputs. To enhance our problem solving algorithm we need to find connections between features in the problem and features in the solution. Therefore, ask yourself:
- What are the most distinctive features of the problem? Do they correspond to any ideas in the solution?
- Do I know of any other similar problems? Do their solutions have common characteristics? Do common characteristics in the solutions correspond to the common features in the problems?
- Which parts of the solution would someone first come up with? (It is frequently not the beginning of the solution as written!)
- Are there specific choices made during the course of the solution? Why were these choices made? Do the alternatives also lead to a possible solution?
- Are there any immediate generalisations I can make using the approach used to solve the problem?
I wish I had spent more time studying these connections a few years ago, as no doubt this would have helped me internalise ideas better and made me a stronger problem solver as a whole.
An example
Consider the following problem and solution:
(Brazil Mathematical Olympiad 2021 Problem 5) Find all triples of non-negative integers $(a,b,c)$ such that $$a^{2}+b^2+c^2=abc+1$$
I used this question in my trainings and there were a handful of students who were able to identify that Vieta jumping was the most promising approach but yet fail to solve the question. The problem is that most people think they understand how Vieta jumping works but are not able to see through to the fundamental ideas behind it. As a result, when the question requires slightly more than the standard cut-and-paste model of Vieta jumping, they get stumped easily.
Quick recap on the traditional conception of Vieta jumping: Express the equation as a quadratic in one of the variables. Therefore, the other root of the quadratic gives us another valid solution. Vieta's formulae tells us the sum and product of roots $x_1+x_2=-\frac{b}{a}$ and $x_1x_2=\frac{c}{a}$, where $a,b,c$ are the coefficients of the quadratic (not to be confused with $a,b,c$ in the question above). Verify that the other root has the desired properties (say, being a positive integer). By applying this process of changing one of the variables repeatedly, we either show that we can generate infinite solutions (if we manage to find a small solution and show that we can get a bigger root each time) or that we have virtually no solutions (if we show that we can always get a smaller root each time, which cannot happen since a bounded sequence of integer tuples cannot decrease infinitely).
There were quite a few solutions in the AoPS post to choose from. I chose to follow the solution from RodSalgDomPort as it was decently written, illustrated the secondary ideas well, and the fact that it was a solution produced in contest time is encouraging. Let us dissect it.
Post by RodSalgDomPort
Solution at contest:
Set $a\geq b \geq c$.
Notice that $3a^2 \geq abc+1 > ac^2 \Rightarrow 3a>c^2$. Now assume that $(a,b,c)$ is the solution for the equation with all numbers $>0$ and with minimum sum. Since the equation: $x^2- xbc + b^2+c^2-1$ has two positive solutions $(a, \frac{b^2+c^2-1}{a}),$ then:
$$\frac{b^2+c^2-1}{a}\geq a.$$ $$\Rightarrow a+3>\frac {a^2 + 3a -1}{a}> \frac{a^2 + c^2 -1}{a} \geq \frac{b^2 + c^2 -1}{a} \geq a.$$ Therefore, $\frac{b^2+c^2-1}{a}=a,a+1 \text{ or } a+2$. Meaning that the delta of the equation $x^2- xbc + b^2+c^2-1$ is $0,1 \text{ or } 4$.
Then: $(bc)^2- 4(b^2+c^2)+4 = 0, 1 \text{ or } 4$ $\Rightarrow (b^2-4)(c^2-4) =12, 13 \text{ or } 16$ but none of them have solution with $b,c>0$. Contradiction.
Then one of the guys is $0$, meaning the only solution is $(1,0,0)$ and permutations.
Most people should be able to tell that the key idea here is indeed Vieta jumping. However, if you just take a brief look at the solution, say it is Vieta jumping and move on, it is unlikely for you to be able to recreate the solution after some time.
How do we recognise that Vieta jumping is the most likely approach in the first place? In this case, it is not really difficult to tell the defining features that make Vieta jumping work. Of course, the most important is that the equation is a quadratic with respect to each variable. This allows us to use Vieta's formulae to determine another solution $(a',b,c)$ given $(a,b,c)$ is a solution.
Another noticeable feature of the problem is that there is symmetry between the variables, which allow us to order the size of solutions easily when decreasing any one variable with jumping. Thinking further, it is not strictly necessary for there to be symmetry for Vieta jumping to work and possibly another problem could exist whereby you jump different variables depending on the current state of the variables, but certainly having symmetry is very helpful.
Next, we realise that a large part of the solution revolves around bounding the other solution given by Vieta's formulae to $a+3>a'=\frac{b^2+c^2+1}{a}\ge a$. Why are we looking to bound the other solution? Where does this fit in with our knowledge of how Vieta jumping works?
In our usual conception of Vieta jumping as an infinite descent method, we want to show $a'<a$, so that the size of the solution, say $a+b+c$, is strictly decreasing, giving a contradiction. Here, the clever use of the minimal solution is just a stylistic preference. Instead of saying solutions cannot decrease infinitely, we use the fact that a minimal solution cannot decrease. In fact, practically all infinite descent solutions can be written as extremal solutions (although not always vice versa), just that infinite descent may be a more natural formulation in certain contexts.
Here, what must have happened is that it is not easy to immediately get as tight a bound as $a'<a$, so the extremal formulation makes sense as we try to consider what happens when the bound is not met. However, in our efforts to bound $a'$ we realised that we can get a close bound of $a'<a+3$, leaving us to deal with the few in-between cases in a separate way.
To put it another way, the solution works by showing that if $a'$ is much larger than $a$, it is impossible by bounding. If $a'$ is slightly larger than $a$, it is impossible by case work. Finally, if $a'$ is always smaller than $a$, it is still impossible by good old (new?) Vieta Jumping.
Now we have a good idea of the overall structure of the solution. Next we can examine some of the finer details.
- How did we set up for the bounding to take place?
We used an ordering of $a\ge b\ge c$ and chose to bound $a'$. This makes sense as intuitively the largest variable is most likely to have a decrease. - How was the bounding done/how do we know that some good bound is available?
We notice that in the original equation, $a^2+b^2+c^2$ is of degree $2$ and $abc$ is of degree $3$, which suggests that there should be some way to bound to exploit this. Indeed, the crucial $\frac{c^2}{a}<3$ comes from applying our imposed ordering to the difference in degrees. - How did we deal with the edge cases of $a'=a,a+1\text{ or }a+2$?
Rather than just mindlessly substituting things back into earlier equations and trying to solve that way, we realise that the difference of roots of a quadratic has a nice interpretation, which allows us to find some symmetric equation in $b,c$. From there, the factorisation into $(b^2-4)(c^2-4)$ given $b^2c^2-4(b^2+c^2)+\text{ loose terms}$ is just standard technique. - Where did the side solution $(1,0,0)$ come into play?
It arose as the result of an edge case in our Vieta jumping, since it jumps to $(-1,0,0)$. We can also verify that if we try to jump the $0$ instead of $1$ we get back $(1,0,0)$. RodSalgDomPort's solution could (and should) have been made clearer by explicitly checking that $a'$ is a positive integer in other cases (it is integer since $a'=bc-a$, and positive since $b^2+c^2-1$ is positive when $b,c$ are nonzero), which is an important component to Vieta jumping. - Why did we use $a'=\frac{b^2+c^2-1}{a}$ to bound instead of $a'=bc-a$?
You tell me!
Conclusion
Wow! That example was quite long, even longer than the solution itself. It is no wonder that people don't usually include all this analysis in their solutions.
I hope you agree that after following the analysis, you have gained some insight into the inner workings and motivations behind the solution. For myself, I feel that after having gone through this deep dive into the solution, if you asked me to solve this problem a few months from now, I would be able to confidently produce the solution. I even feel more confident to tackle other Vieta jumping problems, since now I know that using it in conjunction with bounding ideas can make the technique more powerful.
I wish you will have great success in discovering the fundamental ideas behind mathematical proofs in your own explorations!
Comments
Post a Comment