More polynomial regression questions :) Are you using the optimal step size you derived in the previous section? If so, why don't the first and last eigendirections converge at once?
If not, doesn't it suggest that there's a trade-off between speed of convergence and the ability to stop early?
In general, this is a nicely written article, though. Good work!
Good question! The step size has been set to a touch below the optimum. Your observation is accurate: there is indeed such a trade-off, though it is smaller than you might think. The qualitative behavior of the system is very sensitive to changes at that point, and the tiny bit of extra convergence you get by getting the step size exactly right is dramatically offset by the risk of diverging.
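
To make the trade-off concrete, here is a rough sketch rather than the article's actual setup. It assumes plain gradient descent on a quadratic, where the error along an eigendirection with eigenvalue `lam` shrinks by a factor of `|1 - alpha * lam|` per step and the optimal step size is `2 / (lam_min + lam_max)`. The eigenvalues `0.01` and `1.0` and the helper `rates` are made up for illustration.

```python
def rates(alpha, eigenvalues):
    """Per-step contraction factor |1 - alpha * lam| for each eigendirection."""
    return [abs(1 - alpha * lam) for lam in eigenvalues]


# Hypothetical smallest and largest eigenvalues (not taken from the article).
lam_min, lam_max = 0.01, 1.0

# Step size that balances the first and last eigendirections.
alpha_opt = 2 / (lam_min + lam_max)
# Any step size above this makes the last eigendirection diverge.
alpha_div = 2 / lam_max

at_opt = rates(alpha_opt, [lam_min, lam_max])         # both factors equal
below = rates(0.95 * alpha_opt, [lam_min, lam_max])   # a touch below the optimum
above = rates(1.02 * alpha_opt, [lam_min, lam_max])   # a touch above the optimum

for name, r in [("optimal", at_opt), ("5% below", below), ("2% above", above)]:
    status = "diverges" if max(r) > 1 else "converges"
    print(f"{name:>9}: first={r[0]:.4f} last={r[1]:.4f} ({status})")
```

With these numbers, backing off by 5% barely changes the factor for the slowest (first) eigendirection, and the last eigendirection simply converges faster, which is why the two no longer finish together. Overshooting by 2% already crosses `alpha_div`, so the last eigendirection blows up.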