A long time ago I noticed that I sometimes already had a complete thought before...

rcxdude · 2025-11-04T13:07:43 1762261663

Indeed, and it seems like they would really struggle to output coherent text at all if there were not some kind of pre-planning involved (see how even humans struggle with it in games where you have to construct a sentence by having each person shout out one word at a time). Even GPT-2 likely had at least some kind of planning for the next few words in order to be as coherent as it was.