The model does think, but only when you tell it to think out loud.
This is less a weird quirk of the training data or a One Weird Trick That Makes Your Matrices Sentient, and more a limitation of the model architecture. A single forward pass of the network has no way to implement 'for loops'; the only looping construct is the process that runs the model repeatedly, once per token. When you tell the model to "think out loud", you're telling it to use prior tokens as for-loop state.
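A minimal sketch of that outer loop, with a toy stand-in for the model (the `model` function here is made up for illustration, not a real API; a real predictor would return the most likely next token given the whole prefix):

```python
def model(tokens):
    # Toy next-token predictor: emits a counter until the sequence
    # reaches length 8, then signals end-of-sequence. Each call is one
    # fixed-depth forward pass with no internal loops.
    return len(tokens) if len(tokens) < 8 else "<eos>"

def generate(prompt_tokens):
    tokens = list(prompt_tokens)
    while True:                # the only loop lives OUTSIDE the network
        nxt = model(tokens)
        if nxt == "<eos>":
            break
        tokens.append(nxt)     # prior tokens are the for-loop state
    return tokens

print(generate(["think", "step", "by", "step"]))
```

Asking for intermediate "thinking" tokens simply gives the model more iterations of this outer loop, and more state to carry between them.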
Another limitation is that the model can't backtrack. That is, if it says something wrong, that mistake is now set in stone; it can't jump back and correct it, so you get confidently wrong behavior. I have to wonder if you could just tell the model to pretend it has a backspace button, so that it could still see the wrong output and avoid the pitfalls it fell into before.
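One way such a backspace button might be decoded, as a purely hypothetical sketch (the `<bksp>` token and all names here are made up, not a feature of any real model): the emitted stream keeps the mistake visible for the model to condition on, while the reader-facing output has it retracted.

```python
BKSP = "<bksp>"

def decode_with_backspace(emitted):
    """Fold a token stream containing <bksp> into the final output.
    The raw `emitted` transcript, mistakes included, stays intact for
    the model to see; only the rendered output drops retracted tokens."""
    final = []
    for tok in emitted:
        if tok == BKSP:
            if final:
                final.pop()    # retract the previous token
        else:
            final.append(tok)
    return final

stream = ["2", "+", "2", "=", "5", BKSP, "4"]
print(decode_with_backspace(stream))
```

The design point is that the retraction happens at render time, so the wrong token remains in the context window as evidence of the pitfall.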