You really think so? My goal with this post was to provide the non-hype commentary - hence my focus on model characteristics, pricing and interesting notes from the system card.
I called out the prompt injection section as "pretty weak sauce in my opinion".
I did actually have a negative piece of commentary in there about how you couldn't see the thinking traces in the API... but then I found out I had made a mistake about that and had to mostly remove that section! Here's the original (incorrect) text from that: https://gist.github.com/simonw/eedbee724cb2e66f0cddd2728686f... - and the corrected update: https://simonwillison.net/2025/Aug/7/gpt-5/#thinking-traces-...
The reason there's not much negative commentary in the post is that I genuinely think this model is really good. It's my favorite model right now. The moment that changes (I have high hopes for Claude 5 and Gemini 3) I'll write about it.