Thank you. It is so frustrating to hear that LLMs are a PhD-level intelligence capable of any task you can throw at them, and then, when one can't solve your problem, to hear: "well you are using a1.36 and b1.37-high is really the one this time" (despite the fact that you have been hearing these claims since before that model came out) or "you are prompting it wrong, have you tried describing all of your app features and your entire approach to coding in a text file, then using that to get the AI to make a list of prompts, then refining those prompts into different text files and putting those back into the AI..."
Totally fair frustration. Unfortunately, model/version does matter: it’s not pedantry, it’s debugging. And no, you shouldn’t need a prompt engineering PhD to get value, but some structure and awareness of tool limits go a long way.
It's not that I think model version doesn't matter. I switch between them all the time (often to downgrade as much as upgrade honestly). It's that I think people are misrepresenting the kinds of results you can get from these models and seem to take it as a personal attack and come up with excuses when you talk about limitations that you've encountered. It makes it difficult to engage in conversations about tools and I've gotten to the point where I don't believe anything anyone says about it anymore and I just try tools for myself.
I said people are saying the models are PhD-level intelligent, not that you need to be. I get a ton of value from them and I don't have a PhD.
When the original poster has no clue what model they are using, it throws all credibility out the window. At that point it’s appropriate to point that out to them with suggestions. Nobody here was suggesting that LLMs are PhDs like you are saying. You are the only one bringing that up.
> When the original post has no clue what model ...
Well, that's the point. As long as they are using a recent-ish model it really doesn't matter. It's not that there are no differences in performance between models; it's that there is no model today that even comes close to not requiring extensive hand-holding to accomplish real-world software engineering of even moderate complexity.
Case in point: I have been frustrated that most markdown viewers don't do automatic indentation of section levels. I thought, this is a perfect test of coding assistants: the problem and solution are conceptually straightforward, and I don't even care about the platform and architecture used to accomplish it.
I've asked all the major models to implement a simple markdown viewer that could do automatic indentation, and they all fall flat. Some even give me code that will not run; of the rest, none has provided code that basically does the thing I've asked for.
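To make the task concrete, here is a minimal sketch of one possible reading of "automatic indentation of section levels": each line is shifted right according to the depth of the most recent heading above it. This is an assumption about the intended behavior, not something the comment specifies. The names `indent_by_section` and `width` are hypothetical, only ATX-style (`#`) headings are recognized, and fenced code blocks are tracked with a simple toggle. It transforms text only and is not a viewer.

```python
import re

# ATX heading: one to six '#' characters followed by whitespace and some text.
HEADING = re.compile(r"^(#{1,6})\s+\S")
# Opening or closing line of a fenced code block (``` or ~~~).
FENCE = re.compile(r"^\s*(```|~~~)")


def indent_by_section(markdown: str, width: int = 2) -> str:
    """Indent every line by the depth of the heading it falls under.

    A level-1 heading and its body get no indentation, a level-2
    heading and its body get `width` spaces, and so on.
    """
    depth = 0          # level of the most recent heading seen (0 = none yet)
    in_fence = False   # True while inside a fenced code block
    out = []
    for line in markdown.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence
        elif not in_fence:
            match = HEADING.match(line)
            if match:
                depth = len(match.group(1))
        level = max(depth - 1, 0)
        # Leave blank lines untouched so they stay truly empty.
        out.append(" " * (width * level) + line if line else line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "# Title\nintro\n## Section\nbody\n### Sub\ndetail"
    print(indent_by_section(sample))
```

A real viewer would still need rendering, Setext headings, and more careful fence handling (for example, matching the fence character and length), which this sketch deliberately leaves out.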