So if not exponential, what would you call adding voice and image recognition, function calling, greatly increased token-generation speed, reduced cost, and massive context-window increases, then shortly after combining all of that in a truly multimodal model that is even faster and cheaper while adding emotional range and singing in… checks notes …14 months?! Not to mention creating and improving an API, mobile apps, a marketplace, and now a desktop app. OpenAI ships, and they are doing so in a way that makes a lot of business sense (continue to deliver while reducing cost). Even if they didn't have another flagship model in their back pocket I'd be happy with this rate of improvement, but they are obviously about to launch another one given the teasers Mira keeps dropping.
All of that is awesome and makes for a better product. But it's also primarily an engineering effort. What matters here is an increase in intelligence, and we're not seeing that, aside from very minor capability increases.
We'll see if they have another flagship model ready to launch. I seriously doubt it. I suspect this was supposed to be called GPT-5, or at the very least GPT-4.5, but it can't meet expectations, so they can't use those names.
Isn't one of the reasons for the Omni model that text-based learning has a limited supply of source material? If it's just as good at audio, that opens up a whole other set of data, and an interesting UX for users.
I believe you're right. You can easily transcribe audio, but the quality of the resulting text data is subpar, to say the least. People are very messy when they speak and rely on the interlocutor to fill in the gaps. Training a model to understand all of the nuances of spoken dialogue opens that source of data up. What they demoed today is a model that to some degree understands tone, emotion, and, surprisingly, a bit of humour. It's hard to get much of that from text, so it makes sense that audio is the key to it. Visual understanding of video is also promising, especially for cause and effect, and subsequently for reasoning.