It's silly to say that the only objective that will vindicate AI investments is AGI.
The current batch of deep learning models is fundamentally a technology for labor automation. This is immensely useful in itself, without any need for AGI. The Sora2 capabilities are absolutely wild (see a great example here of what non-professional users are already able to create with it: https://www.youtube.com/watch?v=HXp8_w3XzgU )
Looking only at video capabilities, or at coding capabilities, the technology is already positioned to automate and upend industries worth trillions in the long run.
The emerging reasoning capabilities are very promising: they can generate new theories and design experiments in easy-to-test fields, such as in vitro drug discovery. It doesn't matter if the LLM hallucinates 90% of the time; if it reasons correctly even a single time and that produces even one new cancer drug that passes the test, it was worth it.
These are all examples of massive, massive economic disruption by automating intellectual labor, that don't require strict AGI capabilities.
Regardless of whether you're correct about this (I'm not an ML expert, so who knows), I'd be very happy if we cured cancer, so I hope you're right. And the video is a cool demo.
I don't believe the risk vs. reward of investing a trillion-plus dollars is the same when your thesis changes from "We just need more data/compute and we can automate all white collar work"
to
"If we can build a bunch of simulations and automate testing of them using ML then maybe we can find new drugs" or "automate personalized entertainment"
The move to RL has specifically made me skeptical of the size of the buildout.
If you take the total investment in AI and divide it by, say, $100k, that's how many man-years AI needs to replace to be cost-effective as labor automation. The numbers aren't that promising given the current level of capability.
You don't even need to get fancy with it. OpenAI has publicly committed to ~$500B in spending over the next several years (never mind that even they don't expect to actually bring in that much revenue).
$500B / $100,000 is 5 million man-years, or about 167k 30-year careers.
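A quick back-of-the-envelope sketch of that arithmetic (the $100k fully-loaded cost per man-year and the 30-year career length are assumptions used in this thread, not OpenAI figures):

    # Back-of-the-envelope: how much labor does ~$500B of committed spend have to displace?
    committed_spend = 500e9        # OpenAI's publicly reported ~$500B commitment
    cost_per_man_year = 100_000    # assumed fully-loaded annual cost of one worker
    career_years = 30              # assumed length of one career

    man_years = committed_spend / cost_per_man_year    # 5,000,000 man-years
    careers = man_years / career_years                 # ~166,667 thirty-year careers
    print(f"{man_years:,.0f} man-years, or ~{careers:,.0f} full careers")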
The math is ludicrous, and the people saying it's fine are incomprehensible to me.
Another comment on a similar post just said, no hyperbole, irony, or joke intended: "Just you switching away from Google is already justifying 1T infrastructure spend."
>Just the disruption we can already see in the software industry are easily of that magnitude.
WTF? Where are you seeing that?
Also, no, you can't count $100k over 30 years as $3M, because you expect investment growth. At, let's say, the stock market average of 7 percent per year, that investment has to return something like $24 million over 30 years, otherwise it's not worth it. That means roughly $8 trillion over the next 30 years if you look at that long an investment period.
And who the hell is going to capture 30 years of profit with model/compute investments made today?
The math only maths within short timeframes: hardware gets amortized in 5 years, models go obsolete even faster. So best case, you have to displace 2 million people and capture their output to repay that. Not with future tech, with the tech investments made today.
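Roughly the numbers behind both claims, assuming the 7% annual return and the ~$1T buildout figure mentioned upthread (both assumptions from this thread, not from any filing):

    # Opportunity cost over 30 years at a 7% average market return
    growth = 1.07 ** 30                      # ~7.6x
    print(f"$3M in salaries -> ~${3e6 * growth / 1e6:.0f}M required return")    # ~$23M
    print(f"$1T invested    -> ~${1e12 * growth / 1e12:.1f}T required return")  # ~$7.6T

    # Short-timeframe view: hardware amortized over ~5 years, models obsolete sooner
    investment = 1e12                # assumed ~$1T buildout
    labor_value_per_year = 100_000   # assumed value of one displaced worker
    people = investment / (labor_value_per_year * 5)
    print(f"~{people:,.0f} people displaced for 5 years just to break even")     # ~2,000,000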
Global employment in software development and adjacent fields is in the tens of millions. To say the impact of AI code automation will be, at most, a rounding error of just 1-2% of that is silly; the junior pipeline is currently almost frozen in the global north, and entire cohorts of graduates can't find jobs in tech.
Sure, the financial math over 30 years doesn't follow elementary arithmetic, and if development hits a wall tomorrow they will have trouble recovering the investment from code automation tools alone.
But that is clearly a nonsense scenario; the tech is rapidly expanding into other fields with obvious automation potential. This is not a pie-in-the-sky future technology yet to be invented, it's the obvious productization of latent capability, similar to the early internet days. There might be some overshoots, but the latent potential is all there; the AI investments are looking to be the first movers in that enormously lucrative space and to take what seem, to me, reasonable financial risks in light of the rewards.
My claim is not that AGI will soon be available, but that applying existing frontier models across the entire economy, in the form of mature, yet-to-be-developed products, will easily generate disruption with a present value in the trillions.
You do understand that you don't just replace a $100k developer and call it a day; you have to charge that same company $100k for your AI tools. No model is anywhere near that today; they are having trouble convincing enterprises to pay less than $100 per employee. The current models do not math at all; the only way these investments work is if models get fundamentally better.
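To put numbers on that gap (a sketch; the $100-per-employee figure above is read here as a monthly seat price, which the comment doesn't actually specify):

    # How many paid seats does it take to bill the value of one replaced developer?
    developer_cost = 100_000    # assumed annual cost of the replaced developer
    seat_price_month = 100      # assumed monthly price per employee seat
    seat_years = developer_cost / (seat_price_month * 12)
    print(f"~{seat_years:.0f} employee-seats paying for a full year per replaced developer")  # ~83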
> The current batch of deep learning models is fundamentally a technology for labor automation. This is immensely useful in itself, without any need for AGI. The Sora2 capabilities are absolutely wild (see a great example here of what non-professional users are already able to create with it: https://www.youtube.com/watch?v=HXp8_w3XzgU )
> Looking only at video capabilities, or at coding capabilities, the technology is already positioned to automate and upend industries worth trillions in the long run.
Can Sora2 change the framing of a picture without changing the global scene? Can it change the temperature of a specific light source? Can it generate 8K HDR footage suitable for re-framing and color grading? Can it generate minute-long video without losing coherence? Actually, can it generate even a few seconds without having to re-loop from the last frame, producing those obnoxious cuts the video you linked has?
Can it reshoot the same exact scene with just one element altered?
All the video models right now are only good at making short, low-res, barely post-processable video. The kind of stuff you see on social media. And judging by the metrics on AI-generated video on social media right now, for the most part, nobody wants to look at it. They might replace the bottom of the barrel of social media posting (hello, cute puppy videos), but there is absolutely nothing indicating that they might automate or upend any real industry (be used in the pipeline, yeah, maybe, why not; automate? I won't hold my breath).
As for the argument about their future capabilities, well... for 50+ years now we've supposedly been 20 years away from fusion.
By the way, the same argument can be made for LLM and image-gen tech for any creative purpose. People severely underestimate just how much editing, re-work, purpose, and pre-production is involved in any major creative endeavor. Most models are just severely ill-suited to that work. They can be useful for some things (specifically for editing images, AI-driven image fill works decently, for example), but overall, as of right now, they are mostly good at making low-quality content. Which is fine, I guess, there is a market for it, but it was already a market that was not keen on spending money.
Qwen Image and Nano Banana can both do that with images; there's zero reason to think we can't train video models for masking.
This feels a lot like critiquing stable diffusion over hands and text, which the new SOTA models all handle well.
One of the easiest iterations on these models is to add more training cases to the benchmarks. That’s a timeline of months, not comparable to forecasting progress over 20 years like fusion.
Is it now? I don't think being able to accurately and predictably make changes to a shot, a draft, or a design is surface-level in production.
> Qwen Image and Nano Banana can both do that with images; there's zero reason to think we can't train video models for masking.
Tell them to change the tilt of the camera roughly 15 degrees to the left without changing anything else in the scene and tell me if it works.
> This feels a lot like critiquing stable diffusion over hands and text, which the new SOTA models all handle well.
"Well" does a lot of heavy lifting there.
> One of the easiest iterations on these models is to add more training cases to the benchmarks. That’s a timeline of months, not comparable to forecasting progress over 20 years like fusion.
And what if the model itself is the limiting factor? The entire tech? Do we have any proof that the current technologies might, in the future, be able to handle the cases I spoke about?
Also, one thing that I didn't mention in the first post: assume the tech does get to the point where it can be used to automate a lot of the production. If throwing a few million at a GPU cluster is enough to "generate" a relatively high-quality movie or series, the barrier to entry will be incredibly low. Costs will be driven down, the volume of production will be very high, and overall it might not be a trillion-dollar industry anymore.
The problem is that it's already commodified; there's no moat. The general tech playbook has been to capture the market by burning VC money, then jack up prices to profit. All these companies are burning billions to generate a new model, and users have already proven there is no brand loyalty. They just hop to the new one when it comes out. So no one can corner the market, and when the VC money runs out they'll have to jack up prices so much that they'll kill their market.
> The problem is that it’s already commodified; there’s no moat.
From an economy-wide perspective, why does that matter?
> users have already proven there is no brand loyalty. They just hop to the new one when it comes out.
Great, that means there might be real competition! This generally keeps prices down; it doesn't push them up! It's true that VCs may end up unhappy, but will they be able to do anything about it?
You seem to be making an implicit claim that LLMs can create an effective cancer drug "10% of the time".
Smells like complete and total bullshit to me.
Edit: @eucyclos: I don't assume that ChatGPT and LLM tools have saved cancer researchers any time at all.
On the contrary, I assume that these tools have only made these critical researchers less productive, and made their internal communications more verbose and less effective.
No, that's not the claim. The claim is that we will create a hypothetical LLM that, when tasked with a problem at the scientific frontier of molecular biology, will, about 10% of the time, correctly reason about the existing literature and reach conclusions that experts in the field would consider valid or plausible.
Let's say you run that LLM one million times and get 100,000 valid reasoning chains. Let's say among them are variations on 1,000 fundamentally new approaches and ideas; out of those, you can actually synthesize 200 new candidate compounds in the laboratory; out of those, 10 substances show a strong in-vitro response; and then one of those completely cures some cancerous mice.
There you go: you have substantially automated the intellectual work of cancer research, and you have one very promising compound, which you didn't have before AI, ready to start phase 1 trials, all without any AGI.
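The funnel arithmetic above, written out (every stage count and rate here is the hypothetical number from the comment, not a measured one):

    # Hypothetical drug-discovery funnel: from raw LLM runs to one promising compound
    runs = 1_000_000
    stages = [
        ("runs", runs),
        ("valid reasoning chains", int(runs * 0.10)),   # 10% reason correctly -> 100,000
        ("fundamentally new approaches", 1_000),
        ("synthesizable candidate compounds", 200),
        ("strong in-vitro responses", 10),
        ("cures in mice", 1),
    ]
    for stage, n in stages:
        print(f"{stage:>34}: {n:,}")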