The price at the pump affects not only a voter's commuter car but also every truck that delivers goods across the US, so it may have a much larger knock-on effect.
OTOH the US is the largest oil producer in the world [1]. In theory the US could keep domestic prices in check, but that would require rather drastic administrative pressure, likely only legal in wartime.
This has to be intentional, right? To reassure people that front-end developers still have a job? The data is interesting but the site itself is a complete embarrassment for several reasons.
I think what you're describing is what people working with recommender systems call serendipity. Maximizing serendipity, while maintaining relatively high relevance/recommendation success rate, is supposedly a pretty difficult problem to solve. I'm not sure if LLMs have changed that.
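For anyone curious what that trade-off looks like mechanically, here's a minimal re-ranking sketch; the scores, the `alpha` weight, and the item data are all hypothetical stand-ins for whatever a real recommender would compute:

    # Minimal sketch of serendipity-aware re-ranking (all values hypothetical).
    # relevance: how well an item matches the user's known tastes.
    # unexpectedness: how far it sits from what the user usually sees.
    def rerank(candidates, alpha=0.7):
        # Blend relevance with unexpectedness; alpha sets the trade-off.
        return sorted(
            candidates,
            key=lambda c: alpha * c["relevance"] + (1 - alpha) * c["unexpectedness"],
            reverse=True,
        )

    items = [
        {"id": "safe_pick", "relevance": 0.9, "unexpectedness": 0.1},
        {"id": "wild_card", "relevance": 0.5, "unexpectedness": 0.9},
        {"id": "long_shot", "relevance": 0.2, "unexpectedness": 0.8},
    ]
    print([item["id"] for item in rerank(items)])

Raising alpha gives you a safer, more relevant feed; lowering it surfaces more surprises at the cost of hit rate, which is exactly the tension that makes the problem hard.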
This will sound snarky, so forgive me, but I honestly don't know the answer. Is this actually true? Is there a reliable source with market-wide statistics on LLM compute usage, broken down into training vs. inference?
I don't understand why people don't just use Gemini or some other AI web search to get a quick answer to these kinds of questions. (I've excluded the sources; you can get them by asking the same question.)
> While AI training is often the most intense and expensive process for a single model, the majority of total AI compute usage (approximately 90%) is used for inference.
> Here is the breakdown of why this is the case:
> Inference as High-Volume Activity: Inference occurs every time a user interacts with an AI model (e.g., asking ChatGPT a question, using image recognition, or generating code). While a model is trained once (or updated infrequently), it runs millions or billions of inferences continuously.
> Cost Scaling: Training is a massive, one-time upfront cost, while inference is an ongoing, daily operational cost. As the number of AI users grows, the demand for inference compute scales faster than the need for training new, large models.
> The Shift to Efficiency: While early AI hype focused on the immense compute needed for training, the industry has shifted toward making inference cheaper and faster through specialized hardware and techniques like optimization, quantization, and small language models (SLMs).
And I finally figured out how to get links to answers instead of just inlining the content as before. Anyway, there it is. We live in a time where a question like "Does inference or training use more compute?" can be answered quickly by pasting it into a search box.
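To make the cost-scaling point from that answer concrete, here's a back-of-envelope sketch; every number in it is an illustrative assumption, not a measurement:

    # Back-of-envelope: one-time training compute vs. ongoing inference compute.
    # All figures below are illustrative assumptions, not real measurements.
    TRAINING_FLOP = 1e25     # assumed one-time cost to train a frontier model
    FLOP_PER_QUERY = 1e15    # assumed compute per inference request
    QUERIES_PER_DAY = 1e9    # assumed global daily request volume

    daily_inference_flop = FLOP_PER_QUERY * QUERIES_PER_DAY  # 1e24 FLOP/day
    days_to_match = TRAINING_FLOP / daily_inference_flop
    print(f"Inference matches the training run after {days_to_match:.0f} days")
    # -> 10 days under these assumptions; everything after that tilts the
    #    lifetime total toward inference, which is where figures like
    #    "~90% of compute is inference" come from.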
The revenue numbers are public for the major AI companies. That's probably the best estimate for "inference for the whole market" we have, since most of that inference is billed in either API usage or subscriptions, and it won't include any in-house usage such as training.
You obviously don't believe that AGI is coming in two release cycles, and you also don't seem to have much faith that the new models will contain massive improvements over the last ones. So the answer to "who is going to pay for these custom chips?" seems to be: you.
If benchmarks are fishy, the bias should produce better-than-expected scores for proprietary models, since their vendors have the stronger incentive to game the benchmarks.
Is "the end of the exponential" an established expression? There's no singularity in an exponential so the expression doesn't make sense to me. To me, it sounds like "the end of the exponential part", meaning it's a sigmoid, but that's obviously not what he means.
I’m guessing that Amodei meant it as a humorous inside joke.
It’s also shorthand for “the end of massive R&D capex” and “the transition to market capture”. The final stage, what McKinsey types call “harvesting”, is probably not on Amodei’s radar. Based on what I’ve seen of his public persona, he would see it as too philistine and hand it off to another custodial exec.
>To me, it is absolutely wild that you have people — within the bubble and outside the bubble — talking about the same tired, old hot-button political issues, when we are near the end of the exponential.
My interpretation is "It's pointless to discuss the old political issues, because they're not going to be relevant once AGI is achieved". So if he does believe in a plateau, it either contradicts his other prediction (that AGI will be reached in a year or two), or he believes the plateau will come after AGI is already reached, which makes it a fairly pointless statement. The important thing w.r.t. all our problems being solved would be the advent of AGI, not the plateau.
I took the “end” to mean the part of the exponential where it quickly trends towards infinity. Say the x axis is time (over which you get more training data and more compute) and the y axis is model ability. So far, if we are at the beginning of the exponential, adding data/compute looks almost linear to the untrained eye in terms of model capability. But once you hit a threshold where he thinks the model will start to generalize, a small amount of additional data/compute will result in a massive increase in model ability.
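You can see that “looks linear until it doesn't” effect with a trivial sketch; the growth rate here is an arbitrary assumption, chosen only to show the shape:

    import math

    # An exponential has no actual threshold; early steps just look flat
    # next to later ones. The 0.1 growth rate is an arbitrary assumption.
    for x in range(0, 101, 10):
        print(f"x={x:3d}  e^(0.1x) = {math.exp(0.1 * x):10.1f}")
    # The first few steps each add a handful of units; the last step adds
    # thousands, even though the growth rule never changed.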
>Maybe what surprised me most is that the mistakes NanoBananna made are simple enough that I'm absolutely positive Karpathy could have caught them. Even if his physics is very rusty. I'm often left wondering if people really are true believers and becoming blind to the mistakes or if they don't care.
I've seen this interesting phenomenon many times. I think it's a kind of subconscious bias. I call it "GeLLMann amnesia".