The main difference between GPT5 and a PhD-level new hire is that the new hire w...

ben_w · on Nov 14, 2024

Human interaction with peers is also guidance.

I don't know how many team meetings PhD students have, but I do know about software development jobs with 15 minute daily standups. A meeting of that length, at 120 words per minute, 5 days a week, 48 weeks per year, over a 3 year PhD, comes to 1,296,000 words.
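To make the arithmetic checkable, here is a minimal sketch of that estimate. The function name `words_spoken` and its parameters are made up for illustration; the inputs are just the figures from the comment above (15 minutes a day, 120 words per minute, 5 days a week, 48 weeks a year, 3 years).

```python
def words_spoken(minutes_per_day: int, words_per_minute: int, meeting_days: int) -> int:
    """Rough count of words spoken in a recurring daily meeting.

    Assumes someone is talking for the whole meeting at a constant rate,
    which is a simplification, not a measurement.
    """
    return minutes_per_day * words_per_minute * meeting_days


# Figures taken from the comment: 5 days/week, 48 weeks/year, 3 years.
phd_meeting_days = 5 * 48 * 3

phd_words = words_spoken(minutes_per_day=15, words_per_minute=120,
                         meeting_days=phd_meeting_days)
print(f"{phd_words:,} words")
```

Swapping in a different meeting length or speaking rate only rescales the result linearly, so the order of magnitude holds for any plausible inputs.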

eastbound · on Nov 14, 2024

I have 3 remote employees whose work is consistently as bad as an LLM's.

That means employees who use LLMs are, on average, recognizably bad. Those who are good enough are also good enough to write the code manually.

To the point I wonder whether this HN thread is generated by OpenAI, trying to create buzz around AI.

ben_w · on Nov 14, 2024

1. The person I'm replying to is hypothesising about a future, not yet existent, version, GPT5. Current quality limits don't tell you jack about a hypothetical future, especially one that may not ever happen because money.

2. I'm not commenting on the quality, because they were writing about something that doesn't exist and therefore that's clearly just a given for the discussion. The only thing I was adding is that humans also need guidance, and quite a lot of it — even just a two-week sprint's worth of 15 minute daily stand-up meetings is 18,000 words, which is well beyond the point where I'd have given up prompting an LLM and done the thing myself.

whiplash451 · on Nov 15, 2024

They definitely tell you jack. GPTs have reached their glass ceiling as they’ve sucked up all available data and overfit to benchmarks.

Their models have tons of use cases, but OpenAI and Anthropic are now in a product/commercial play.

ben_w · on Nov 16, 2024

That's one possibility.

Rumours have been in abundance since GPT-4 came out due to the lack of clarity, but that lack of clarity seems to also exist within the companies themselves.

OpenAI and Anthropic certainly seem to be doing a lot of product stuff, but at the same time the only reason people have for saying OpenAI isn't making a profit is all the money they're also spending on training new models. I've yet to use o1; it's still in beta and is only 2 months old (how long was Gmail in "beta", 5 years?)

I also don't know how much self-training they do, training on signals from the model's output and how users rate that output, only that (1) it's more than none, (2) some models like Phi-3 use at least some synthetic data[0], and (3) making a model to predict how users will rate the output was one of the previous big breakthroughs.

If they were to train on almost all their own output, then estimating API costs as approximately actual costs, and given the claimed[1] public financial statements, that's on the order of a quadrillion (1e15) tokens, compared to the mere ~1e13 claimed for some of the larger models.

[0] https://arxiv.org/abs/2404.14219

[1] I've not found the official sources, nor do I know where to look for them; all I see are news websites reporting on the numbers without giving citations I can chase up.