
> Yes yes engineers make more than that blah blah but the cost would quickly jump out of control for bigger tasks.

Also, (most) engineers don't hallucinate answers. Claude still does regularly. When it does it in chat mode on a flat-rate Pro plan, I can laugh it off and modify the prompt to give it the context it clearly didn't understand, but if it's costing me very real money for the LLM to over-eagerly over-engineer an incorrect implementation of the stated feature, it's a lot less funny.



Exactly! Especially with agentic tools like Aider and Claude that are designed to pull more files into their context automatically, based on what the LLM thinks it should read. That can very quickly get out of control and result in huge context windows.

Right now, with Copilot or other fixed subscriptions, I can also laugh it off and just create a new tab with fresh context. Or if I get rate-limited because of too much token use, I can wait a day. But if these actions are tied directly to charges on my card, that becomes a lot scarier.
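To put some rough numbers on it (purely made-up prices and file sizes, not any provider's real rates), here's a quick Python sketch of why per-token billing with an agentic tool gets scary: each turn re-sends the entire accumulated context, so the bill grows roughly quadratically with the length of the session:

    # Rough cost estimate for an agentic coding session.
    # All numbers below are illustrative assumptions, not real provider pricing.
    PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000    # assumed $3 per million input tokens
    PRICE_PER_OUTPUT_TOKEN = 15.00 / 1_000_000  # assumed $15 per million output tokens
    TOKENS_PER_FILE = 2_000                     # assumed average file size in tokens
    OUTPUT_TOKENS_PER_TURN = 1_000              # assumed response length per turn

    def session_cost(turns: int, files_pulled_per_turn: int) -> float:
        """Estimate total cost when each turn re-sends the whole accumulated context."""
        total = 0.0
        context_tokens = 0
        for _ in range(turns):
            context_tokens += files_pulled_per_turn * TOKENS_PER_FILE
            total += context_tokens * PRICE_PER_INPUT_TOKEN
            total += OUTPUT_TOKENS_PER_TURN * PRICE_PER_OUTPUT_TOKEN
        return total

    # A 20-turn session that drags in 3 files per turn:
    print(f"${session_cost(20, 3):.2f}")

Doubling the session length roughly quadruples the input side of that bill; on a flat-rate plan you'd just hit a rate limit instead.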


> Also (most) engineers don't hallucinate answers.

They absolutely do; where do you think bugs come from? The base rate is typically just lower than that of current AIs.


Bugs from engineers come from a variety of causes, and most have nothing in common with an LLM hallucinating.

For example, I can’t remember seeing a PR with an API that seems plausible but never existed, or an interpretation of the spec so convoluted and edgy that you couldn’t even use sarcasm as a justification for that code.

Don’t get me wrong: some LLMs are capable of producing bugs that look like human ones, but the term “hallucinate” describes something else and doesn’t fit most human bugs.


> For example, I can’t remember seeing a PR with an API that seems plausible but never existed

A PR is code that has already been tested and refined, which is not comparable to the output of an LLM. The output of an LLM is comparable to the first, untested code you wrote based on your sometimes-vague memory of how some API works. It's not at all uncommon to forget some details of how an API works, what calls it supports, the details of the parameters, etc.


But you don't submit that rough draft with the 110% conviction that it's correct, which is what an LLM will do when it hallucinates.

It won't say "I think it should look something like this but I might be wrong," it'll say "simple! here is how you do it."

Hence hallucination and not error. It thinks it's right.


It’s kind of uncommon to know you have only a vague recollection of an API and then not go check the documentation or code to refresh your memory. That self-knowledge, being aware that you knew something but aren’t sure of the details, is exactly the thing these tools lack. So far.


Human programmers have continuous assistance on every keystroke: autocomplete, syntax highlighting, and ultimately the compilation/build step itself.

For an LLM-equivalent experience, go open notepad.exe, make substantial changes there, and then rebuild - and let the compiler tell you what your base rate of hallucinating function names and such actually is.


In the 1990s, that is closer to what making software was like. Back then, one had an even more heightened awareness of how confident one was in what one was typing. We would then go to the manual (physical, in many cases) and look it up.

And we never made up APIs, as there just weren't that many APIs. We would open the .h file for the API we were targeting as we typed into the other window. And the LLMs have ingested all the documentation and .h files (or the modern equivalent) so they don't have a real excuse.

But I use the LLMs all the time for math, and they do profusely hallucinate in a way people do not. I think it's a bit disingenuous to claim that LLMs don't have this failure mode, one that people don't really have.



