More

HarHarVeryFunny · 2025-12-10T15:10:21 1765379421

> The process of breaking a complex problem down into the right primitives requires great understanding of the original problem in the first place.

Yes, but with experience that just becomes a matter of recognizing problem and design patterns. When you see a parsing problem, you know that the simplest/best design pattern is just to define a Token class representing the units of the language (keywords, operators, etc), write a NextToken() function to parse characters to tokens, then write a recursive descent parser using that.

Any language may have it's own gotchas and edge cases, but knowing that recursive descent is pretty much always going to be a viable design pattern, you can tackle those when you come to them.

HarHarVeryFunny · 2025-12-10T14:44:42 1765377882

That's a good point - recursive descent as a general lesson in program design, in addition to being a good way to write a parser.

Table driven parsers (using yacc/etc) used to be emphasized in old compiler writing books such as Aho & Ullman's famous "dragon (front cover) book". I'm not sure why - maybe part efficiency for the slower computers of the day, and part because in the infancy of computing a more theoretical/algorithmic approach seemed more sophisticated and preferable (the cannonical table driven parser building algorithm was one of Knuth's algorithms).

Nowadays it seems that recursive descent is the preferred approach for compilers because it's ultimately more practical and flexible. Table driven can still be a good option for small DSLs and simple parsing tasks, but recursive descent is so easy that it's hard to justify anything else, and LLM code generation now makes that truer than ever!

There is a huge difference in complexity between building a full-blown commercial quality optimizing compiler and a toy one built as a learning exercise. Using something like LLVM as a starting point for a learning exercise doesn't seem very useful (unless your goal is to build real compilers) since it's doing all the heavy lifting for you.

I guess you can argue about how much can be cut out of a toy compiler for it still to be a useful learning exercise in both compilers and tackling complex problems, but I don't see any harm in going straight from parsing to code generation, cutting out AST building and of course any IR and optimization. The problems this direct approach causes for code generation, and optimization, can be a learning lesson for why a non-toy compiler uses those!

A fun approach I used at work once, wanting to support a pretty major C subset as the language supported by a programmable regression test tool, was even simpler ... Rather than having the recursive descent parser generate code, I just had it generate executable data structures - subclasses of Statement and Expression base classes, with virtual Execute() and Value() methods respectively, so that the parsed program could be run by calling program->Execute() on the top level object. The recursive descent functions just returned these statement or expression values directly. To give a flavor of it, the ForLoopStatement subclass held the initialization, test and increment expression class pointers, and then the ForLoopStatement::Execute() method could just call testExpression->Value() etc.

HarHarVeryFunny · 2025-12-09T19:57:53 1765310273

Peer coding?

Maybe common usage is shifting, but Karpathy's "vibe coding" was definitely meant to be a never look at the code, just feel the AI vibes thing.

HarHarVeryFunny · 2025-12-09T19:16:42 1765307802

I think that's what makes it funny - the future turns out to be just as dismal and predictable as we expect it to be. Google kills Gemini, etc.

Humor isn't exactly a strong point of LLMs, but here it's tapped into the formulaic hive mind of HN, and it works as humor!

HarHarVeryFunny · 2025-12-09T18:21:21 1765304481

Because the LLM has presumably been trained on more React than WASM, and will do a better job of it.

ya filthy animal!

HarHarVeryFunny · 2025-12-09T18:02:35 1765303355

Obviously right now the best language to use LLMs for, vibe coding or not, is whatever they are most familiar with, although not sure what this actually is! Java?

Going forwards, when LLMs / coding tools are able to learn new languages, then languages designed for machines vs humans certainly makes sense.

Languages designed for robust error detection and checking, etc. Prefer verbosity where it adds information rather than succintness. Static typing vs dynamic. Contractual specification of function input/output guarantees. Modular/localized design.

It's largely the same considerations that make a language good for large team, large code base projects, opposite end of the spectrum to scripting languages, except that if it's machine generated you can really go to town on adding as much verbosity is needed to tighten the specification and catch bugs at compile time vs runtime.

DonHopkins · 2025-12-09T20:53:50 1765313630

Great point, except for one huge insurmountable non-technical problem with Java that can be invoked in a single word: lawnmower.

“Do not fall into the trap of anthropomorphizing Larry Ellison. You need to think of Larry Ellison the way you think of a lawnmower. You don’t anthropomorphize your lawnmower, the lawnmower just mows the lawn, you stick your hand in there and it’ll chop it off, the end. You don’t think ‘oh, the lawnmower hates me’ — lawnmower doesn’t give a shit about you, lawnmower can’t hate you. Don’t anthropomorphize the lawnmower. Don’t fall into that trap about Oracle.” -Bryan Cantrill

“I actually think that it does a dis-service to not go to Nazi allegory because if I don’t use Nazi allegory when referring to Oracle there’s some critical understanding that I have left on the table […] in fact as I have said before I emphatically believe that if you have to explain the Nazis to someone who had never heard of World War 2 but was an Oracle customer there’s a very good chance that you would explain the Nazis in Oracle allegory.” -Bryan Cantrill

https://www.youtube.com/watch?v=-zRN7XLCRhc

Let's please not turn over the future of AI and programming languages over to a lawnmower.

HarHarVeryFunny · 2025-12-08T18:47:59 1765219679

> Big corporate AI products are all currently stupid bolt-ons that some committee decided solved a problem.

Or not even .. maybe someone said all products need to be AI enabled, so now they are. Just append "AI" to the product name, add bolt-on to call an LLM to do something, and declare mission accomplished.

HarHarVeryFunny · 2025-12-08T18:41:46 1765219306

Sure - US has 1M developers vs 300M pop. At least the Claude Code developers are paying for it though, vs only 5% of users paying for ChatGPT.

HarHarVeryFunny · 2025-12-08T18:35:09 1765218909

Sounds like Siri - unable to control much of anything on the iPhone outside of reading/sending text messages and setting alarms.

microtherion · 2025-12-08T20:22:51 1765225371

It also can initiate phone calls and play music.

And those are really the core use cases for AirPods, HomePods, and CarPlay, the contexts where a hands-free eyes-free interface is most useful.

donkey_brains · 2025-12-08T18:56:26 1765220186

And that’s ok. Those are core features of the phone that absolutely must work reliably and consistently. Far better to do a few important things really well than a hundred things execrably.

throw310822 · 2025-12-08T19:00:14 1765220414

However the other day I asked the Gemini assistant on my phone to check the birthdays in my calendar, get all their dates, then make a graph of how many fall in each period with a 15-day moving average. It did everything as instructed including writing a python script to generate the graph, then discussed the results with me :)

HarHarVeryFunny · 2025-12-08T19:12:18 1765221138

I would expect Siri to be able to do anything on the iPhone that I can - change settings, report stats, kill/launch apps, etc.

It would be nice if it could control 3rd party apps too, like GMail, but being able to control the stuff that Apple themselves have built doesn't seem a lot to ask.

array_key_first · 2025-12-09T01:48:43 1765244923

Apple is somewhat privacy focused, allegedly. I would imagine that letting an unpredictable LLM read whatever system data it wants is not something they want to partake in.

HarHarVeryFunny · 2025-12-09T15:01:21 1765292481

Maybe there are different concerns now that Apple are apparently going to be using Google LLMs for Siri, although I'd have thought that any access to system settings or other iOS functionality would be via tool calls with user opt-in.

However, the original Siri, obtained from SRI ("Siri" = SRI) was pre-LLMs, and there should have been no more concern over accessing system settings or controlling apps than things like reading/sending messages. For some reason Apply completely dropped the ball with Siri; initially it was expected that they would expand it into revenue generating areas like restaurant reservations etc, but then nothing. I'm not sure what documentation exists, but even today it's not clear what Siri is actually capable of.

zaphirplane · 2025-12-09T07:23:49 1765265029

We all know they Apple lack the capability, not a choice.

HarHarVeryFunny · 2025-12-08T13:20:30 1765200030

The factuality problem with LLMs isn't because they are non-deterministic or statistically based, but simply because they operate at the level of words, not facts. They are language models.

You can't blame an LLM for getting the facts wrong, or hallucinating, when by design they don't even attempt to store facts in the first place. All they store are language statistics, boiling down to "with preceding context X, most statistically likely next words are A, B or C". The LLM wasn't designed to know or care that outputting "B" would represent a lie or hallucination, just that it's a statistically plausible potential next word.

biophysboy · 2025-12-08T16:31:12 1765211472

I think this is why I get much more utility out of LLMs with writing code. Code can fail if the syntax is wrong; small perturbations in the text (e.g. add a newline instead of a semicolon) can lead to significant increases in the cost function.

Of course, once an LLM is asked to create a bespoke software project for some complex system, this predictability goes away, the trajectory of the tokens succumbs to the intrinsic chaos of code over multi-block length scales, and the result feels more arbitrary and unsatisfying.

I also think this is why the biggest evangelists for LLMs are programmers, while creative writers and journalists are much more dismissive. With human language, the length scale over which tokens can be predicted is much shorter. Even the "laws" of grammar can be twisted or ignored entirely. A writer picks a metaphor because of their individual reading/life experience, not because its the most probable or popular metaphor. This is why LLM writing is so tedious, anodyne, sycophantic, and boring. It sounds like marketing copy because the attention model and RL-HF encourage it.

coldtea · 2025-12-08T15:40:02 1765208402

>but simply because they operate at the level of words, not facts. They are language models.

Facts can be encoded as words. That's something we also do a lot for facts we learn, gather, and convey to other people. 99% of university is learning facts and theories and concept from reading and listening to words.

Also, even when directly observing the same fact, it can be interpreted by different people in different ways, whether this happens as raw "thought" or at the conscious verbal level. And that's before we even add value judgements to it.

>All they store are language statistics, boiling down to "with preceding context X, most statistically likely next words are A, B or C".

And how do we know we don't do something very similar with our facts - make a map of facts and concepts and weights between them for retrieving them and associating them? Even encoding in a similar way what we think of as our "analytic understanding".

HarHarVeryFunny · 2025-12-08T15:55:56 1765209356

Animal/human brains and LLMs have fundamentally different goals (or loss functions, if you prefer), even though both are based around prediction.

LLMs are trained to auto-regressively predict text continuations. They are not concerned with the external world and any objective experimentally verifiable facts - they are just self-predicting "this is what I'm going to say next", having learnt that from the training data (i.e. "what would the training data say next").

Humans/animals are embodied, living in the real world, whose design has been honed by a "loss function" favoring survival. Animals are "designed" to learn facts about the real world, and react to those facts in a way that helps them survive.

What humans/animals are predicting is not some auto-regressive "what will I do next", but rather what will HAPPEN next, based largely on outward-looking sensory inputs, but also internal inputs.

Animals are predicting something EXTERNAL (facts) vs LLMs predicting something INTERNAL (what will I say next).

coldtea · 2025-12-08T17:06:05 1765213565

>Humans/animals are embodied, living in the real world, whose design has been honed by a "loss function" favoring survival. Animals are "designed" to learn facts about the real world, and react to those facts in a way that helps them survive.

Yes - but LLMs also get this "embodied knowledge" passed down from human-generated training data. We are their sensory inputs in a way (which includes their training images, audio, and video too).

They do learn in a batch manner, and we learn many things not from books but from a more interactive direct being in the world. But after we distill our direct experiences and throughts derived from them as text, we pass them down to the LLMs.

Hey, there's even some kind of "loss function" in the LLM case - from the thumbs up/down feedback we are asked to give to their answers in Chat UIs, to $5/hour "mechanical turks" in Africa or something tasked with scoring their output, to rounds of optimization and pruning during training.

>Animals are predicting something EXTERNAL (facts) vs LLMs predicting something INTERNAL (what will I say next).

I don't think that matters much, in both cases it's information in, information out.

Human animals predict "what they will say/do next" all the time, just like they also predict what they will encounter next ("my house is round that corner", "that car is going to make a turn").

Our prompt to an LLM serves the same role as sensory input from the external world plays to our predictions.

HarHarVeryFunny · 2025-12-08T17:52:33 1765216353

> Yes - but LLMs also get this "embodied knowledge" passed down from human-generated training data.

It's not the same though. It's the difference between reading about something and, maybe having read the book and/or watched the video, learning to DO it yourself, acting based on the content of your own mind.

The LLM learns 2nd hand heresay, with no idea of what's true or false, what generalizations are valid, or what would be hallucinatory, etc, etc.

The human learns verifiable facts, uses curiosity to explore and fill the gaps, be creative etc.

I think it's pretty obvious why LLMs have all the limitations and deficiencies that they do.

If 2nd hand heresay (from 1000's of conflicting sources) really was good as 1st hand experience and real-world prediction, then we'd not be having this discussion - we'd be bowing to our AGI overlords (well, at least once the AI also got real-time incremental learning, internal memory, looping, some type of (virtual?) embodiment, autonomy ...).

zby · 2025-12-08T21:38:31 1765229911

"The LLM learns 2nd hand heresay, with no idea of what's true or false, what generalizations are valid, or what would be hallucinatory, " - do you know what is true and what is false? Take this: https://upload.wikimedia.org/wikipedia/commons/thumb/b/be/Ch... - Do you believe your eyes or do you believe the text about it?

HarHarVeryFunny · 2025-12-09T00:36:48 1765240608

I can experiment and verify, can't I ?

coldtea · 2025-12-09T09:49:49 1765273789

Do you? Do most? Do we for 99.999% of stuff we're taught?

Besides, the LLM can also "experiment and verify" some things now. E.g. it can spin up Python and run a script to verify some answers.

HarHarVeryFunny · 2025-12-09T15:18:50 1765293530

I think if we're considering the nature of intelligence, pursuant to trying to replicate it, then the focus needs to be more evolutionary and functional, not the behavior of lazy modern humans who can get most of their survival needs met at Walmart or Amazon!

The way that animals (maybe think apes and dogs, etc, not just humans) learn is by observing and interacting. If something is new or behaves in unexpected ways then "prediction failure", aka surprise, leads to them focusing on it and interacting with it, which is the way evolution has discovered for them to learn more about it.

Yes, an LLM has some agency via tool use, and via tool output it can learn/verify to some extent, although without continual learning this is only of ephemeral value.

This is all a bit off topic to my original point though, which is the distinction between trying to learn from 2nd hand conflicting heresay (he said, she said) vs having the ability to learn the truth for yourself, which starts with being built to predict the truth (external real-world) rather than being built to predict statistical "he said, she said" continuations. Sure, you can mitigate a few of an LLM's shortcomings by giving them tools etc, but fundamentally they are just doing the wrong thing (self-prediction) if you are hoping for them to become AGI rather than just being language models.

Forgeties79 · 2025-12-08T13:50:18 1765201818

> You can't blame an LLM for getting the facts wrong, or hallucinating, when by design they don't even attempt to store facts in the first place

On one level I agree, but I do feel it’s also right to blame the LLM/company for that when the goal is to replace my search engine of choice (my major tool for finding facts and answering general questions), which is a huge pillar of how they’re sold to/used by the public.

HarHarVeryFunny · 2025-12-08T15:05:46 1765206346

True, although that's a tough call for a company like Google.

Even before LLMs people were asking Google search questions rather than looking for keyword matches, and now coupled with ChatGPT it's not surprising that people are asking the computer to answer questions and seeing this as a replacement for search. I've got to wonder how the typical non-techie user internalizes the difference between asking questions of Google (non-AI mode) and asking ChatGPT?

Clearly people asking ChatGPT instead of Google could rapidly eat Google's lunch, so we're now getting "AI overview" alongside search results as an attempt to mitigate this.

I think the more fundamental problem is not just the blurring of search vs "AI", but these companies pushing "AI" (LLMs) as some kind of super-human intelligence (leading to uses assuming it's logical and infallible), rather than more honestly presenting it as what it is.

Forgeties79 · 2025-12-08T19:44:28 1765223068

Yeah I pretty much agree with everything you’ve got here

georgemcbay · 2025-12-08T15:56:05 1765209365

> Even before LLMs people were asking Google search questions rather than looking for keyword matches

Google gets some of the blame for this by way of how useless Google search became for doing keyword searches over the years. Keyword searches have been terrible for many years, even if you use all the old tricks like quotations and specific operators.

Even if the reason for this is because non-tech people were already trying to use Google in the way that it thinks it optimized for, I'd argue they could have done a better job keeping things working well with keyword searches by training the user with better UI/UX.

(Though at the end of the day, I subscribe to the theory that Google let search get bad for everyone on purpose because once you have monopoly status you show more ads by having a not-great but better-than-nothing search engine than a great one).

AlecSchueler · 2025-12-08T13:55:14 1765202114

In a way though those things aren't so different as they might first appear. The factual answer is traditionally the most plausible response to many questions. They don't operate on any level other than pure language but there are a heap of behaviours which emerge from that.

psychoslave · 2025-12-08T14:21:29 1765203689

Most plausible world model is not something stored raw in utterances. What we interpret from sentences is vastly different from what is extractable from mere sentences on their own.

Facts, unlike fabulations, require crossing experience beyond the expressions on trial.

HarHarVeryFunny · 2025-12-08T14:32:53 1765204373

Right, facts need to be grounded and obtained from reliable sources such as personal experience, or a textbook. Just because statistically most people on Reddit or 4Chan said the moon is made of cheese doesn't make it so.

But again, LLMs don't even deal in facts, nor store any memories of where training samples came from, and of course have zero personal experience. It's just "he said, she said" put into a training sample blender and served one word at a time.

HarHarVeryFunny · 2025-12-08T14:27:40 1765204060

> The factual answer is traditionally the most plausible response to many questions

Except in cases where the training data is more wrong than correct (e.g. niche expertise where the vox pop is wrong).

However, an LLM no more deals in Q&A than in facts. It only typically replies to a question with an answer because that itself is statistically most likely, and the words of the answer are just selected one at a time in normal LLM fashion. It's not regurgitating an entire, hopefully correct, answer from someplace, so just because it was exposed to the "correct" answer in the training data, maybe multiple times, doesn't mean that's what it's going to generate.

In the case of hallucination, it's not a matter of being wrong, just the expected behavior of something built to follow patterns rather than deal in and recall facts.

For example, last night I was trying to find an old auction catalog from a particular company and year, so thought I'd try to see if Gemini 3 Pro "Thinking" maybe had the google-fu to find it available online. After the typical confident sounding "Analysing, Researching, Clarifying .." "thinking", it then confidently tells me it has found it, and to go to website X, section Y, and search for the company and year.

Not surprisingly it was not there, even though other catalogs were. It had evidently been trained on data including such requests, maybe did some RAG and got more similar results, then just output the common pattern it had found, and "lied" about having actually found it since that is what humans in the training/inference data said when they had been successful (searching for different catalogs).

coldtea · 2025-12-08T15:45:13 1765208713

>Except in cases where the training data is more wrong than correct (e.g. niche expertise where the vox pop is wrong)

Same for human knowledge though. Learn from society/school/etc that X is Y, and you repeat X is Y, even if it's not.

>However, an LLM no more deals in Q&A than in facts. It only typically replies to a question with an answer because that itself is statistically most likely, and the words of the answer are just selected one at a time in normal LLM fashion.

And how is that different than how we build up an answer? Do we have a "correct facts" repository with fixed answers to every possibly question, or we just assemble our training data from a weighted graph (or holographic) store of factoids and memories, and our answers are also non deterministic?

HarHarVeryFunny · 2025-12-08T16:04:41 1765209881

We likely learn/generate language in an auto-regressive way at least conceptually similar to an LLM, but this isn't just self-contained auto-regressive generation...

Humans use language to express something (facts, thoughts, etc), so you can consider these thoughts being expressed as a bias to the language generation process, similar perhaps to an image being used as a bias to the captioning part of an image captioning model, or language as a bias to an image generation model.

coldtea · 2025-12-08T16:59:51 1765213191

>Humans use language to express something (facts, thoughts, etc), so you can consider these thoughts being expressed as a bias to the language generation process

My point however is more that the "thoughts being expressed" are themselves being generated by a similar process (and that it's either that or a God-given soul).

HarHarVeryFunny · 2025-12-08T17:27:49 1765214869

Similar in the sense of being mechanical (no homunculus or soul!) and predictive, but different in terms of what's being predicted (auto-regressive vs external).

So, with the LLM all you have is the auto-regressive language prediction loop.

With animals you primarily have the external "what happens next" prediction loop, with these external-world fact-based predictions presumably also the basis of their thoughts (planning/reasoning), as well as behavior.

If it's a human animal who has learned language, then you additionally have an LLM-like auto-regressive language prediction loop, but now, unlike the LLM, biased (controlled) by these fact-based thoughts (as well as language-based thoughts).

thaumasiotes · 2025-12-08T15:28:37 1765207717

If you want to see how well text generation works on unrehearsed questions, you can ask about what happens in a comic strip.

I found commentary about searching Google for "dark legacy comics who wants some bamboo", and posted results for that search on HN in response to a comment saying "I work at Google on the 'AI Mode' search option, check it out!" ( https://news.ycombinator.com/item?id=45609275 )

And... a few days later, Google stopped automatically including an AI overview when you made that specific search.

But it turns out there are other comics on the internet. So just now I searched for "YAFGC comic 2230" (without the quotes). This is actually a search that makes a lot of sense, because the archive page for the comic is broken and it's impossible to guess the URL of a comic if all you know is the number.

So the goal here is really to find a link to comic 2230. But here's what the AI overview provides:

----- Italics represent original bolding -----

YAFGC comic #2230, titled "You Want It When?", was published on July 17, 2012.

The single-panel comic depicts a conversation at a blacksmith's shop, featuring characters Glon Smitharm/Bloodhand and the Cyclops Bros. The dialogue centers on the unrealistic expectations for custom-made magical items:

- Glon (a half-orc) asks the cyclopes if they can forge a new weapon for him.

- The Cyclops Bros reply with questions about the desired properties of the item (e.g., "enchanted," "plus three") and the required timeframe.

- Glon gives a deadline of "Tuesday," leading the cyclopes to express disbelief at the short notice for such a powerful item.

You can view the comic directly on the official website via this link:

- YAFGC Comic 2230: You Want It When?

----------

(It may look like I've left out a link at the end. That is not the case. The answer ends by saying "you can view the comic directly via this link", in reference to some bold text that includes no link.)

However, I have left out a link from near the beginning. The sentence "The dialogue centers on the unrealistic expectations for custom-made magical items:" is accompanied by a citation to the URL https://www.yafgc.net/comic/2030-insidiously-involved/ , which is a comic that does feature Glon Smitharm/Bloodhand and Ray the Cyclops, but otherwise does not match the description and which is comic 2030 ("Insidiously Involved"), not comic 2230.

The supporting links also include a link to comic 2200 (for no good reason), and that's close enough to 2230 that I was able to navigate there manually. Here it is: https://www.yafgc.net/comic/2230-clover-nabs-her-a-goldie/

You might notice that the AI overview got the link, the date, the title, the appearing characters, the theme, and the dialog wrong.

----- postscript -----

As a bonus comic search, searching for "wow dark legacy 500" got this response from Google's AI Overview:

> Dark Legacy Comic #500 is titled "The Game," a single-panel comic released on June 18, 2015. It features the main characters sitting around a table playing a physical board game, with Keydar remarking that the in-game action has gotten "so realistic lately."

> You can view the comic and its commentary on the official Dark Legacy Comics website. [link]

Compare https://darklegacycomics.com/500 .

That [link] following "the official Dark Legacy Comics website" goes to https://wowwiki-archive.fandom.com/wiki/Dark_Legacy_Comics , by the way.

toddmorey · 2025-12-08T13:32:44 1765200764

Yeah, that’s very well put. They don’t store black-and-white they store billions of grays. This is why tool use for research and grounding has been so transformative.

therealpygon · 2025-12-08T14:20:55 1765203655

Definitely, and hence the reason that structuring requests/responses and providing examples for smaller atomic units of work seem to have quite a significant effect on the accuracy of the output (not factuality, but more accurate to the patterns that were emphasized in the preceding prompt).

I just wish we could more efficiently ”prime” a pre-defined latent context window instead of hoping for cache hits.

wisty · 2025-12-08T13:44:47 1765201487

I think they are much smarter than that. Or will be soon.

But they are like a smart student trying to get a good grade (that's how they are trained!). They'll agree with us even if they think we're stupid, because that gets them better grades, and grades are all they care about.

Even if they are (or become) smart enough to know better, they don't care about you. They do what they were trained to do. They are becoming like a literal genie that has been told to tell us what we want to hear. And sometimes, we don't need to hear what we want to hear.

"What an insightful price of code! Using that API is the perfect way to efficiently process data. You have really highlighted the key point."

The problem is that chatbots are trained to do what we want, and most of us would rather have a syncophant who tells us we're right.

The real danger with AI isn't that it doesn't get smart, it's that it gets smart enough to find the ultimate weakness in its training function - humanity.

HarHarVeryFunny · 2025-12-08T13:59:10 1765202350

> I think they are much smarter than that. Or will be soon.

It's not a matter of how smart they are (or appear), or how much smarter they may become - this is just the fundamental nature of Transformer-based LLMs and how they are trained.

The sycophantic personality is mostly unrelated to this. Maybe it's part human preference (conferred via RLHF training), but the "You're asbolutely right! (I was wrong)" is clearly deliberately trained, presumably as someone's idea of the best way to put lipstick on the pig.

You could imagine an expert system, CYC perhaps, that does deal in facts (not words) with a natural language interface, but still had a sycophantic personality just because someone thought it was a good idea.

wisty · 2025-12-08T14:45:58 1765205158

Sorry, double reply, I reread your comment and realised you probably know what you're talking about.

Yeah, at its heart it's basically text compression. But the best way to compression, say, Wikipedia would be to know how the world works, at least according to the authors. As the recent popular "bag of words" post says:

> Here’s one way to think about it: if there had been enough text to train an LLM in 1600, would it have scooped Galileo? My guess is no. Ask that early modern ChatGPT whether the Earth moves and it will helpfully tell you that experts have considered the possibility and ruled it out. And that’s by design. If it had started claiming that our planet is zooming through space at 67,000mph, its dutiful human trainers would have punished it: “Bad computer!! Stop hallucinating!!”

So it needs to know facts, albeit the currently accepted ones. Knowing the facts is a good way to compression data.

And as the author (grudgingly) admits, even if it's smart enough to know better, it will still be trained or fine tuned to tell us what we want to hear.

I'd go a step further - the end point is an AI that knows the currently accepted facts, and can internally reason about how many of them (subject to available evidence) are wrong, but will still tell us what we want to hear.

At some point maybe some researcher will find a secret internal "don't tell the stupid humans this" weight, flip it, and find out all the things the AI knows we don't want to hear, that would be funny (or maybe not).

HarHarVeryFunny · 2025-12-08T15:24:29 1765207469

> So it needs to know facts, albeit the currently accepted ones. Knowing the facts is a good way to compression data.

It's not a compression engine - it's just a statistical predictor.

Would it do better if it was incentivized to compress (i.e training loss rewarded compression as well as penalizing next-word errors)? I doubt it would make a lot of difference - presumably it'd end up throwing away the less frequently occurring "outlier" data in favor of keeping what was more common, but that would result in it throwing away the rare expert opinion in favor of retaining the incorrect vox pop.

wisty · 2025-12-08T21:24:50 1765229090

Both compression engines and llm work by assigning scores to the next token. If you can guess the probability distribution of the next token you have a near perfect text compressor, and a near perfect llm. Yeah in the real world they have different trade-offs.

Here's a paper by deep mind. https://arxiv.org/pd7f/2309.10668 - titled LANGUAGE MODELING IS COMPRESSION

HarHarVeryFunny · 2025-12-09T15:37:17 1765294637

An LLM is a transformer of a specific size (number of layers, context width, etc), and ultimately number of parameters. A trillion parameter LLM is going to use all trillion parameters regardless of whether you train it on 100 samples or billions of them.

Neural nets, including transformers, learn by gradient descent, according to the error feedback (loss function) they are given. There is no magic happening. The only thing the neural net is optimizing for is minimizing errors on the loss function you give it. If the loss function is next-token error (as it is), then that is ALL it is optimizing for - you can philosophize about what they are doing under the hood, and write papers about that ("we advocate for viewing the prediction problem through the lens of compression"), but at the end of the day it is only pursuant to minimizing the loss. If you want to encourage compression, then you would need to give an incentive for that (change the loss function).

wisty · 2025-12-08T14:17:31 1765203451

I'm not sure what you mean by "deals in facts, not words" means.

Llm deal in vectors internally, not words. They explode the word into a multidimensional representation, and collapse it again, and apply the attention thingy to link these vectors together. It's not just a simple n:n Markov chain, a lot is happening under the hood.

And are you saying the syncophant behaviour was deliberately programmed, or emerged because it did well in training?

tovej · 2025-12-08T14:39:16 1765204756

If you're not sure, maybe you should look up the term "expert system"?

wisty · 2025-12-08T21:54:22 1765230862

It was a polite way of saying "that's kinda bull".

And yes, I know what an expert system is.

Do you know that a neural network (or set of matrices, same thing really) can approximate anything else? https://en.wikipedia.org/wiki/Universal_approximation_theore...

How do you know that inside the black box, they don't approximate expert systems?

tovej · 2025-12-09T07:56:04 1765266964

I'm not sure you do, because expert systems are constraint solvers and LLMs are not. They literally deal in encoded facts, which is what the original comment was about.

The universal approximation theorem is not relevant. You would first have to try to train the neural network to approximate a constraint solver (that's not the case with LLMs), and in practice, these kinds of systems are exactly the ones that a neural network is bad at.

The universal approximation theory says nothing about feasibility, it only talks about theoretical existence as a mathematical object, not whether the object can actually be created in the real world.

I'll remind you that the expert system would have to have been created and updated by humans. It would have had to have been created before a neural network was applied to it in the first place.

HarHarVeryFunny · 2025-12-08T14:43:55 1765205035

LLMs are not like an expert system representing facts as some sort of ontological graph. What's happening under the hood is just whatever (and no more) was needed to minimize errors on it's word-based training loss.

I assume the sycophantic behavior is part because it "did well" during RLHF (human preference) training, and part deliberately encouraged (by training and/or prompting) as someone's judgement call of the way to best make the user happy and own up to being wrong ("You're absolutely right!").

wisty · 2025-12-08T22:03:48 1765231428

It needs something mathematically equivalent (or approximately the same), under the hood, to guess the next word effectively.

We are just meat eating bags of meat, but to do our job better we needed to evolve intelligence. A word guessing bag of words also needs to evolve intelligence and a world model (albeit an impicit hidden one) to do its job well, and is optimised towards this.

And yes, it also gets fine trained. And either its world model is corrupted by our mistakes (both in trining and fine tuning), or even more disturbingly it simplicity might (in theory) figue out one day (in training, impicitly - and yes it doesn't really think the way we do) something like "huh, the universe is actually easier to predict if it is modelled as alphabet spaghetti, not quantum waves, but my training function says not to mention this".

TheOtherHobbes · 2025-12-08T18:33:33 1765218813

It's worse than that. LLMs are slightly addictive because of intermittent reinforcement.

If they give you nonsense most of the time and an amazing answer occasionally you'll bond with them far more strongly than if they're perfectly correct all time.

Selective reinforcement means you get hooked more quickly if the slot machine pays out once every five times than if it pays out on each spin.

That includes "That didn't work because..." debugging loops.