LLMs are orders of magnitude simpler than brains, and we literally designed them from scratch. Also, we have full control over their operation and we can trace every signal.
Are you surprised we understand them better than brains?
We've been studying brains a lot longer. LLMs are grown, not built. The part that is designed is the low-level architecture, but what training builds from that is incomprehensible and unplanned.
LLMs draw their origins from both n-gram language models (ca. 1990s) and neural networks and deep learning (ca. 2000). So we've only had really good ones for maybe 6-8 years or so, but the roots of the study go back 30 years at least.
Psychiatry, psychology, and neurology, on the other hand, are really only about 150 years old. Before that, there wasn't enough information about the human body to be able to study the brain, let alone the resources or biochemical knowledge necessary to understand it or do much of anything with it.
So, sure, we've studied it longer. But only 5 times longer. And, I mean, we've studied language, geometry, and reasoning for literally thousands of years. Markov chains are like 120 years old, so older than computer science, and you need those to make an LLM.
And if you think we went down some dead-end directions with language models in the last 30 years, boy, have I got some bad news for you about how badly we botched psychiatry, psychology, and neurology!
Embedding "meaning" in vector spaces goes back to 1950s structuralist linguistics and early information retrieval research. There is a nice overview in the draft of the 3rd edition of Speech and Language Processing: https://web.stanford.edu/~jurafsky/slp3/5.pdf
You are still talking about low-level infrastructure. This is like studying neurons only from a cellular biology perspective and then trying to understand language acquisition in children. It is very clear from recent literature that the emergent structure and behavior of LLMs are absolutely a new research field.
"Designed" is a bit strong. We "literally" couldn't design programs to do the interesting things LLMs can do. So we gave a giant for loop a bunch of data and a bunch of parameterized math functions and just kept updating the parameters until we got something we liked.... even on the architecture (ie, what math functions) people are just trying stuff and seeing if it works.
> We "literally" couldn't design programs to do the interesting things LLMs can do.
That's a bit of an overstatement.
The entire field of ML is aimed at problems where deterministic code would work just fine, but the number of cases it would need to cover is too large to be practical (note that this is a matter of cost, not of the design being impossible) AND there's a sufficient corpus of data that allows plausible enough models to be trained. So we accept the occasionally questionable precision of ML models over the huge time and money costs of engineering these kinds of systems the traditional way. LLMs are no different.
Saying ML is a field where deterministic code would work just fine conveniently leaves out the difficult part: writing the actual code... which we haven't been able to do for most of the tasks at hand.
It is impossible to design even in a theoretical sense if the requirements include matters such as performance and energy consumption. If you have to write petabytes of code, you also have to store and execute it.
I'm a psychiatry resident who has been into ML since... at least 2017. I even contemplated leaving medicine for it in 2022 and studied for that, before realizing that I'd never become employable (because I could already tell the models were improving faster than I was).
You would be sorely mistaken to think I'm utterly uninformed about LLM research, even if I would never dare to claim to be a domain expert.
Are you surprised we understand them better than brains?