Is it possible that the coffee drinkers have more social interaction with the barista and others? It's unclear from the paper if they eliminated the confounding factors around coffee drinking.
What other social interactions are needed more than: "One flat white to go, please", and "Thank you"? Asking genuinely, because I don't know what else I can say.
I usually make coffee at home, but barista turnover is remarkably low in my area. When I do go to a coffee shop (there are 2-3 that I might go to) there’s a good chance I’ll recognize the barista and that they’ll also remember me. In one such case I’ve been seeing the same one for close to a decade, and we always chat for a bit.
I think most baristas who do it for more than a year or two learn to not primarily be a coffee factory but first to make a positive impact on the people they see. The coffee is something that can be made consistent (and in a way, boring) through practice, but personal connection, especially when it is genuine, has a real draw.
Lots of things. “Could I have some sugar, please; two frappy mochachos, one with almond milk; can you explain what all these options are, please; what the hell is mushroom powder?” In today’s coffee shops this can lead to hours of complex social interaction at the counter, enriching our lives and ultimately extending our lifespans (sorry, couldn’t resist). In seriousness, I actually find this conversation interesting. Some coffee shops do have quite a social culture around them, though I think they’re outliers on the whole. Here in Spain it’s a mix, but in some it is like everyone’s friends with the barista.
Most programmers don't understand low-level assembly or machine code. The high-level language becomes the layer where human comprehension and collaboration happen.
LLMs are pushing that layer towards natural language and spec-driven development. The only *big* difference is that high-level programming languages are still deterministic but natural language is not.
I'm guessing we've reached an irreducible point where the amount of information needed to specify the behavior of a program is nearly optimally represented in programming languages after decades of evolution. More abstraction into the natural-language realm would make it lossy, and less abstraction down to low-level code would make it verbose.
The difference is not just a jump to a higher abstraction with natural language. It's something fundamentally different.
The previous tools (assemblers, compilers, frameworks) were built on hard-coded logic that can be checked and even mathematically verified. So you could trust what you're standing on. But with LLMs we jump off the safely-built tower into a world of uncertainty, guesses, and hallucinations.
If LLMs still produce code that is eventually compiled down to a very low level... that would mean it can be checked and verified; the process just has additional steps.
JavaScript has a ton of behavior that is very surprising at times, and I'm sure many JS developers would agree that trusting what you're standing on can be difficult. There is also a large percentage of developers who don't mathematically verify their code, so the verification is kind of moot in those cases; hence bugs.
The current world of LLM code generation lacks the verification you are looking for; however, I am guessing that these tools will soon emerge in the market. For now, building as incrementally as possible and having good tests seems to be a decent path forward.
There are four important components to describing a compiler: the source language, the target language, and the meaning (semantics, in compiler-speak) of each of those languages.
We call a C->asm compiler "correct" if the meaning of every valid C program turns into an assembly program with equivalent meaning.
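Spelled out a bit more formally (just a restatement of that definition, writing sem_C and sem_asm for the meanings the two languages assign to programs):

```latex
% Correctness of a compiler  compile : C -> asm
% with source semantics sem_C and target semantics sem_asm:
\forall p \in \text{valid C programs}:\quad
  \mathrm{sem}_{\mathrm{asm}}\big(\mathrm{compile}(p)\big) \;=\; \mathrm{sem}_{C}(p)
```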
The reason LLMs don't work like other compilers is not that they're non-deterministic, it's that the source language is ambiguous.
LLMs can never be "correct" compilers, because there's no definite meaning assigned to English. Even if English had precise meaning, LLMs will never be able to accurately turn any arbitrary English description into a C program.
Imagine how painful development would be if compilers produced incorrect assembly for 1% of all inputs.
English does have precise meaning, if constructed to be precise; the issue is that LLMs do not assign meaning in the way humans assign meaning. Humans assign English meaning to code every day just fine, and sometimes it does result in bugs as well.
The LLM in this loop is the equivalent of a human, which also has ambiguous source language if we’re going by your theory of English being ambiguous. So it sounds like you’re saying that if a human produces a C program, it is not verifiable and testable because the human used an ambiguous source language?
I guess for some reason people thought I meant that the compiler would be LLM > machine code, where actually I meant the compiler would still be whatever language the LLM produces down to machine code. It's just that the language the LLM produces can be checked through things like TDD or a human, etc...
I would probably agree! I came off sounding as if there is no human in the loop. What I meant is that the input is still the programming language that is produced and the output is the result, not that the LLM is the initial input. A human in the loop can clean up the code produced or create tests that check for an end result (or intermediate results as well).
I understand that an input to an LLM will create a different result in many cases, making the output non-deterministic, but that doesn’t mean we can’t use probability to arrive at results eventually.
I mean the things _producing_ the code can be checked and verified, meaning the code generated is guaranteed to be correct. You're talking about verifying the code _produced_. That's the big difference.
Would be curious as to how you check and verify LLMs? And how you get guaranteed correct code?
Verifying the code produced is a much simpler task, at least for some code, because I, as a human, can look at a generated snippet, reason about it, and determine if it is what I want. I can also create tests to say “does this code have this effect on some variable” and then proceed to run the test.
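A minimal sketch of what I mean, in Rust for concreteness (the function name and the discount rule are made up for the example; the point is just that the check is executable no matter how the code was produced):

```rust
// Hypothetical LLM-generated function under test. The name and the
// "10% off orders of $100 or more" rule are assumptions for illustration.
fn apply_discount(total_cents: u64) -> u64 {
    if total_cents >= 10_000 {
        total_cents - total_cents / 10
    } else {
        total_cents
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // "Does this code have this effect on some variable?" as an executable check.
    #[test]
    fn discount_applies_at_threshold() {
        assert_eq!(apply_discount(10_000), 9_000);
    }

    #[test]
    fn no_discount_below_threshold() {
        assert_eq!(apply_discount(9_999), 9_999);
    }
}
```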
> Most programmers don't understand low-level assembly or machine code.
Most programmers that write JavaScript for a living don't really understand how to scale applications in JavaScript, which includes data structures in JavaScript. There is a very real dependence on layers of abstractions to enable features that can scale. They don't understand the primary API to the browser, the DOM, at all and many don't understand the Node API outside the browser.
For an outside observer it really raises the Office Space question: What would you say you do here? It's weird trying to explain it to people completely outside software. For the rest of us in software, we are so used to this that we take the insanity for granted as an inescapable reality.
Ironically, at least in terms of your comment, when you confront JavaScript developers about this lack of fundamental knowledge, comparisons to assembly frequently come up. As though writing JavaScript directly were somehow equivalent to writing machine code, when for many people in that line of work both are equally distant realities.
The introduction of LLMs makes complete sense. When nobody knows how any of this code works, there isn't any harm in letting a machine write it for you, because there isn't a difference in the underlying awareness.
> Most programmers that write JavaScript for a living don't really understand how to scale applications in JavaScript, which includes data structures in JavaScript. There is a very real dependence on layers of abstractions to enable features that can scale.
Although I'm sure you are correct, I would also want to mention that most programmers that write JavaScript for a living aren't working for Meta or Alphabet or other companies that need to scale to billions, or even millions, of users. Most people writing JavaScript code are, realistically, going to have fewer than ten thousand users for their apps. Either because those apps are for internal use at their company (such as my current project, where at most the app is going to be used by 200-250 people, so although I do understand data structures I'm allowing myself to do O(N^2) business logic if it simplifies the code, because at most I need to handle 5-6 requests per minute), or else because their apps are never going to take off and get the millions of hits that they're hoping for.
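To put rough numbers on that O(N^2) point (back-of-envelope, assuming the worst case of a request touching all ~250 records):

```latex
% Why O(N^2) is fine at this scale:
N \approx 250 \;\Rightarrow\; N^2 \approx 62{,}500 \text{ operations per request}
% Even at only 10^7 simple operations per second in slow interpreted code,
% that is a few milliseconds, and there are only 5-6 requests per minute.
```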
If you don't need to scale, optimizing for programmer convenience is actually a good bet early on, as it tends to reduce the number of bugs. Scaling can be done later. Now, I don't mean that you should never even consider scaling: design your architecture so that it doesn't completely prevent you from scaling later on, for example. But thinking about scale should be done second. Fix bugs first, scale once you know you need to. Because a lot of the time, You Ain't Gonna Need It.
Your reasoning is absolutely correct. But different developers have different baselines of code quality, so some developers, even doing sloppy work, will produce faster, safer software than others trying their best and still creating the crappiest code (which sometimes works too well, makes a lot of money, and needs to be fixed, but that usually means success).
A side effect of the non-deterministic behaviour is that, unlike with previous increases in abstraction, the high-level prompts are not checked into the code base and available to recreate their low-level output on demand. Instead we commit the lower-level output (i.e. code), and future revisions must operate on this output without the ability to modify the original high-level instructions.
I feel like natural language specs can play a role, but there should be an intermediate description layer with strict semantics.
Case in point: I'm seeing much more success in LLM-driven coding with Rust, because the strong type system prevents many invalid states that can occur in more loosely typed or untyped languages.
It takes longer, and often the LLM has to iterate through `cargo check` cycles to get to a state that compiles, but once it does the changes are very often correct.
The Rust community has the saying "if it compiles, it probably works". You can still have plenty of logic bugs, of course, but the domain of possible mistakes is smaller.
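A toy example of what that buys you (my own illustration, not from any real project): encode the states as an enum so that invalid combinations are unrepresentable, and `cargo check` rejects code, LLM-written or not, that tries to construct or ignore them.

```rust
// "Disconnected but holding a session token" simply cannot be expressed,
// so a whole class of invalid states is ruled out before anything runs.
enum Connection {
    Disconnected,
    Connecting { attempt: u32 },
    Connected { session_token: String },
}

fn describe(conn: &Connection) -> String {
    // The compiler also forces every variant to be handled here.
    match conn {
        Connection::Disconnected => "offline".to_string(),
        Connection::Connecting { attempt } => format!("retrying (attempt {attempt})"),
        Connection::Connected { session_token } => format!("online, session {session_token}"),
    }
}

fn main() {
    println!("{}", describe(&Connection::Connecting { attempt: 3 }));
}
```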
What would be ideal is a very strict (logical) definition of application semantics that LLMs have to implement, and that ideally can be checked against the implementation. As in: have a very strict programming language with dependent types, littered with pre/post conditions, etc.
LLMs can still help to transform natural language descriptions into a formal specification, but that specification should be what drives the implementation.
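As a rough sketch of the direction (using plain `debug_assert!` as a stand-in for real dependent types or a contracts system; the function and its spec are invented for the example):

```rust
/// Spec the implementation has to satisfy: "withdraw never overdraws
/// the account, and money is conserved". Here the pre/post conditions
/// are runtime assertions; a dependently-typed or contract-checked
/// language would verify them statically against the implementation.
fn withdraw(balance: u64, amount: u64) -> u64 {
    // Precondition: the account can cover the withdrawal.
    debug_assert!(amount <= balance, "precondition violated: overdraw");

    let new_balance = balance - amount;

    // Postcondition: money is conserved.
    debug_assert_eq!(new_balance + amount, balance, "postcondition violated");
    new_balance
}

fn main() {
    println!("{}", withdraw(100, 30)); // prints 70
}
```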
There is another big difference: natural languages have ambiguity baked in. If a programming language has any ambiguity in how it can be parsed, that is rightly considered a major bug. But it's almost a feature of natural languages, allowing poetry, innuendo, and other nuanced forms of communication.
There are constructed languages that preserve the expressivity of natural human languages but without the implicit ambiguity, though; most notably, Loglan and its successor Lojban. If you read Golden Age sci-fi, Loglan sometimes shows up there specifically in this role, e.g. in "The Moon Is a Harsh Mistress":
> By then Mike had voder-vocoder circuits supplementing his read-outs, print-outs, and decision-action boxes, and could understand not only classic programming but also Loglan and English, and could accept other languages and was doing technical translating—and reading endlessly. But in giving him instructions was safer to use Loglan. If you spoke English, results might be whimsical; multi-valued nature of English gave option circuits too much leeway.
For those unfamiliar with it, it's not that Lojban is perfectly unambiguous. It's that its design strives to ensure that ambiguity is always deliberate by making it explicit.
The obvious problem with all this is that Lojban is a very niche language with a fairly small corpus, so training AI on it is a challenge (although it's interesting to note that existing SOTA models can read and write it even so, better than many obscure human languages). However, Lojban has the nice property of being fully machine parseable - it has a PEG grammar. And, once you parse it, you can use dictionaries to construct a semantic tree of any Lojban snippet.
When it comes to LLMs, this property can be used in two ways. First, you can use structured output driven by the grammar to constrain the model to output only syntactically valid Lojban at any point. Second, you can parse the fully constructed text once it has been generated, add semantic annotations, and feed the tree back into the model to have it double-check that what it ended up writing means exactly what it wanted to mean.
With SOTA models, in fact, you don't even need the structured output - you can just give them the parser as a tool and have them iterate. I did that with Claude and had it produce Lojban translations that, while not perfect, were very good. So I think that it might be possible, in principle, to generate Lojban training data out of other languages, and I can't help but wonder what would happen if you trained a model primarily on that; I suspect it would reduce hallucinations and generally improve metrics, but this is just a gut feel. Unfortunately this is a hypothesis that requires a lot of $$$ to properly test...
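The shape of that parser-in-the-loop setup is simple enough to sketch (in Rust here; `generate_candidate` and `parse_lojban` are hypothetical stand-ins for an LLM call and a PEG-based Lojban parser, not real APIs):

```rust
struct ParseError {
    message: String,
}

// Placeholder for an actual model call; in practice this would send the
// prompt plus any parser feedback to the LLM and return its attempt.
fn generate_candidate(prompt: &str, feedback: Option<&str>) -> String {
    let _ = (prompt, feedback);
    "mi prami do".to_string()
}

// Placeholder for a real PEG grammar check of the candidate text.
fn parse_lojban(text: &str) -> Result<(), ParseError> {
    let _ = text;
    Ok(())
}

fn translate_to_lojban(prompt: &str) -> Option<String> {
    let mut feedback: Option<String> = None;
    for _ in 0..5 {
        let candidate = generate_candidate(prompt, feedback.as_deref());
        match parse_lojban(&candidate) {
            // Grammatically valid: accept (a semantic double-check could go here).
            Ok(()) => return Some(candidate),
            // Invalid: hand the parser error back to the model and retry.
            Err(e) => feedback = Some(e.message),
        }
    }
    None
}

fn main() {
    if let Some(text) = translate_to_lojban("Translate 'I love you' into Lojban") {
        println!("{text}");
    }
}
```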
The nature of programming might have to shift to embrace the material properties of LLMs. It could become a more interpretative, social, and discovery-based activity. Maybe that's what "vibe coding" would eventually become.
> The nature of programming might have to shift to embrace the material properties of LLMs. It could become a more interpretative, social, and discovery-based activity. Maybe that's what "vibe coding" would eventually become
This sounds like an unmaintainable, tech debt nightmare outcome to me
C has a lot of ambiguity in what its programs actually mean ("undefined behavior"), but people usually view that as a benefit because it allows compilers more freedom to dictate an implementation.
It's not the same. There is an explosion in expressiveness/ambiguity in the step from high-level programming languages to natural languages. This "explosion" doesn't exist in the steps between machine code and assembly, or assembly and a high-level programming language.
It is, for example, possible to formally verify or do 100% exhaustive testing as you go lower down the stack. I can't imagine this would be possible between NLs and PLs.
> The only big difference is that high-level programming languages are still deterministic but natural language is not.
Arguably, determinism isn't everything in programming: It's very possible to have perfectly deterministic, yet highly surprising (in terms of actual vs. implied semantics to a human reader) code.
In other words, the axis "high/low level of abstraction" is orthogonal to the "deterministic/probabilistic" one.
Yes, but determinism is still very important in this case. It means you only need to memorize the surprising behavior once (like literally every single senior programmer has memorized their programming language's quirks, even if they don't want to).
Without determinism, learning becomes less rewarding.
Somehow many very smart AI entrepreneurs do not understand the concept of limits to lossless data compression. If an idea cannot be reduced further without losing information, no amount of AI is going to be able to compress it.
This is why you see so many failed startups around Slack/email/Jira efficiency.
Half the time you do not know if you missed critical information, so you need to go to the source, negating the gains you got from the information that was successfully summarized.
Downloading music off the internet is just the next logical step after taping songs off the radio. Cassette tapes didn't really affect the music industry, so I wouldn't worry about this whole Napster thing.
That article articulated the reason slightly differently, arguing you need to hold multiple concepts in your head at the same time in order to develop original ideas.
Still, I'm not sure you have to remember everything, but I agree you have to remember the foundational things at the right abstraction layer, upon which you are trying to synthesize something new.