More

gloosx · 2025-12-09T20:10:38 1765311038

of course LLM doesn't experience or feel anything. To experience or feel something requires a subject, and LLM is just a tool, thing, an object.

It's just a statistical machine which excels at unrolling coherent sentences but it doesnt "know" what the words mean in a human-like, experienced sense. It just mimics human language patterns prioritising producing plausible-sounding, statistically likely text over factual truth, which is apparently enough to fool someone into believing it is a sentient being or something

gloosx · 2025-12-09T06:21:49 1765261309

From the creators of the 10x engineer:

0.1x engineering cost

gloosx · 2025-12-08T18:35:04 1765218904

>We are trying to fix probability with more probability. That is a losing game.

>We need to re-introduce Determinism into the stack.

>If it fails lets inject more prompts but call it "rules" and run the magic box again

Bravo.

gloosx · 2025-12-08T10:19:03 1765189143

>Outside of buying sex and drugs the only uses for cryptocoins are, and always has been, ransoms, scams and gambling

That is a very shallow take. There are real non-criminal uses for crypto that people in stable, wealthy countries often overlook. Millions rely on it simply to move money between family members across borders when traditional banking is slow, blocked, or outright inaccessible due to politics. In several countries, people use crypto to buy food, medicine, or basic goods because their local currency is collapsing or their banking system is dysfunctional, or their entire nation has been cut off from the global financial system as a decision of few politics persons.

Its fine to criticise the scams and speculation — there is plenty of that — but pretending thats the only use case ignores the people who depend on it for everyday survival.

FabHK · 2025-12-08T16:20:18 1765210818

> the people who depend on it for everyday survival

Oh, my, all those poor people that died prior to 2009.

gloosx · 2025-12-09T13:46:29 1765287989

Reducing real human struggles to a punchline is exactly the kind of cynical detachment you can afford only if you have never lived through a failed banking system. If you had lived through what people in some countries deal with, you would not be making snarky comments

The reality is that crypto became a lifeline in places where the traditional financial system collapsed or simply abandoned people: Venezuela, Argentina, Lebanon, Nigeria are good examples of people dealing with real crises, using whatever tools actually work, including crypto currencies.

AnthonyMouse · 2025-12-08T21:54:58 1765230898

Are you disputing that people in impoverished or mismanaged countries die of hunger or preventable diseases when they can't buy food or medicine?

bhickey · 2025-12-08T12:09:18 1765195758

And the Glock switch is useful for home defense.

gloosx · 2025-12-03T20:28:19 1764793699

Of course you do, in certain cases making less round-trips to the server is just straight more efficient

gloosx · 2025-12-03T10:23:02 1764757382

It is always "I'm producing 300 projects in a nanosecond" but it's almost never about sharing or actually deploying these ;)

DoctorOW · 2025-12-03T11:13:19 1764760399

The problem I had that the larger your project gets, the more mistakes Claude makes. I (not a parent commenter) started with a basic CRUD web app and was blown away by how detailed it was, new CSS, good error handling, good selection and use of libraries, it could even write the terminal commands for package management and building. As the project grew to something larger Claude started forgetting that some code already existed in the project and started repeating itself, and worse still when I asked for new features it would pick a copy at random leaving them out of sync with eachother. Moving forward I've been alternating between writing stuff with AI, then rewriting it myself.

HarHarVeryFunny · 2025-12-03T14:29:46 1764772186

> The problem I had that the larger your project gets, the more mistakes Claude makes

I think the reason for this is because these systems get all their coding and design expertise from training, and while there is lots of training data available for small scale software (individual functions, small projects), there is much less for large projects (mostly commercial and private, aside from a few large open source projects).

Designing large software systems, both to meet initial requirements, and to be maintainable and extensible over time, is a different skill than writing small software projects, which is why design of these systems is done by senior developers and systems architects. It's perhaps a bit like the difference between designing a city and designing a single building - there are different considerations and decisions being made. A city is not just a big building, or a collection of buildings, and large software system is not just a large function or collection of functions.

commakozzi · 2025-12-03T20:09:06 1764792546

Yeah, great analogy. Thanks!

chrisweekly · 2025-12-03T15:17:27 1764775047

Well said, and good analogy.

adastra22 · 2025-12-03T17:30:15 1764783015

Have it produce CLAUDE.md files in every directory giving a summary of what code is where, and a system directive to keep these updated.

delaminator · 2025-12-03T15:38:02 1764776282

I’ve got Claude on 10k loc projects, which is probably mid range.

But I also have it document and summarise its own work.

collinmanderson · 2025-12-04T16:02:33 1764864153

> I also have it document and summarise its own work.

Could you share some of your prompts or CLAUDE.md? I'm still learning what works.

cpursley · 2025-12-03T13:41:28 1764769288

So, just like a human on a growing codebase?

sahn44 · 2025-12-03T13:33:14 1764768794

Here's mine fully deployed, https://hackernewsanalyzer.com/. I use it daily and have some users. ~99.7% LLM code. About 1 hour to first working prototype then another 40 hours to get it polished and complete to current state.

gloosx · 2025-12-03T20:29:52 1764793792

It shows, quite an interesting wrapper over GPT with unauthorized access to prompting it you assembled there ;) Very much liked the part where it makes 1000 requests pulling 1000 comments from the firebase to the client and then shoots them back to GPT via supabase

take care

beepbooptheory · 2025-12-03T14:30:12 1764772212

To be clear, roughly 39.8 hours of just prompting and output to make this website?

sahn44 · 2025-12-03T16:21:47 1764778907

41 hours total of prompting, looking at code diffs, reverting, reprompting, and occasional direct code commits. I do review the full code changes nearly every step of the way and often iterate numerous times until I'm satisfied with the resulting code approach.

beepbooptheory · 2025-12-03T17:11:35 1764781895

Have you tried to go back to the old way, maybe just as an experiment, to see how much time you are actually saving? You might be a little surprised! Significant "reprompting" time to me indicates maybe a little too much relying on it rather than leading by example. Things are much faster in general if you find the right loop of maybe using Claude for like 15%-20% of stuff instead of 99.7%. You wouldn't give your junior 99.7% ownership of the app unless they were your only person, right? I find spending time thinking through certain things by hand will make you so much more productive, and the code will generally be much better quality.

I get that like 3 years ago we were all just essentially proving points building apps completely with prompts, and they make good blog subjects maybe, but in practice they end up being either fragile novelties or bloated rat's nests that end up taking more time not less.

adastra22 · 2025-12-03T17:36:08 1764783368

I’ve done things in days that in the before times would have took me months. I don’t see how you can make that time difference up.

I have at least one project where I can make that direct comparison - I spent three months writing something in the language I’ve done most of my professional career in, then as a weekend project I got ChatGPT to write it from scratch in a different language I had never used before. That was pre-agentic tools - it could probably be done in an afternoon now.

sahn44 · 2025-12-03T20:40:07 1764794407

I'm not a fulltime developer, but manage a large dev team now. So, this project is basically beyond my abilities to code myself by hand. Pre llm, I would expect in neighborhood of 1.5-2 months for a capable dev on my team to produce this and replicate all the features.

__MatrixMan__ · 2025-12-03T13:40:39 1764769239

If you haunt the pull requests of projects you use I bet you'll find there's a new species of PR:

> I'm not an expert in this language or this project but I used AI to add a feature and I think its pretty good. Do you want to use it?

I find myself writing these and bumping into others doing the same thing. It's exciting, projects that were stagnant are getting new attention.

I understand that a maintainer may not want to take responsibility for new features of this sort, but its easier than ever to fork the project and merge them yourself.

I noticed this most recently in https://github.com/andyk/ht/pulls which has two open (one draft) PRs of that sort, plus several closed ones.

Issues that have been stale for years are getting traction, and if you look at the commit messages, it's AI tooling doing the work.

People feel more capable to attempt contributions which they'd otherwise have to wait for a specialist for. We do need to be careful not to overwhelm the specialists with such things, as some of them are of low quality, but on the whole it's a really good thing.

If you're not noticing it, I suggests hanging out in places where people actually share code, rather than here where we often instead brag about unshared code.

filoeleven · 2025-12-03T16:24:50 1764779090

> People feel more capable to attempt contributions

That does not mean that they are more capable, and that's the problem.

> We do need to be careful not to overwhelm the specialists with such things, as some of them are of low quality, but on the whole it's a really good thing.

That's not what the specialists who have to deal with this slop say. There have been articles about this discussed here already.

__MatrixMan__ · 2025-12-04T00:24:31 1764807871

What would you have us do, Keep the fixes to ourselves?

eschaton · 2025-12-05T04:01:27 1764907287

Yes, keep AI slop “fixes” to yourself and only create PRs for your own work.

freehorse · 2025-12-03T11:23:12 1764760992

At this point my prior is that all these 300/ns projects are some kind of internal tools, with very narrow scope and many just for a one-off use.

Which is also fine and great and very useful and I am also making those, but it probably does not generalize to projects that require higher quality standards and actual maintenance.

delaminator · 2025-12-03T15:41:06 1764776466

Sure, but 80% of software is probably internal and short lived like that, if not more.

Solving the problems of the business that isn’t a software business.

everforward · 2025-12-04T00:38:00 1764808680

Places that aren't software businesses are usually the inverse. The software is extremely sticky and will be around for ages, and will also bloat to 4x the features it was originally supposed to have.

I worked at an insurance company a decade ago and the majority of their software was ancient. There were a couple desktops in the datacenter lab running Windows NT for something that had never been ported. They'd spent the past decade trying to get off the mainframe and a majority of requests still hit the mainframe at some point. We kept versions of Java and IBM WebSphere on NFS shares because Oracle or IBM (or both) wouldn't even let us download versions that old and insecure.

Software businesses are way more willing to continually rebuild an app every year.

blazingbanana · 2025-12-03T12:00:59 1764763259

I also see a lot of this so I can't blame you for thinking it! See my other post about some projects build _only_ using LLMs.

https://news.ycombinator.com/item?id=46133458

staticassertion · 2025-12-03T13:35:28 1764768928

There's a massive incentive not to share them. If I wrote a project using AI I'd be reluctant to publish it at all because of the backlash I've seen people get for it.

aenis · 2025-12-03T16:36:11 1764779771

People are and always were reluctant to share their own code just the same. There is nothing to be gained, the chances of getting positive reviews from fellow engineers are slim to none. We are a critical and somewhat hypocritical bunch on average.

JackSlateur · 2025-12-04T12:05:30 1764849930

Building something is easy

Building something that works ? Not so easy

Pushing that thing in production ? That the hardest part

delaminator · 2025-12-04T07:28:12 1764833292

I came with receipts

gloosx · 2025-12-03T07:00:36 1764745236

Try to ask it to write some GLSL shaders. Just describe what you want to see and then try to run the shaders it outputs. It can output a UV-map or the simple gradient right, but when it comes to shaders a bit more complex it most of the time will not compile or run properly, sometimes mix GLSL versions, sometimes just straight make up things which don't work or output what you want.

gloosx · 2025-12-01T07:20:24 1764573624

The abstract actually says 12%

gloosx · 2025-12-01T07:16:48 1764573408

It's not as simple as putting all programmers into one category. There can be oversupply of web developers but at the same time undersupply of COBOL developers. If you are a very good developer, you will always be in demand.

ben_w · 2025-12-01T08:25:41 1764577541

> If you are a very good developer, you will always be in demand.

"Always", in the same way that five years ago we'd "never" have an AI that can do a code review.

Don't get me wrong: I've watched a decade of promises that "self driving cars are coming real soon now honest", latest news about Tesla's is that it can't cope with leaves; I certainly *hope* that a decade from now will still be having much the same conversation about AI taking senior programmer jobs, but "always" is a long time.

nradov · 2025-12-01T16:02:10 1764604930

Five years ago we had pretty good static analysis tools for popular languages which could automate certain aspects of code reviews and catch many common defects. Those tools didn't even use AI, just deterministic pattern matching. And yet due to laziness and incompetence many developers didn't even bother taking full advantage of those tools to maximize their own productivity.

ben_w · 2025-12-01T18:56:20 1764615380

The devs themselves can still be lazy, claude and copilot code review can be automated on all pull requests by demand of the PM — and the PM can be lazy and ask the LLMs to integrate themselves.

And the LLMs can use the static analysis tools.

lmm · 2025-12-03T02:31:43 1764729103

An LLM can run the static analysis tool and copy/paste its output onto your PR, sure. I'm not sure I would call that "doing code review".

ben_w · 2025-12-03T10:31:18 1764757878

> copy/paste

I did not say that.

That it can *also* use tools to help, doesn't mean it can *only* get there by using tools.

They can *also* just do a code review themselves.

As in, I cloned a repo of some of my old manually-written code, cd'd into it, ran `claude`, and gave it the prompt "code review" (or something close to that), and it told me a whole bunch of things wrong with it, in natural language, even though I didn't have the relevant static analysis tools for those languages installed.

lmm · 2025-12-05T04:14:23 1764908063

> I cloned a repo of some of my old manually-written code, cd'd into it, ran `claude`, and gave it the prompt "code review" (or something close to that), and it told me a whole bunch of things wrong with it, in natural language, even though I didn't have the relevant static analysis tools for those languages installed.

Well sure, but was the result any better than that of installing and running the tools? If the AI can provide better or at least different (but accurate!) PR feedback from conventional tools, that's interesting. If it's just offering the same thing (which is not really "code review" as I'd define it, even if it is something that code reviewers in some contexts spend some of their time on) through a different interface, that's much less interesting.

lisbbb · 2025-12-02T06:32:32 1764657152

I can't even imagine what time wasting bs the LLMs are finding with static analysis tools! It's all just a circle jerk everywhere now.

lisbbb · 2025-12-02T06:31:19 1764657079

Static analysis was pretty limited imho. It wasn't finding anything that interesting. I spent untold hours trying to satisfy SonarQube in 2021 & 2022. It was total shit busy work they stuck me with because all our APIs had to have at least 80% code coverage and meet a moving target of code analysis profiles that were updated quarterly. I had to do a ton of refactoring on a lot of projects just to make them testable. I barely found any bugs and after working on over 100 of those stupid things, I was basically done with that company and its bs. What an utter waste of time for a senior dev. They had to have been trying to get me to quit.

gloosx · 2025-12-02T06:42:46 1764657766

Even if someday we get AI that can generalize well, the need for a person who actually develops things using AI is not going anywhere. The thing with AI is that you cannot make it responsible, there will still be a human in the loop who is responsible for conveying ideas to the AI and controlling its results, and that person will be the developer. Senior developers are not hired just because they are smart or can write code or build systems, they are also hired to share the load of responsibility.

Someone with a name, an employment contract, and accountability is needed to sign off on decisions. Tools can be infinitely smart, but they cannot be responsible, so AI will shift how developers work, not whether they are needed.

ben_w · 2025-12-02T08:07:37 1764662857

Even where a human in the loop is a legal obligation, it can be QA or a PM, roles as different from "developer" as "developer" is from "circuit designer".

gloosx · 2025-12-02T09:03:42 1764666222

A PM or QA can sign off only on process or outcome quality. They cannot replace the person who actually understands the architecture and the implications of technical decisions. Responsibility is about being able to judge whether the system is correct, safe, maintainable, and aligned with real-world constraints.

If AI becomes powerful enough to generate entire systems, the person supervising and validating those systems is, functionally, a developer — because they must understand the technical details well enough to take responsibility for them.

Titles can shift, but the role dont disappear. Someone with deep technical judgment will still be required to translate intent into implementation and to sign off on the risks. You can call that person "developer", "AI engineer" or something else, but the core responsibility remains technical. PMs and QA do not fill that gap.

ben_w · 2025-12-02T13:23:41 1764681821

> They cannot replace the person who actually understands the architecture and the implications of technical decisions.

LLMs can already do that.

What they can't do is be legally responsible, which is a different thing.

> Responsibility is about being able to judge whether the system is correct, safe, maintainable, and aligned with real-world constraints.

Legal responsibility and technical responsibility are not always the same thing; technical responsibility is absolutely in the domain of PM and QA, legal responsibility ultimately stops with either a certified engineer (which software engineering famously isn't), the C-suite, the public liability insurance company, or the shareholders depending on specifics.

Ownership requires legal personhood, which isn't the same thing as philosophical personhood, which is why corporations themselves can be legal owners.

gloosx · 2025-12-02T20:41:58 1764708118

If LLMs truly "understood architecture" in the engineering sense, they would not hallucinate, contradict themselves, or miss edge cases that even a mid-level engineer catches instinctively.

They are powerful tools but they are not engineers.

And its not about legal responsibility at all. Developers dont go to jail for mistakes, but they are responsible within the engineering hierarchy. A pilot is not legally liable for Boeing's corporate decisions, and the plane can mostly fly on the autopilot, but you still need a pilot in the cockpit.

AI cannot replace the human whose technical judgment is required to supervise, validate, and interpret AI-generated systems.

marcelr · 2025-12-01T14:29:01 1764599341

ai can do code review? do people actually believe this? we have a mr llm bot, it is wrong 95% of the time

ben_w · 2025-12-01T19:05:29 1764615929

I have used it for code review.

Like everything else they do, it's amazing how far you can get even if you're incredibly lazy and let it do everything itself, though of course that's a bad idea because it's got all the skill and quality of result you'd expect if I said "endless hoarde of fresh grads unwilling to say 'no' except on ethical grounds".

pb7 · 2025-12-01T14:39:23 1764599963

I've been taking self-driving cars to get around regularly for a year or more.

tonyhart7 · 2025-12-01T08:43:16 1764578596

waymo and tesla already operate in certain areas, even if tech is ready

regulation still very much a thing

afavour · 2025-12-01T14:29:31 1764599371

“certain areas” is a very important qualifier, though. Typically areas with very predictable weather. Not discounting the achievement just noting that we’re still far away from ubiquity.

duderific · 2025-12-01T21:15:20 1764623720

Waymo is doing very well around San Francisco, which is certainly very challenging city driving. Yes, it doesn't snow there. Maybe areas with winter storms will never have autonomous vehicles. That doesn't mean there isn't a lot of utility created even now.

ben_w · 2025-12-02T08:17:30 1764663450

My original point, clearly badly phrased given the responses I got, is that the promises have been exceeding the reality for a decade.

Musk's claims about what Tesla's would be able to do wasn't limited to just "a few locations" it was "complete autonomy" and "you'll be able to summon your car from across the country"… by 2018.

And yet, 2025, leaves: https://news.ycombinator.com/item?id=46095867

ben_w · 2025-12-01T09:58:59 1764583139

Leaves: https://news.ycombinator.com/item?id=46095867

gloosx · 2025-11-27T19:07:58 1764270478

If unit tests are boring chores for you, or 100% coverage is somehow a goal in itself, then your understanding of quality software development is quite lacking overall. Tests are specifications: they define behavior, set boundaries, and keep the inevitable growth of complexity under control. Good tests are what keep a competent developer sane. You cannot build quality software without starting from tests. So if tests are boring you, the problem is your approach to engineering. Mature developers dont get bored chasing 100% coverage – they focus on meaningful tests that actually describe how the program is supposed to work.

rootlocus · 2025-11-27T21:24:15 1764278655

> Tests are specifications: they define behavior, set boundaries, and keep the inevitable growth of complexity under control.

I set boundaries during design where I choose responsibilities, interfaces and names. Red Green Refactor is very useful for beginners who would otherwise define boundaries that are difficult to test and maintain.

I design components that are small and focused so their APIs are simple and unit tests are incredibly easy to define and implement, usually parametrized. Unit tests don't keep me "sane", they keep me sleeping well at night because designing doesn't drive me mad. They don't define how the "program" is supposed to work, they define how the unit is supposed to work. The smaller the unit the simpler the test. I hope you agree: simple is better than complex. And no, I don't subscribe to "you only need integration tests".

Otherwise, nice battery of ad hominems you managed to slip in: my understanding of quality software is lacking, my problem is my approach to engineering and I'm an immature developer. All that from "LLMs can automate most of the boring stuff, including unit tests with 100% coverage." because you can't fathom how someone can design quality software without TDD, and you can't steelman my argument (even though it's recommended in the guidelines [1]). I do review and correct the LLM output. I almost always ask it for specific test cases to be implemented. I also enjoy seeing most basic test cases and most edge cases covered. And no, I don't particularly enjoy writing factories, setups, and asserts. I'm pretty happy to review them.

[1] https://news.ycombinator.com/newsguidelines.html Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.