
> Spin up agents to tackle routine tasks that take you out of your flow, such as codebase research, bug fixes, and backlog tasks.

The software of the future, where nobody on staff knows how anything is built, no one understands why anything breaks, and cruft multiplies exponentially.

But at least we're not taken out of our flow!



After a bunch of people leave the company, it's already the case that nobody knows how anything is built. This seems like a good way to accelerate understanding a codebase.


it's funny - nervous funny, not haha funny - that you think drawing a real issue like this out into the open would focus an organization on solving it.


There is a lot more money in selling a tool that manages a problem than in selling one that solves it.


Agents are pretty good at describing how a particular feature works. It's not as dire as you make it seem.


That's what I mean.


You can ask agents to identify and remove cruft. You can ask an agent why something is breaking -- to hypothesize potential causes and test them for validity. If you don't understand how something is built, you can ask the agent to give you an overview of the architecture and then dive into whatever part you want to explore more.

And it's not like any of your criticisms don't apply to human teams. They also let cruft develop, are confused by breakages, and don't understand the code because everyone on the original team has since left for another company.


> you can ask the agent to give you an overview of the architecture and then dive into whatever part you want to explore more.

This is actually a cool use case that's being explored more and more. I first saw it in the wiki thing from the Devin people, and now Google has released one as well.


There's a cool demo from Anthropic: https://www.youtube.com/watch?v=OwMu0pyYZBc

In it, they use Claude to analyse an old (demo) COBOL application.

And it understands the context of the files, deciphers the process, and even draws diagrams for the documentation it creates.

I wish I had this 20 years ago when I was consulting and had to jump into really funky client codebases with zero documentation and everything on fire.


Humans are just better at communicating about their process. They will spend hours talking over architectural decisions, implementation issues, writing technical details in commit messages and issue notes, and in this way they not only debug their decisions but socialize knowledge of both the code and the reasons it came to be that way. Communication and collaboration are the real adaptive skills of our species. To the extent AI can aid in those, it will be useful. To the extent it goes off and does everything in a silo, it will ultimately be ignored - much like many developers who attempt this.

I do think the primary strengths of GenAI are more in comprehension and troubleshooting than in generating code - so far. These activities play into the collaboration and communication narrative. I would not trust an AI to clean up cruft or refactor a codebase unsupervised. Even if it did an excellent job, who would really know?


> Humans are just better at communicating about their process.

I wish that were true.

In my experience, most of the time they're not doing the things you talk about -- major architectural decisions don't get documented anywhere, commit messages give no "why", and the people the knowledge got socialized to in unrecorded conversations have since left the company.

If anything, LLMs seem to be far more consistent about documenting the rationales for design decisions, leaving clear comments in code and commit messages, etc., if you ask them to.

Unfortunately, humans generally are not better at communicating about their process, in my experience. Most engineers I know enjoy writing code, and hate documenting what they're doing. Git and issue-tracking have helped somewhat, but it's still very often about the "what" and not the "why this way".


"major architectural decisions don't get documented anywhere" "commit messages give no "why""

This is so far outside of common industry practices that I don't think your sentiment generalizes. Or perhaps your expectation of what should go in a single commit message is different from the rest of us...

LLMs, especially those with reasoning chains, are notoriously bad at explaining their thought process. This isn't vibes, it is empiricism: https://arxiv.org/abs/2305.04388

If you are genuinely working somewhere where the people around you are worse than LLMs at explaining and documenting their thought process, I would look elsewhere. I can't imagine that's good for one's own development (or sanity).


I've worked everywhere from small startups to megacorps. The megacorps certainly do better with things like initial design documents, which startups often skip entirely, but even then those documents are often largely out of date because nobody updates them. I can guarantee you that I am talking about common industry practices in consumer-facing apps.

I'm not really interested in what some academic paper has to say -- I use LLMs daily and see first-hand the quality of the documentation and explanations they produce.

I don't think there's any question that, as a general rule, LLMs do a much better job of documenting what they're doing and making it easy for people to read their code, with copious comments explaining what the code is doing and why. Engineers, on the other hand, have lots of competing priorities -- even when they want to document more, the thing needs to be shipped yesterday.


Alright, I'm glad to hear you've had a successful and rich professional career. We definitely agree that engineers generally fail to document when they have competing priorities, and that LLMs can be of use to help offload some of that work successfully.

Your initial comment made it sound like you were commenting on a genuine apples-to-apples comparison between humans and LLMs, in a controlled setting. That's the place for empiricism, and I think dismissing studies that examine such situations is a mistake.

A good warning flag for why that is a mistake is the recent article showing that engineers estimated LLMs sped them up by something like 24%, but when measured they were actually slower by 17%. One should always examine whether the specifics of a study really apply to them -- there is no end-all be-all in empiricism -- but when in doubt, the scientific method is our primary tool for determining what is actually going on.

But sure, we can just vibe it lol. FWIW, the parent comment's claims line up more with my experience than yours do. Leave an agent running for "hours" (as specified in the comment) coming up with architectural choices, ask it to document all of it, then come back and find a massive mess. I have yet to have a colleague do that without reaching out and saying "help, I'm out of my depth".


The paper and example you talk about seem to be about agent or plan mode (as those modes are called in LLM IDEs like Cursor), while I and the parent are talking about ask mode, which is where the confusion seems to lie. Asking the LLM about the overall structure of an existing codebase works very well.


OK yes, you are right that we might be talking about employing AI toolings in different modes, and that the paper I am referring to is absolutely about agentic tooling executing code changes on your behalf.

That said, the first comment from the person I replied to contained "You can ask agents to identify and remove cruft", which is pretty explicitly speaking to agent mode. He was also responding to a comment about how humans spend "hours talking over architectural decisions", which, as an action mapped to AI, would be more plan mode than ask mode.

Overall I definitely agree that using LLM tools to just tell you things about the structure of a codebase are a great way to use them, and that they are generally better at those one-off tasks than things that involve substantial multi-step communications in the ways humans often do.

I appreciate being in the weeds here haha -- hopefully we all got a little better at talking about the nuances of these things :)


Those are idealized industry practices that people wish to follow, but when it comes to meeting deadlines, I too have seen people eschew them to get things out the door. It's a human problem, not one specific to any company.


Yes, I recognize that, for various reasons, people will fail to document even when it is a professional expectation.

I guess in this case we are comparing an idealized human to an idealized AI, given that AI has its own failings in non-idealized scenarios (like hallucination).


Sure, you can ask the agents to "identify and remove cruft" but I never have any confidence that they actually do that reliably. Sometimes it works. Mostly they just burn tokens, in my experience.

> And it's not like any of your criticisms don't apply to human teams.

Every time the limitations of AI are discussed, we see this unfair standard applied: ideal AI output is compared to the worst human output. We get it, people suck, and sometimes the AI is better.

At least the ways that humans screw up are predictable to me. And I rarely find myself in a gaslighting session with my coworkers where I repeatedly have to tell them that they're doing it wrong, only to be met with "oh my, you're so right!" and watch them re-write the same flawed code over and over again.


Because what we need is not lazy people - we need lazy people with AI? How is this even a justification?


Sorry, where did "lazy people" come from? Nobody's talking about anybody being lazy.


I just like that evolution doesn't really care. People can opine on laziness and proper methodology. It's hand-waving compared to how things shake out.

Nature does select for laziness. The laziest state that can outpace entropy in diverse ways? Ideal selection.


> The software of the future,

:chuckles nervously:


If you're building something new you'll need some skilled people around


"Vibe code fixer" seems likely to be a viable job soon.


[flagged]


Do you want your software to now be stochastic instead of deterministic? That is where the analogy becomes flawed.


This argument warrants introspection for "crusty devs", but also has holes. A compiler is tightly engineered and dependable. I have never had to write assembly because I know that my compiled code 100% represents my abstract code and any functional problems are in my abstract code. That is not true in AI coding. Additionally, AI coding is not just an abstraction over code, but an abstraction over understanding. When my code compiles, I don't need to worry that the compiler misunderstood my intention.

I'm not saying AI is not a useful abstraction, but I am saying that it is not a trustworthy one.


I do still write assembly sometimes, and it's a valued skill because it will always be important and not everyone can do it. Compilers haven't obsoleted writing assembly by hand for some use cases, and LLMs will never obsolete actually writing code either. I would be incredibly cautious about putting all your eggs in the AI basket before you atrophy a skill that fewer and fewer people will have.


How are a compiler and an LLM equivalent abstractions? I'm also seriously doubtful of the 10x claim any time someone brings it up when AI is being discussed. I'm sure they can be 10x for some problems, but they can also be -10x. They're not as consistently predictable (and good) as compilers are.

The "learn to master it or become obsolete" sentiment also doesn't make a lot of sense to me. Isn't the whole point of AI as a technology that people shouldn't need to spend years mastering a craft to do something well? It's literally trying to automate intelligence.


I’d worry about mastering the shift key, first.


> The software of the future, where nobody on staff knows how anything is built

Doesn't this apply to people who code in high level languages?


The increasing levels of abstraction work only as long as the abstractions are deterministic (with some limited exceptions, e.g. branch prediction/preloading at the CPU level). You can still get into issues with leaky abstractions, but generally they are quite rare in established high-to-low level language transformations.

This is more akin to a manager-level view of the code (where the manager needs developers to go and look at the "deterministic" instructions); the abstraction is a lot leakier than the high-to-low level language one.


In the 00s I saw so many C codebases with hand-rolled linked lists where dynamically resized arrays would be more appropriate, "should be big enough" static allocations with no real idea of how to determine that size, etc. Hardly anyone seemed to have a practical understanding of hashes. When you use a higher level language, you get steered towards the practical, fundamental data structures more or less automatically.
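To make the contrast concrete, here's a minimal sketch (hypothetical names, plain C) of the realloc-based growable array that those codebases tended either to hand-roll badly or to replace with a "should be big enough" static buffer -- roughly what a higher-level language gives you for free as its list/array type:

    #include <stdlib.h>

    /* A minimal growable int array -- the kind of thing a higher-level
       language provides out of the box (a JS array, a Python list). */
    typedef struct {
        int    *data;
        size_t  len;
        size_t  cap;
    } IntVec;

    /* Append one element, doubling capacity as needed instead of
       guessing a fixed size up front. Returns 0 on success, -1 on OOM. */
    int intvec_push(IntVec *v, int value) {
        if (v->len == v->cap) {
            size_t new_cap = v->cap ? v->cap * 2 : 8;
            int *p = realloc(v->data, new_cap * sizeof *p);
            if (!p) return -1;
            v->data = p;
            v->cap  = new_cap;
        }
        v->data[v->len++] = value;
        return 0;
    }

Nothing exotic, but in practice it was exactly this amortized-doubling pattern (and a basic hash table) that kept getting reinvented poorly or avoided altogether.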


Even JS doesn't churn as fast as the models powering vibe coding, and that cut-and-paste Node app is still deterministic compared to what happens when the next version of the model looks at AI-generated code from two years ago...



