Hacker News | codyvoda's comments

counterpoint: influencers said they wiped the floor with everyone so it must have happened


Who cares about what random influencers say?


I think he is hinting at folks like you who say things like "DeepSeek mopped the floor" when, beyond some contributions to the open source community, which were indeed impressive, there really has not been much of a change. No floors were mopped.


See the other comments. There was change. Don't know what that has to do with influencers, I don't follow these people.


No floors were mopped. See comment you replied to. Change happened, their research was great but no floors were mopped.


the code goes through a PR review process like any other? what are you talking about?


i don't know about you, but i would never EVER submit a PR that fails to compile. i don't mean failing tests, those happen (especially with flaky ci), but not compiling.

that's literally the bare minimum.


and you think this beta system that launched like 2 days ago can’t achieve that?

it also opens the PR as its working session. there are a lot of dials, and a lot of redditor-ass opinions from people who don’t use or understand the tech


what i see is a human telling the "AI" that the code does not compile

what use is a bot if it can't do at least this simple step?


it can do this step. once again, this launched 2 days ago and people are using it for the first time

if you have used it for more than a few hours (or literally just read the docs) and aren’t stupid, you know this is easily solved

you’re giving in to mob mentality


So everyone who doesn't worship "AI" is stupid? :)


is that what I said? if you can’t read documentation and follow basic instructions to get a tool to work you’re stupid. you asked a snarky question like it’s some gotcha. once again, if you actually use the tool and read the docs and can’t figure it out, I think it’s a skill issue


The gotcha is you seem to consider it normal to push code that doesn't qualify even for "it works on my machine".


so do you consider it normal to submit code that you have never compiled? or run at least once if it's not a compiled language...


this entire thread is very reddit-y

this stuff works. it takes effort and learning. it’s not going to magically solve high-complexity tasks (or even low-complexity ones) without investment. having people use it, learn how it works, and improve the systems is the right approach

a lot of armchair engineers in here


People, specifically managers and C-levels, are being sold this crap on the idea that it can replace people now, today, as-is. Billions upon billions of dollars are being shoved in indiscriminately, and toothbrushes are coming with "AI" slapped on somehow; that's how insane the hype bubble is.

And here we have many examples from the biggest bullshit pushers in the whole market of their state of the art tool being hilariously useless in trivial cases. These PRs are about as simple as you can get without it being a typo fix, and we're all seeing it actively bullshit and straight up contradict itself many times, just as anyone who's ever used LLMs would tell you happens all the time.

The supposed magic, omnipotent tool that is AI apparently can't even write test scaffolding without a human telling it exactly what it has to do, yet we're supposed to be excited about this crap? If I saw a PR like this at work, I'd be going straight to my manager to have whoever dared push this kind of garbage reprimanded on the spot, except not even interns are this incompetent and annoying to work with.


it’s not magic. it can make meaningful contributions (if you actually invest in learning the tools + best practices for using them)

you’re taking an anecdote and blowing it out of proportion to fit your preformed opinion. yes, when you start with the tool and do literally no work it makes bad PRs. yes, it’s early and experimental. that doesn’t mean it doesn’t work (I have plenty of anecdotes that it does!)

the truth lies in between, and the mob mentality of "it’s magic" or "it’s complete bullshit" doesn’t help. I’d love to come to a thread like this and actually hear about real experiences from smart people using these kinds of tools, but instead we get this bullshit


> ...(if you actually invest in learning the tools + best practices for using them)

So I keep being told, but after judiciously and really trying my damned hardest to make these tools work for ANYTHING other than the most trivial imaginable problems, it has been an abject failure for me and my colleagues. Below is a FAR from comprehensive list of my attempts at having AI tooling do anything useful for me that isn't the most basic boilerplate (and even then, that gets fucked up plenty often too).

- I have tried all of the editors and related tooling. Cursor, Jetbrains' AI Chat, Jetbrains' Junie, Windsurf, Continue, Cline, Aider. If it has ever been hyped here on HN, I've given it a shot because I'd also like to see what these tools can do.

- I have tried every model I reasonably can. Gemini 2.5 Pro with "Deep Research", Gemini Flash, Claude 3.7 sonnet with extended thinking, GPT o4, GPT 4.5, Grok, That Chinese One That Turned Out To Be Overhyped Too. I'm sure I haven't used the latest and greatest gpt-04.7-blowjobedition-distilled-quant-3.1415, but I'd say I've given a large number of them more than a fair shot.

- I have tried dumb chat modes (which IME still work the best somehow). The APIs rather than the UIs. Agent modes. "Architect" modes. I have given these tools free rein of my CLI to do whatever the fuck they wanted. Web search.

- I have tried giving them the most comprehensive prompts imaginable. The type of prompts that, if you were to just give it to an intern, it'd be a truly miraculous feat of idiocy to fuck it up. I have tried having different AI models generate prompts for other AI models. I have tried compressing my entire codebase with tools like Repomix. I have tried only ever doing a single back-and-forth, as well as extremely deep chat chains hundreds of messages deep. Half the time my lazy "nah that's shit do it again" type of prompts work better than the detailed ones.

- I have tried giving them instructions via JSON, TOML, YAML, Plaintext, Markdown, MDX, HTML, XML. I've tried giving them diagrams, mermaid charts, well commented code, well tested and covered code.

Time after time after time, my experiences are pretty much a 1:1 match to what we're seeing in these PRs we're discussing. Absolute wastes of time and massive failures for anything that involves literally any complexity whatsoever. I have at this point wasted several orders of magnitude more time trying to get AIs to spit out anything usable than if I had just sat down and done things myself. Yes, they save time for some specific tasks. I love that I can give it a big ass JSON blob and tell it to extract the typedef for me and it saves me 20 minutes of very tedious work (assuming it doesn't just make random shit up from time to time, which happens ~30% of the time still). I love that if there's some unimportant script I need to cook up real quick, I can just ask it and toss it away after I'm done.

However, what I'm pissed beyond all reason about is that despite me NOT being some sort of luddite who's afraid of change or whatever insult gets thrown around, my experiences with these tools keep getting tossed aside, and I mean by people who have a direct effect on my continued employment and lack of starvation. You're doing it yourself. We are literally looking at a prime example of the problem, from THE BIGGEST PUSHERS of this tool, with many people in this thread and the reddit thread commenting similar things to myself, and it's being thrown to the wayside as an "anecdote getting blown out of proportion".

What the fuck will it take for the AI pushers to finally stop moving the god damn goal posts and trying to spin every single failure presented to us in broad daylight as a "you're le holding it le wrong teehee" type of thing? Do we need to suffer through 20 million more slop PRs that accomplish nothing and STILL REQUIRE HUMAN HANDHOLDING before the sycophants relent a bit?


just wanted to say this was the most relatable take i have read so far, and i've read a lot. Exact same experiences. And you didn't even touch on the MCPs that enable these things to go wild as well. I think our takes are not being taken seriously for 2 reasons.

First, marketing gaslighting from the FAANGs and hot startups, with grifters who managed to raise and need to keep the bullshit windmill going.

Second, these tools are at their relative best with the boilerplate Next.js code that the vibecoders use to make a very simple dashboard and stuff, and they're the noisy minority on twitter.

There is basically zero financial incentive to admit LLMs are pushed dangerously beyond their current limits. I'm still figuring out a way to short this, apart from literally shorting the market.


People see that these things generate code and due to their lack of understanding they automatically assume this is all software engineering is.

Then we have the current batch of YC execs heavily pushing "vibe coded" startups. The sad reality is that this strategy will probably work because all they need is the next credulous business guy to buy the vibe coded startup. There's so much money in the AI space that I fully believe you can likely make billions of dollars this way through acquisition (see OAI buying Windsurf for billions of dollars, likely to devalue Cursor's also absurd valuation).

I'm not a luddite. I'm a huge fan of companies spending a decent chunk of money on R&D on innovative new projects even when there's a high risk of failure. The current LLM hype is not just an R&D project anymore. This is now being pushed as a full on replacement of human labor when it's clearly not ready. And now we're valuing AI startups at billions of dollars and planning to spend $500B on AI infrastructure so that we can generate more ghibli memes.

At some point this has to stop but I'm afraid by that point the damage will already be done. Even worse, the idiots who led this exercise in massive waste will just hop onto the next hype train.


You're literally on a site where we are anything but armchair. We have years of experience. You're using your ad hominems wrong. Save them for a football thread and come up with actual arguments next time.


or press “a”


eh terminology happens. I’d argue we shouldn’t be using “AI”, but here we are. I’d prefer encouraging “vibe coding” to mean this over fighting against the wind


But what is "this"? It's just refactoring with an LLM along for the ride! That's increasingly the default assumption when we're coding or refactoring, so the extra "vibe" attached at the front is totally meaningless if it doesn't imply that you're going so fast you're not checking the LLM's work.


But at the moment it means two things.

One is ok, the other is risky


One is YMMV, the other is bad.


yeah Microsoft could never conceivably develop an extensible source available IDE people love so much they even fork to build $3B companies on the scraps of. absolutely alien!


given that they lose >$4B/year I guess everything is peanuts


OpenAI have $40 billion in funding from SoftBank for the next two years, so they can afford to buy Windsurf.

Is OpenAI worth the $260 billion valuation... No, of course not, they're losing >$4 billion a year.


That $40 billion is actively being lit on fire to serve all the ChatGPT requests though. It's not just sat in the bank doing nothing.


it happens on practically every topic. this was technologically not possible 100 years ago and it’ll be fascinating to watch play out. stfu and listen is a legit skill in 2025


100 years ago the term "armchair general" was already 25 years old.


And 200 years ago cringy dinner table conversations about complex and subtle topics were already an established literary trope.


Eldo Kim

you stand out when you obviously hide


only if you are the only one doing the obfuscation.

It's why Tor Browser is set to a specific window size (in pixels), has the same set of available fonts, etc.


And yet you still stand out if you use tor.


yes, and it's because not enough people use tor-browser (i meant the browser, not the network).

But if privacy is truly the desired goal, the regular browser ought to behave just like tor-browser.


Tor Browser safe mode. That is one of the few ways to defeat that fingerprinting thing.


is this a Claude Code alternative? seems way more GUI-focused

Codex is OSS (and Aider of course), and both serve as decent alternatives


It is an agentic coding tool at its core, so yes - I'd say it's fair to call it a Claude Code alternative.

Regarding the GUI focus: we did that so it's more approachable to both tech folks and people not as accustomed to using a terminal (e.g. PMs, sales engineers, etc.). But a lot of our beta users are devs.

Also, when using its agentic search and data viz capabilities, some users prefer to not do that in the terminal.


sure I’d just say it’s a Cursor alternative :shrug:


Fair

