
codyvoda · 2025-05-22T18:39:32 1747939172

counterpoint: influencers said they wiped the floor with everyone so it must have happened

sunaookami · 2025-05-22T20:15:16 1747944916

Who cares about what random influencers say?

infecto · 2025-05-23T11:33:34 1748000014

I think he is hinting at folks like you who say things like Deepseek mopped the floor when, beyond some contribution to the open source community, which was indeed impressive, there really has not been much of a change. No floors were mopped.

sunaookami · 2025-05-23T13:52:29 1748008349

See the other comments. There was change. Don't know what that has to do with influencers, I don't follow these people.

infecto · 2025-05-23T15:26:25 1748013985

No floors were mopped. See comment you replied to. Change happened, their research was great but no floors were mopped.

codyvoda · 2025-05-21T12:16:25 1747829785

the code goes through a PR review process like any other? what are you talking about?

fernandotakai · 2025-05-21T12:27:46 1747830466

i don't know about you, but i would never EVER submit a PR that fails to compile. not "tests are failing", those happen (especially with flaky ci), but not compiling.

that's literally the bare minimum.

codyvoda · 2025-05-21T12:46:58 1747831618

and you think this beta system that launched like 2 days ago can’t achieve that?

it also opens the PR as its working session. there are a lot of dials, and a lot of redditor-ass opinions from people who don’t use or understand the tech

nottorp · 2025-05-21T14:41:33 1747838493

what i see is a human telling the "AI" that the code does not compile

what use is a bot if it can't do at least this simple step?

codyvoda · 2025-05-21T14:50:14 1747839014

it can do this step. once again, this launched 2 days ago and people are using it for the first time

if you have used it for more than a few hours (or literally just read the docs) and aren’t stupid, you know this is easily solved

you’re giving in to mob mentality

nottorp · 2025-05-21T15:46:46 1747842406

So everyone who doesn't worship "AI" is stupid? :)

codyvoda · 2025-05-21T16:36:44 1747845404

is that what I said? if you can’t read documentation and follow basic instructions to get a tool to work you’re stupid. you asked a snarky question like it’s some gotcha. once again, if you actually use the tool and read the docs and can’t figure it out, I think it’s a skill issue

nottorp · 2025-05-21T17:40:32 1747849232

The gotcha is you seem to consider it normal to push code that doesn't qualify even for "it works on my machine".

nottorp · 2025-05-21T14:29:20 1747837760

so do you consider it normal to submit code that you have never compiled? or run at least once if it's not a compiled language...

codyvoda · 2025-05-21T12:15:38 1747829738

this entire thread is very reddit-y

this stuff works. it takes effort and learning. it’s not going to magically solve high-complexity tasks (or even low-complexity ones) without investment. having people use it, learn how it works, and improve the systems is the right approach

a lot of armchair engineers in here

sensanaty · 2025-05-21T17:21:04 1747848064

People, specifically managers and C-levels, are being sold this crap on the idea that it can replace people now, today, as-is. Billions upon billions of dollars are being shoved in indiscriminately, and the hype bubble is so insane that toothbrushes are coming with "AI" slapped on somehow.

And here we have many examples from the biggest bullshit pushers in the whole market of their state of the art tool being hilariously useless in trivial cases. These PRs are about as simple as you can get without it being a typo fix, and we're all seeing it actively bullshit and straight up contradict itself many times, just as anyone who's ever used LLMs would tell you happens all the time.

The supposed magic, omnipotent tool that is AI apparently can't even write test scaffolding without a human telling it exactly what it has to do, yet we're supposed to be excited about this crap? If I saw a PR like this at work, I'd be going straight to my manager to have whoever dared push this kind of garbage reprimanded on the spot, except not even interns are this incompetent and annoying to work with.

codyvoda · 2025-05-21T17:48:51 1747849731

it’s not magic. it can make meaningful contributions (if you actually invest in learning the tools + best practices for using them)

you’re taking an anecdote and blowing it out of proportion to fit your preformed opinion. yes, when you start with the tool and do literally no work it makes bad PRs. yes, it’s early and experimental. that doesn’t mean it doesn’t work (I have plenty of anecdotes that it does!)

the truth lies in between and the mob mentality of “it’s magic or complete bullshit” doesn’t help. I’d love to come to a thread like this and actually hear about real experiences from smart people using these kinds of tools, but instead we get this bullshit

sensanaty · 2025-05-21T19:42:07 1747856527

> ...(if you actually invest in learning the tools + best practices for using them)

So I keep being told, but after judiciously and really trying my damned hardest to make these tools work for ANYTHING other than the most trivial imaginable problems, it has been an abject failure for me and my colleagues. Below is a FAR from comprehensive list of my attempts at having AI tooling do anything useful for me that isn't the most basic boilerplate (and even then, that gets fucked up plenty often too).

- I have tried all of the editors and related tooling. Cursor, Jetbrains' AI Chat, Jetbrains' Junie, Windsurf, Continue, Cline, Aider. If it has ever been hyped here on HN, I've given it a shot because I'd also like to see what these tools can do.

- I have tried every model I reasonably can. Gemini 2.5 Pro with "Deep Research", Gemini Flash, Claude 3.7 sonnet with extended thinking, GPT o4, GPT 4.5, Grok, That Chinese One That Turned Out To Be Overhyped Too. I'm sure I haven't used the latest and greatest gpt-04.7-blowjobedition-distilled-quant-3.1415, but I'd say I've given a large number of them more than a fair shot.

- I have tried dumb chat modes (which IME still work the best somehow). The APIs rather than the UIs. Agent modes. "Architect" modes. I have given these tools free rein of my CLI to do whatever the fuck they wanted. Web search.

- I have tried giving them the most comprehensive prompts imaginable. The type of prompts that, if you were to just give it to an intern, it'd be a truly miraculous feat of idiocy to fuck it up. I have tried having different AI models generate prompts for other AI models. I have tried compressing my entire codebase with tools like Repomix. I have tried only ever doing a single back-and-forth, as well as extremely deep chat chains hundreds of messages deep. Half the time my lazy "nah that's shit do it again" type of prompts work better than the detailed ones.

- I have tried giving them instructions via JSON, TOML, YAML, Plaintext, Markdown, MDX, HTML, XML. I've tried giving them diagrams, mermaid charts, well commented code, well tested and covered code.

Time after time after time, my experiences are pretty much a 1:1 match to what we're seeing in these PRs we're discussing. Absolute wastes of time and massive failures for anything that involves literally any complexity whatsoever. I have at this point wasted several orders of magnitude more time trying to get AIs to spit out anything usable than if I had just sat down and done things myself. Yes, they save time for some specific tasks. I love that I can give it a big ass JSON blob and tell it to extract the typedef for me and it saves me 20 minutes of very tedious work (assuming it doesn't just make random shit up from time to time, which happens ~30% of the time still). I love that if there's some unimportant script I need to cook up real quick, I can just ask it and toss it away after I'm done.

However, what I'm pissed beyond all reason about is that despite me NOT being some sort of luddite who's afraid of change or whatever insult gets thrown around, my experiences with these tools keep getting tossed aside, and I mean by people who have a direct effect on my continued employment and lack of starvation. You're doing it yourself. We are literally looking at a prime example of the problem, from THE BIGGEST PUSHERS of this tool, with many people in this thread and the reddit thread commenting similar things to myself, and it's being thrown to the wayside as an "anecdote getting blown out of proportion".

What the fuck will it take for the AI pushers to finally stop moving the god damn goal posts and trying to spin every single failure presented to us in broad daylight as a "you're le holding it le wrong teehee" type of thing? Do we need to suffer through 20 million more slop PRs that accomplish nothing and STILL REQUIRE HUMAN HANDHOLDING before the sycophants relent a bit?

deepdarkforest · 2025-05-23T00:08:45 1747958925

just wanted to say this was the most relatable take i have read so far, and i've read a lot. Exact same experiences. And you didn't even touch on the MCPs that enable these things to go wild as well. I think our takes are not being taken seriously for 2 reasons.

First, marketing gaslighting from the faangs and hot startups, with grifters that managed to raise and need to keep the bullshit windmill going.

Second, these tools are at their relative best on the boilerplate nextjs code that the vibecoders use to make a very simple dashboard and stuff, and the vibecoders are the noisy minority on twitter.

There is basically zero financial incentive to admit LLMs are pushed dangerously beyond their current limits. I'm still figuring out a way to short this, apart from literally shorting the market.

sponnath · 2025-05-23T20:25:42 1748031942

People see that these things generate code and due to their lack of understanding they automatically assume this is all software engineering is.

Then we have the current batch of YC execs heavily pushing "vibe coded" startups. The sad reality is that this strategy will probably work because all they need is the next credulous business guy to buy the vibe coded startup. There's so much money in the AI space to the point where I fully believe you can likely make billions of dollars this way through acquisition (see OAI buying Windsurf for billions of dollars, likely to devalue Cursor's also absurd valuation).

I'm not a luddite. I'm a huge fan of companies spending a decent chunk of money on R&D on innovative new projects even when there's a high risk of failure. The current LLM hype is not just an R&D project anymore. This is now being pushed as a full on replacement of human labor when it's clearly not ready. And now we're valuing AI startups at billions of dollars and planning to spend $500B on AI infrastructure so that we can generate more ghibli memes.

At some point this has to stop but I'm afraid by that point the damage will already be done. Even worse, the idiots who led this exercise in massive waste will just hop onto the next hype train.

isaacremuant · 2025-05-21T20:48:23 1747860503

Literally you're on a site where we are anything but armchair. We have years of experience. You're using your ad hominems wrong. Save them for a football thread and come up with actual arguments next time.

codyvoda · 2025-05-21T12:10:38 1747829438

or press “a”

codyvoda · 2025-05-06T13:55:19 1746539719

eh terminology happens. I’d argue we shouldn’t be using “AI”, but here we are. I’d prefer encouraging “vibe coding” to mean this over fighting against the wind

lolinder · 2025-05-06T14:00:30 1746540030

But what is "this"? It's just refactoring with an LLM along for the ride! That's increasingly the default assumption when we're coding or refactoring, so the extra "vibe" attached at the front is totally meaningless if it doesn't imply that you're going so fast you're not checking the LLM's work.

croes · 2025-05-06T13:57:04 1746539824

But at the moment it means two things.

One is ok, the other is risky

consp · 2025-05-06T14:01:25 1746540085

One is YMMV, the other is bad.

codyvoda · 2025-05-06T13:13:40 1746537220

yeah Microsoft could never conceivably develop an extensible source available IDE people love so much they even fork to build $3B companies on the scraps of. absolutely alien!

codyvoda · 2025-05-06T12:47:24 1746535644

given that they lose >$4B/year I guess everything is peanuts

mrweasel · 2025-05-06T12:58:30 1746536310

OpenAI have $40 billion in funding from SoftBank for the next two years, so they can afford to buy Windsurf.

Is OpenAI worth the $260 billion valuation... No, of course not, they're losing >$4 billion a year.

dbbk · 2025-05-06T13:05:32 1746536732

That $40 billion is actively being lit on fire to serve all the ChatGPT requests though. It's not just sat in the bank doing nothing.

codyvoda · 2025-05-05T13:19:16 1746451156

it happens on practically every topic. this was technologically not possible 100 years ago and it’ll be fascinating to watch play out. stfu and listen is a legit skill in 2025

cwmma · 2025-05-05T15:15:13 1746458113

100 years ago the term "armchair general" was already 25 years old.

bunderbunder · 2025-05-05T17:15:51 1746465351

And 200 years ago cringy dinner table conversations about complex and subtle topics were already an established literary trope.

codyvoda · 2025-05-02T05:37:06 1746164226

Eldo Kim

you stand out when you obviously hide

chii · 2025-05-02T06:36:38 1746167798

only if you are the only one doing the obfuscation.

It's why tor browser is set to a specific dimension (in terms of pixel size), has the same set of available fonts, etc.
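
A minimal sketch of the point above, not how any real tracker works: if several visitors share an identical fingerprint, none of them stands out, while a lone unusual fingerprint identifies its owner. The visitor names, window sizes, and font labels below are made up for illustration.

```python
from collections import Counter

# Hypothetical fingerprints: (window size, font set) per visitor.
# All values are invented for this example.
visitors = {
    "alice": ("1000x900", "bundled"),
    "bob": ("1000x900", "bundled"),
    "carol": ("1000x900", "bundled"),
    "dave": ("1387x811", "custom"),
}


def anonymity_set_sizes(fingerprints):
    """Map each visitor to the number of visitors sharing their fingerprint."""
    counts = Counter(fingerprints.values())
    return {name: counts[fp] for name, fp in fingerprints.items()}


sizes = anonymity_set_sizes(visitors)
for name, size in sizes.items():
    label = "unique, stands out" if size == 1 else f"hidden among {size}"
    print(f"{name}: {label}")
```

The larger the group sharing one fingerprint, the less any single member can be singled out, which is the reason for standardizing window size and fonts.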

klabb3 · 2025-05-02T10:01:12 1746180072

And yet you still stand out if you use tor.

chii · 2025-05-02T10:17:15 1746181035

yes, and it's because not enough people use tor-browser (i meant the browser, not the network).

But if privacy is truly the desired goal, the regular browser ought to behave just like tor-browser.

freeamz · 2025-05-02T17:55:59 1746208559

Tor Browser safe mode. That is one of the few ways to defeat that fingerprinting thing.

codyvoda · 2025-04-29T13:50:10 1745934610

is this a Claude Code alternative? seems way more GUI-focused

Codex is OSS (and Aider of course) and they serve as decent alternatives

davidvgilmore · 2025-04-29T13:59:26 1745935166

It is an agentic coding tool at its core, so yes - I'd say it's fair to call it a Claude Code alternative.

Regarding the GUI focus: we did that so it's more approachable to both tech folks and people not as accustomed to using a terminal (e.g. PMs, sales engineers, etc.). But a lot of our beta users are devs.

Also, when using its agentic search and data viz capabilities, some users prefer to not do that in the terminal.

codyvoda · 2025-04-29T15:07:38 1745939258

sure I’d just say it’s a Cursor alternative :shrug:

davidvgilmore · 2025-04-29T18:12:55 1745950375