Hacker News | lucasmullens's comments

How is something "obviously AI" for being written how you, a human, used to write?


They must be AI then


Some people like me are running a company and are still picking out their tech stack. I don't like Microsoft, and that absolutely affects how likely I am to use their services. My situation might not be that common but PR surely still matters some.


Hey, so it's a bit obvious you vibe coded this, which makes me not want to trust it. Some red flags:

- The Apple icon is a literal apple and not the Apple logo.

- You've got 2 Mac download buttons that do the same thing right at the top, surely one of those is a mistake.

- "Watch it in action" is positioned poorly and fails to be a header for the video. Too close to the button above it.

- "Automatic version control" is not what a checkpoint is? "Version control" means git to almost everyone.

- Privacy link is a fake placeholder.

- "See It In Action" looks like you meant to add images and just forgot?

- You named this like 5 things. The website title is "Checkpoints for Claude Code", the domain is "Claude Checkpoints", the UI website title is just "Checkpoints" as if it's a standalone brand, the contact email link uses "checkpoints-app.com", and finally you call it "Claude Diff" in the App Store description. Oh and the HN submission is a 6th one, "Claude Code Checkpoints".

Cool project though, sorry to be so critical.


It has a big banner that says "Research preview: The browser extension is a beta feature with unique risks—stay alert and protect yourself from bad actors.", and it says "Join the research preview", and then takes you to a form with another warning, "Disclaimer: This is an experimental research preview feature which has several inherent risks. Before using Claude for Chrome, read our safety guide which covers risks, permission limitations, and privacy considerations."

I would also imagine that it warns you again when you run it for the first time.

I don't disagree with you given how uniquely important these security concerns are, but they seem to be doing at least an okay job of warning people. It's hard to say without knowing how their in-app warnings look.


Come on, don't be mean. Imagine saying this in person to someone who just told you they got scammed. "You're just extremely gullible" is just so mean...show some empathy.


Anyone who trusts Musk enough to send him $15k doesn't deserve a bit of empathy.


I'm pretty sure the comment wasn't a joke? I saw the stream last week; it was a very impressive use of AI. I didn't realize it was AI until he started talking about doubling crypto.

What about the bio is satirical? I'm pretty sure that's sincere too.


User has edited their bio now :)


I didn't edit my bio. My projects are not satire. I'm just less ashamed than most, so I work on more "exciting" projects. I've worked extensively with generative AI, including video, myself. It was just that convincing to me in the moment. My regret knows no bounds. Luckily I earn enough this doesn't devastate me, but I really could have done some good with that money.


Yikes. In that case, please accept my apology. Your bio disappeared from your page for a while, but it's back as it was now.


You're sharing your own tweet, is that allowed on HN?

Either way consider just posting the text or using a platform other than X, because without logging in I can't read it (without using the xcancel.com version someone else posted).


> But with coding models they ignore context of the codebase and the results feel more like patchwork.

Have you tried Cursor? It has a great feature that grabs context from the codebase, I use it all the time.


> It has a great feature that grabs context from the codebase, I use it all the time.

If only this feature worked consistently, or reliably even half of the time.

It will casually forget or ignore any and all context and files in your codebase at random times, and you never know what set of files and docs it's working with at any point in time.


I can't get the prompt because I'm on my work computer, but I have about a three-quarter-page instruction set in the settings of Cursor. It asks clarifying questions a LOT now, and is pretty liberal with adding in commented pseudo-code for stuff it isn't sure about. You can still trip it up if you try, but it's a lot better than stock. This is with Sonnet 3.5 agent chats (Composer, I think it's called?)

I actually cancelled my Anthropic subscription when I started using Cursor, because I only ever used Claude for code generation anyway, so now I just do it within the IDE.


I'm very interested in your prompt. Could you be so kind as to paste it somewhere and link it in your comment, please?


Also interested to see this


I have not. But I also can't get the general model to work well in even toy problems.

Here's a simple example with GPT-4o: https://0x0.st/8K3z.png

It probably isn't obvious in a quick read, but there are mistakes here. Maybe the most obvious is that, given how `replacements` is built, the entries need to be intelligently ordered; this could be fixed by sorting. But is this the right data structure? Not to mention that the algorithm itself is quite... odd
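I can't share the screenshot's code here, but the ordering pitfall is easy to illustrate with a made-up fizzbuzz-style example (the `replacements` table and `label` function below are hypothetical, not the model's actual output):

```python
# Hypothetical illustration of the ordering problem: a
# divisor -> word table only works if we test the most specific
# divisor (15) before its factors (3 and 5).
replacements = {3: "Fizz", 5: "Buzz", 15: "FizzBuzz"}

def label(n: int) -> str:
    # Sort divisors descending so 15 is tried before 3 and 5.
    # Iterating the dict in insertion order would emit "Fizz" for 15.
    for divisor in sorted(replacements, reverse=True):
        if n % divisor == 0:
            return replacements[divisor]
    return str(n)

print([label(n) for n in range(1, 16)])
```

The sort works here because the composite divisor happens to be the largest; a more robust structure would encode the precedence explicitly instead of relying on numeric order.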

To give a more complicated example, I passed the same prompt from this famous code golf problem[0]. Here's the results, I'll save you the time, the output is wrong https://0x0.st/8K3M.txt (note, I started command lines with "$" and added some notes for you)

Just for the heck of it, here's the same thing but with o1-preview

Initial problem: https://0x0.st/8K3t.txt

Codegolf one: https://0x0.st/8K3y.txt

As you can see, o1 is a bit better on the initial problem but still fails at the code golf one. It really isn't beating the baseline naive solution. It does 170 MiB/s compared to 160 MiB/s (baseline with -O3). This is something I'd hope it could do really well on given that this problem is rather famous, so many occurrences of it should show up. There are tons of variations out there, and it's common to see parallel fizzbuzz in a class on parallelization, since it can teach important concepts like keeping the output in the right order.
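For anyone unfamiliar with the ordering concept, here's a toy sketch (nowhere near the 170 MiB/s territory, and all names are mine): split the range into chunks, compute chunks in worker threads, and consume the futures in submission order rather than completion order so the output stays sequential.

```python
# Toy parallel fizzbuzz, just to show the output-ordering concern.
from concurrent.futures import ThreadPoolExecutor

def fizzbuzz_chunk(start: int, stop: int) -> str:
    out = []
    for n in range(start, stop):
        # bool * str gives "" or the word; `or` falls back to the number
        out.append(("Fizz" * (n % 3 == 0) + "Buzz" * (n % 5 == 0)) or str(n))
    return "\n".join(out)

def parallel_fizzbuzz(limit: int, chunk: int = 1000) -> str:
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fizzbuzz_chunk, i, min(i + chunk, limit + 1))
                   for i in range(1, limit + 1, chunk)]
        # Workers may finish out of order, but calling .result() in
        # submission order keeps the joined output sequential.
        return "\n".join(f.result() for f in futures)
```

The high-throughput solutions in the linked thread go far beyond this (vectorization, write syscall batching), but the submission-order join is the same basic idea.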

But hey, at least o1 has the correct output... It's just that that's not all that matters.

I stand by this: evaluating code based on output alone is akin to evaluating a mathematical proof based on the result. And I hope these examples make the point why that matters, why checking output is insufficient.

[0] https://codegolf.stackexchange.com/questions/215216/high-thr...

Edit: I want to add that there's also an important factor here. The LLM might get you a "result" faster, but you are much more likely to miss the learning process that comes with struggling. That struggle makes you much faster (and more flexible), not just next time but in many situations where even a subset of the problem is similar. Which yeah, it's totally fine to glue shit together when you don't care and just need something, but there's a lot of missed value if you need to revisit any of that.

I do have concerns that people will plateau at junior levels. I hope it doesn't cause seniors to revert to juniors, which I've seen happen even without LLMs: if you stop working on these types of problems, you lose the skills. There's already an issue where we rush to get output, and it has clear effects on the stagnation of devs. We have far more programmers than ever, but I'm not confident we have significantly more wizards (the percentage of wizards is decreasing). There are fewer people writing programs just for fun. But "for fun" is one of our greatest learning tools as humans. Play is a common trait you see in animals, and it exists for a reason.


The "How" got dropped off the title, it's an article explaining what this guy wants, not an article saying Google turned Android into anything.


Ah. It's not, then, how Google plans to displace Microsoft on the desktop.


Honestly that was one of my first thoughts when I saw this, someone is going to draw one of those. People love trying to be offensive in something that barely allows communication. I remember seeing some MMORPG that had no chat where players would log in and just stand in the shape of a swastika.

Makes me wonder if it's reasonable to write an algorithm to detect that.
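As a rough sketch of what detection could look like (purely hypothetical; every name here is mine): rasterize player positions into a binary occupancy grid, then slide a binary template across it. A real filter would also need rotated, mirrored, and scaled variants of each template.

```python
# Hypothetical sketch: match a binary shape template against a
# binary occupancy grid of player positions.
def matches_at(grid, template, top, left):
    return all(
        grid[top + r][left + c] == cell
        for r, row in enumerate(template)
        for c, cell in enumerate(row)
        if cell  # only require occupied template cells to be filled
    )

def find_template(grid, template):
    th, tw = len(template), len(template[0])
    gh, gw = len(grid), len(grid[0])
    return [(r, c)
            for r in range(gh - th + 1)
            for c in range(gw - tw + 1)
            if matches_at(grid, template, r, c)]
```

This only flags exact placements; fuzzy matching (allowing a cell or two of slop) would catch more, at the cost of false positives.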


I built a grid-based music sequencer for the web a while ago that lets people see projects other people made in a side panel.

I very quickly realized I needed some manual content moderation because some people immediately started sharing patterns that looked like dicks and swastikas lol


Paternalism at its finest.


Which MMO was that in? I'm working on my own and now I have a new worry on my plate. Short of standard moderation, an algorithm to detect it could be interesting.


You could search for the article about how the Lego MMO (or that's what I think it was) was shut down, then stop worrying about it. Unless you want to spend the rest of your natural life on detecting penises, swastikas and everything else. I mean, everything is offensive to someone.


I realized it would be easy enough to detect when certain lines are forming and manually shuffle the players a few spots, but your comment made me smile. Somewhere in there is the start of a biography title.
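The line-detection part could be as simple as a longest-run scan over the occupancy grid (a sketch under my own assumptions, not anyone's shipped moderation code): if the longest run of occupied cells in any row, column, or diagonal crosses a threshold, nudge players apart.

```python
# Hypothetical sketch: longest run of occupied cells in any of the
# four line directions (horizontal, vertical, both diagonals).
def longest_run(grid):
    h, w = len(grid), len(grid[0])
    best = 0
    for r in range(h):
        for c in range(w):
            if not grid[r][c]:
                continue
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                n, rr, cc = 1, r + dr, c + dc
                while 0 <= rr < h and 0 <= cc < w and grid[rr][cc]:
                    n, rr, cc = n + 1, rr + dr, cc + dc
                best = max(best, n)
    return best
```

Then the shuffle trigger is just `longest_run(grid) >= threshold` for whatever threshold feels right in playtesting.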


Took me a while to remember the name, it was Habbo. The swastikas are mentioned on the Wikipedia article: https://en.wikipedia.org/wiki/Habbo


Here's an extra part of the spec: Deliberately there is no correct WWW browser window width. So you'll also have to account for the swastika writers using a whole range of window widths.

* https://news.ycombinator.com/item?id=40801007

