
Thanks for the data-based comment!

Have you noticed any change in that trend in the past year or two, or is it continuing to get better?


Np. The past two years are harder for me to tell. We need to get more of that data public and organized, and we're looking at how we can do that...

We are working on some big improvements to the backend and should have some cool stuff to share later this year :)


I agree with most of this, with one important exception: you should have some form of sandboxing in place before running any local AI agent. The easiest way to do that is with .claude/settings.json[0].

This is important no matter how experienced you are, but it's arguably most important when you don't know what you're doing.

0: or if you don't want to learn about that, you can use Claude Code Web


The default sandboxing works fine for me. It asks before running any command, and I can whitelist directories for reading and non-compound commands.

That's not a sandbox.

there is a real one though — https://www.anthropic.com/engineering/claude-code-sandboxing. needs to be enabled with /sandbox, not on by default.

Right, that's what I was referring to

The part about permissions with settings.json [0] is laughable. Are we really supposed to list all potential variations of harmful commands? In addition to `Bash(cat ./.env)`, we would also need to add `Bash(cat .env)`, `Bash(tail ./.env)`, `Bash(tail .env)`, `Bash(head ./.env)`, `Bash(sed '' ./.env)`, and countless others... while at the same time we allow something like `npm` to run?
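For reference, the kind of deny list under discussion looks roughly like this in `.claude/settings.json` (a sketch; the specific rule strings are illustrative, not an exhaustive or recommended policy):

```json
{
  "permissions": {
    "deny": [
      "Bash(cat ./.env)",
      "Bash(cat .env)",
      "Bash(tail ./.env)",
      "Read(./.env)"
    ],
    "allow": [
      "Bash(npm run *)"
    ]
  }
}
```

The criticism above is that string-matching rules like these are trivially bypassed by any of the countless equivalent commands, while a broad allow rule like `npm` hands over arbitrary code execution anyway.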

I know the deny list is only for automatically denying, and that any command not explicitly allowed will pause, waiting for user confirmation. But still, it reminds me of the rationale the author of the Pi harness [1] gave to explain why there will be no permission feature built into Pi (emphasis mine):

> If you look at the security measures in other coding agents, *they're mostly security theater*. As soon as your agent can write code and run code, it's pretty much game over. [...] If you're uncomfortable with full access, run pi inside a container or use a different tool if you need (faux) guardrails.

As you mentioned, this is a big feature of Claude Code Web (or Codex/Antigravity or whatever equivalent of other companies): they handle the sandboxing.

[0] https://blog.dailydoseofds.com/i/191853914/settingsjson-perm...

[1] https://mariozechner.at/posts/2025-11-30-pi-coding-agent/#to...


> The part about permissions with settings.json [0] is laughable

I never said "permissions", I said "sandboxing". You can configure that in settings.json.

https://code.claude.com/docs/en/sandboxing#configure-sandbox...


Do people really run Claude and other CLIs like this outside a container??

Yes. I don't bother with that. I feel like the risk of Claude Code running amok is pretty low, and I don't have it do long-running tasks that exceed my desire to monitor it. (Not because I'm worried about it breaking things; it's just that I don't use the tool in that way.)

Let's not fool ourselves here. If a security feature adds any amount of friction at all, and there's a simple way to disable it, users will choose to do so.

I'm sure most folks run Claude without isolation or sandboxing. It's a terrible idea, but even most professional software developers don't think much about security.

There are many decent options (cloud VMs, local VMs, Docker, the built-in sandboxing). My point is just that folks should research and set up at least one of them before running an agent.


How did you contain Claude Code? Did you virtualize it? I just set up a simple firejail script for it. I'm not completely sure it's enough, but it's at least something.

The official Claude Code repo is configured to use a devcontainer config:

https://github.com/anthropics/claude-code

You can download the devcontainer CLI and use it to start a Docker container with a working Claude Code install, simple firewall, etc. out of the box. (I believe this is how the VSCode extension works: it uses this repo to bootstrap the devcontainer.)

Basic instructions:

- Install the devcontainer CLI: `https://github.com/devcontainers/cli#install-script`

- Clone the Claude Code repo: `https://github.com/anthropics/claude-code`

- Navigate to the top-level repo directory and bring up the container: `devcontainer --workspace-folder . up`

- Start Claude in the container: `devcontainer exec --workspace-folder . bash -c "exec claude"`

P.S. It's all just Docker containers under the hood.
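If you'd rather not clone the repo, you can sketch a minimal devcontainer config of your own. This is an illustrative example only, not the official config: the image choice and post-create step are assumptions, though `@anthropic-ai/claude-code` is the real npm package name.

```json
// .devcontainer/devcontainer.json (JSONC; comments are allowed here)
{
  "name": "claude-sandbox",
  "image": "mcr.microsoft.com/devcontainers/javascript-node:20",
  // Install Claude Code inside the container after it's created
  "postCreateCommand": "npm install -g @anthropic-ai/claude-code"
}
```

Drop that into your project and the same `devcontainer ... up` / `devcontainer exec ...` commands above apply. Note the official config adds a firewall on top of this, which a bare sketch like this one lacks.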


I'm using https://www.docker.com/products/docker-sandboxes/

Better isolation than running it in a container.


I do something similar. I leave up and down arrows alone, but have ctrl+p and ctrl+n behave as you describe.

> humans also make mistakes

This is broadly true, but not comparable when you get into any detail. The mistakes current frontier models make are more frequent, more confident, less predictable, and much less consistent than mistakes from any human I'd work with.

IME, all of the QA measures you mention are more difficult and less reliable than understanding things properly and writing correct code from the beginning. For critical production systems, mediocre code has significant negative value to me compared to a fresh start.

There are plenty of net-positive uses for AI: throwaway prototyping, certain boilerplate migration tasks, or anything you can easily add automated deterministic checks for that fully cover all of the behavior you care about. Most production systems are complicated enough that those QA techniques are insufficient to determine the code has the properties you need.


> The mistakes current frontier models make are more frequent, more confident, less predictable, and much less consistent than mistakes from any human I'd work with.

My experience is literally 180 degrees from this statement. And you don't normally get to choose the humans you work with; you may be involved in the interview process for some, but that doesn't tell you much. I have seen so much human-written code in my career that, in the right hands, I'll take (especially latest-frontier) LLM-written code over average human code any day of the week and twice on Sunday.


Thus solving the problem once and for all

https://www.youtube.com/watch?v=VW66EX75jIY


This is a couple of years old now, but at one point Janelle Shane found that the only reliable way to avoid being flagged as AI was to use AI with a certain style prompt:

https://www.aiweirdness.com/dont-use-ai-detectors-for-anythi...


I had the same experience as peer comments. I'm on Pixel 8 and Google Fi. When I check for updates, I'm told I'm up-to-date with the last update being over a month old.


You should see an "unvote" or "undown" link to the right of the timestamp (i.e. the opposite side from where the vote arrows were). It's fairly subtle.


Yeah, I never send a PR out without reviewing each commit myself and adding GitHub comments when I think it's relevant. Sometimes a PR is clear enough that I don't feel the need to add comments, though.


I self-review, but I don't add comments; I just fix the problems that I find. I should add clarifying comments.


I'd say "good old days" thinking is probably involved, but not the full explanation. Over the past few decades, software has gone from a fairly obscure profession to being seen as a great way (maybe the best way) to make a lot of money. In absolute numbers, there are probably at least as many engaged, curious engineers as before. There are almost certainly drastically more uninterested engineers who are there partially or fully because of the money, though.

edit: I hadn't scrolled down to https://news.ycombinator.com/item?id=45303388 when I wrote this


Dunno. I've been at this since the late '80s, and have run into precious few developers who were interested in software and programming for its own sake. For most of them it was just a job.

