Just yesterday my cursor agent made some changes to a live kubernetes cluster even over my specific instruction not to. I gave it kubectl to analyze and find the issues with a large Prometheud + AlertManager configuration, then switched windows to work on something else.
When I was back the MF was patching live resources to try and diagnose the issue.
But this is just like giving a junior engineer access to a prod K8s cluster and having them work for hours on stuff related to said cluster... you wouldn't do it. Or at least, I wouldn't do it.
In my own career, when I was a junior, I fucked up a prod database... which is why we generally don't give junior/associate people to much access to critical infra. Junior Engineers aren't "dangerous" but we just don't give them too much access/authority too soon.
Claude Code is actually way smarter than a junior engineer in my experience, but I wouldn't give it direct access to a prod database or servers, it's not needed.
what value would that provide? If we give claude code access, even though very risky, it can provide value, but what upside is to letting junior to production?
Best way to avoid this is to force the LLM to use git branches for new work. Worst case scenario you lose some cash on tokens and have to toss the branch but your prod system is left unscathed.
I thought the general point is that you can't "force" an LLM to stay within certain boundaries without placing it in an environment where it literally has no other choice.
I was diagnosing an issue in production. The idea was to have the LLM would need to collect the logs of a bunch of pods, compare the YAML code in the cluster with the templates we were feeding ArgoCD, then check why the original YAML we were feeding the cluster wasn't giving the results we expected (after several layers of templating between ArgoCD Appsets, ArgoCD Applications, Helm Charts and Prometheus Operator).
I have a cursor rule stating it should never make changes to clusters, and I have explicitly told it not to do anything behind my back.
I don't know what happened in the meantime, maybe it blew up its own context and "forgot" the basic rules, but when I got back it was running `kubectl patch` to try some changes and see if it works. Basically what a human - with the proper knowledge - would do.
Thing is: it worked. The MF found the templating issue that was breaking my Alertmanager by patching and comparing the logs. All by itself, however by going over an explicit rule I had given it a couple times.
So to summarize: it's useful as hell, but it's also dangerous as hell.
yeah claude is really eager to apply stuff directly to the cluster to the wrong context even with constant reminding that it rolls out through gitops. I think there's a way to restrict more than "kubectl" so you can allow get/describe but not apply.
Exactly. I'll need to dig deeper into its allowlist and try a few things.
Problem is: I also force it to run `kubectl --context somecontext`, as to avoid it using `kubectl config use-context` and pull a hug on me (if it switches the context and I miss it, I might then run commands against the wrong cluster by mistake). I have 60+ clusters so that's a major problem.
Then I'd need a way to allowlist `kubectl get --context`, `kubectl logs --context` and so on. A bit more painful, but hopefully a lot safer.
Just yesterday my cursor agent made some changes to a live kubernetes cluster even over my specific instruction not to. I gave it kubectl to analyze and find the issues with a large Prometheud + AlertManager configuration, then switched windows to work on something else.
When I was back the MF was patching live resources to try and diagnose the issue.