> No, your security failure is that you use a package manager that allows third parties to push arbitrary code into your product with no oversight.
Could you explain how you'd design a package manager that does not allow that? As far as I understand the moment you use third party code you have to trust to some extent the code that you will run.
What if NPM maintained a security-signature database (say, dl_files_security_sigs.db) covering every downloaded file, usable even for offline installs? It would list all versions, the latest modification date, multiple cryptographic signatures (SHA-256, etc.), and whether the files have been reviewed by multiple security orgs/researchers, auto-flagging any contents that are not plain, readable text.

If anything (file date, size, crypto sigs) is less than N days old and has not been through M = "enough" security reviews, the npm system would automatically raise a security flag, stop the install, and trigger a security review of those files.

With a proper secure-by-default setup, any new version of an npm download (code, config, scripts) would automatically be blocked and flagged for global security review by multiple folks/orgs.

When/if this setup becomes the NPM default, would it stop a similar compromise from happening to NPM again? Can anyone think of any way to hack around this?
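A minimal sketch of the freshness/review gate described above; all names and thresholds here are hypothetical, not real npm settings:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy gate: stop the install when a file is both fresh
# (< N days old) and under-reviewed (< M independent reviews).
MIN_AGE_DAYS = 14   # N: quarantine window for fresh uploads
MIN_REVIEWS = 3     # M: "enough" independent security reviews

def install_allowed(mod_date, review_count, now=None):
    """True if a downloaded file passes the freshness/review policy."""
    now = now or datetime.now(timezone.utc)
    too_fresh = (now - mod_date) < timedelta(days=MIN_AGE_DAYS)
    under_reviewed = review_count < MIN_REVIEWS
    # Fresh *and* under-reviewed => block and flag; anything else passes.
    return not (too_fresh and under_reviewed)
```

The interesting design question is the one raised above: if everyone waits out the quarantine window, who generates the reviews?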
I'm speaking to the concept of automatic updates in general, which package managers either enable by default or implicitly allow through lack of security measures.
One obvious solution is to host your own repositories so that nothing gets updated without having been signed off by a trusted employee. Another is to check the cryptographic hashes of all packages so they cannot change without the knowledge and consent of your employees.
You're right that this does not completely eliminate the possibility of trojan horses being sneaked in through open-source dependencies, but it would at the very least require some degree of finesse on the attacker's part: they would have to manipulate the system into doing something it was not designed to do.
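The hash-pinning idea can be sketched in a few lines; the allowlist contents here are purely illustrative:

```python
import hashlib

# Sketch of hash pinning: a package file cannot change without your
# knowledge, because its digest must match a vetted allowlist entry.
def verify(filename, data, pinned):
    """Return True only if `data` matches the pinned digest for `filename`."""
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    return pinned.get(filename) == digest
```

A build step would then refuse to install anything whose digest isn't in the allowlist, forcing a human sign-off for every update. This is essentially what lockfile integrity fields do, except here the allowlist is maintained by your own employees.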
One thing I really hate about the modern cybersecurity obsession is that there's a large contingent of people who aggressively advocate against anything which might present a problem if misused (rust, encryption on everything no matter how inconsequential, deprecating FTP, UEFI secure boot, timing side-channels, etc), yet at the same time there's a massive community of high-level software developers who appear to be under the impression that extremely basic vulnerabilities (trojan package managers, cross-site scripting, letting my cell phone provider steal my identity because my entire life is authenticated by a SIM card, literally just concatenating strings received over the internet into an SQL statement, etc) are unsolved problems which just have to be tolerated for now, until somebody figures out a way to not download and execute non-vetted third-party code. Somehow the two groups never seem to cross swords.
TL;DR: Reading HN, I feel like I'm constantly being criticized for using C because I might fuck up and let a ROP through, yet so many of the most severe modern security breaches come from people who think turning off automatic updates is like being asked to prove the Riemann hypothesis.
They can't explain, it's just victim blaming. The market currently doesn’t have a proper solution to this.
Everyone works with these package managers. I bet the commenter has also installed pip or npm packages without reading their full code; it just feels cool to tell other people they are dumb and that it's their own fault for not reading all the code beforehand, or for using a package manager, when every single person does the same. Some are just unlucky.
The whole ecosystem is broken, the expectations of trust are not compatible with the current amount of attacks.
It isn't victim blaming. People like you make it impossible to avoid attacks like these because you have no appetite for a better security model.
I run npm under bubblewrap because npm has a culture of high risk: of using too many dependencies from untrusted authors. Being scrupulous and responsible is a cost I pay with my time and attention, but it is important, because if I run some untrusted code and am compromised, it can affect others.
But that is challenging when, every time some exploit rolls around, people like you brush it off as "unlucky". As if to say it's unavoidable. That nobody can be expected to be responsible for the libraries they use because that is too hard or whatever. You simply lack the appetite for good hygiene, and it makes it harder for the minority of us who care about how our actions affect others.
> you have no appetite for a better security model
For what it's worth, there are some advancements. PNPM, the package manager used in this case, doesn't automatically run postinstall scripts. Here, either the engineer allowed it explicitly, or a transitive dependency was previously considered safe and allowed by default, but stopped being safe.
PNPM also lets you specify a minimum package age, so you cannot install packages younger than X.
The combination of these would stop most attacks, but it becomes less effective if everyone specifies a minimum package age: if nobody installs new versions early, nobody discovers and reports the malicious ones before the quarantine window expires.
It's a bit grotesque, because the system relies on either the package author noticing in time or someone falling victim and reporting it.
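For reference, the two mitigations look roughly like this in pnpm configuration. The setting names (`minimumReleaseAge`, `onlyBuiltDependencies`) come from recent pnpm releases, so verify them against your version's docs before relying on this sketch:

```yaml
# pnpm-workspace.yaml (sketch; setting names may vary by pnpm version)
minimumReleaseAge: 1440        # refuse versions published <24h ago (value in minutes)
onlyBuiltDependencies:         # only these packages may run lifecycle scripts
  - esbuild
```

Everything else gets its postinstall/lifecycle scripts skipped, which is exactly the behavior that would have blunted this attack.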
NPM now supports publishing signed packages, and PNPM has a trustPolicy flag. This is a step in a good direction, but is still not enough, because it relies on publishers to know and care about signing packages, and it relies on consumers to require it.
There _is_ appetite for a better security model, but a lot of old, ubiquitous packages are unmaintained and won't adopt it. The ecosystem is evolving, but very slowly, and breaking changes seem needed.
I had the chance to finish reading, and it looks like Trigger was using an older version of PNPM which didn't do any of the above; they have since implemented everything I've mentioned in my post, plus some additional Git security.
So a slight amendment there on the human error side of things.
At some point you must be open to being compelled to read the code you run or ship. Otherwise, if that's too hard, then I don't know what to tell you. We'll just never agree.
If you find a better solution than being responsible for what you do and who you trust, I'm all for it. Until then, that's part of the job.
When I was a junior, our company paid for a commercial license for some of the larger libraries we used, and it included support. Alternatively, manage risk by using fewer, more trustworthy projects like Django instead of reaching for a new dependency from some random person every time you need to solve a simple problem.
> What no appetite? I just don't like your solution.
When I say "appetite" I am being very deliberate. You are hungry but you won't eat your vegetables. When you say "I just don't like your vegetables", then you aren't that hungry. You don't have the appetite. You'd rather accept the risk. Which is fine but then don't complain when stuff like this happens and everyone is compromised.
I hope you've read every diff to every Linux kernel you've ever deployed... There's LOADS of code you've deployed that I can bet a large amount of money you never read. So clearly there are solutions to the problem of having to read every line of every dependency you deploy. It's just that certain ecosystems are easier to exploit, so new solutions are needed. "Read everything" is not a solution; it's a band-aid that shows there's a problem of trust to be solved (or improved enough to discourage this wave of attacks) with a technical solution.
No, you are the problem, because you have a higher expectation than reality. People shouldn't have to run npm in containers. You're oversimplifying from one case where you have found one solution while ignoring the identical problems elsewhere. You are preventing us from looking at other solutions because you think the one you have is enough and works for everyone.
I agree with you that I shouldn't have to treat my libraries like untrusted code. I don't know what the rest of your comment means. I don't see how I'm preventing anybody from looking at other solutions to npm, they just don't want to do it because it's hard. And I have similar criticisms for cargo as it just copies npm and inherits all of its problems. I hate that.
npm has had a bad ecosystem since its inception; the left-pad thing is one of my earliest memories of it [1]. So none of this is new.
But all of this is still an issue because it's too convenient and that's the most important thing. Even cargo copies npm because they want to be seen as convenient and the risk is acknowledged. Nobody has the appetite to be held accountable for who they put their trust in.
> snickerbockers > No, your security failure is that you use a package manager
> you > It isn't victim blaming. People like you make it impossible to avoid attacks like these because you have no appetite for a better security model.
I'd wager a large portion of people with `npm` don't actually realize they have `npm`. I'd also wager that most people that know they have `npm` aren't aware of the security issues.
Under those conditions, people are not in fact making choices. These are not people "that have no appetite for a better security model". These are people who don't even know they are unsafe!
Yes, this is victim blaming, in just the same way people blame a rape victim for what they wear. Does what you wear modify the situation? Yes. Does it cause the situation? No. We only really blame a victim if they are putting themselves directly, and knowingly, in harm's way. This is not that case! This is a case where people are uninformed, both about the dangers present and about the existence of danger.
FFS, on more than one occasion I've installed a package only to see that it bundles `npm` along with it. And I'm more diligent than most people, so I know tons of people don't know it's happening, especially because you can't always run `which npm` to find out whether it is installed. The fact is that you can do something like `brew install foo` and foo has a dependency that has a dependency that has node as a dependency.
Dependency hell is integral to the problem here! So you can go ahead and choose a package manager that doesn't allow 3rd parties to push arbitrary code and end up with a package manager that allows 3rd parties to push arbitrary code! That's even what made left-pad a thing (and don't get me started on the absurdity of using a module for this functionality!).
> Nobody has the appetite to be held accountable for who they put their trust in
That is just not the reality of things. In the real world, nobody can read all the lines of code. It simply isn't possible. You aren't reading everything that you're running, let alone all the dependencies all the way down to the fucking kernel. There isn't enough time in the day to do this within your lifetime, even if you are running a very cut-down system. There are just too many lines of code!
So stop this bullshit rhetoric of "know what you're running" because it is ignoring the reality of the situation. Yes, people should do due diligence and inspect, but the reality is that this is not possible to do. Nor is it bulletproof, as it requires the reader to be omniscient themselves, or at least a security expert with years of training to even be able to spot security mistakes. Hell, if everyone (or just programmers) already had that kind of training then I'd wager 90+% of issues wouldn't even exist in the code in the first place.
So stop oversimplifying the situation because we can't even begin to talk about what needs to be done to solve things if we can't even discuss the reality of the problem.
>it's their own fault for not reading all the code beforehand or for using a package manager, when every single person does the same.
But isn't that actually the core of the problem? People choose to blindly trust some random third parties; isn't exploiting that trust an inevitable and predictable outcome?
Victim-blaming is when a girl gets raped and you tell her that it's her fault for dressing like a skank and getting drunk at a college fraternity party. Telling the bank they should have put the money in a vault instead of leaving it in an unlocked drawer next to the cash register is not victim-blaming. Telling the CIA that they shouldn't have given Osama Bin-Laden guns and money to fight the Soviets in Afghanistan is not victim-blaming. Telling President Roosevelt it was a poor decision to park the entire Pacific fleet in a poorly-defended naval base adjacent to an expansionist empire which is already at war with most of America's allies is not victim-blaming. *Telling a well-funded corporation not to download and execute third-party code with privileges is not victim-blaming, especially as their customers are often the ones who are actually being targeted.*
>I bet the commenter also has installed pip or npm packages without reading its full code
I think I did use pip at some point about a decade ago, but I can't remember what for. In general, though, you lose that bet because I don't use either of these programs.
> it just feels cool to tell other people they are dumb
it does, yes.
>and it's their own fault for not reading all the code beforehand or for using a package manager, when every single person does the same.
I don't suppose you've ever played an old video game called "Lemmings"?
>Some just are unlucky.
Lol.
>The whole ecosystem is broken, the expectations of trust are not compatible with the current amount of attacks.
that's kind of my point, except it doesn't mitigate responsibility for participating in that ecosystem.
I think what's most likely to happen here is that:
* a developer who knew how it worked used it in code where he *wanted* to get the first line
* someone just starting out copied it over and assumed that's the way to get the content of a command into a variable
It's essentially a complaint about a feature being used incorrectly, because the person who made the mistake never learned the language.
my($var1, $var2...) is a way to multi-assign variables from an array.
and that makes perfect sense when you look at it. Perl has no multiple return values, but if you need a function that returns two values, it is very easy to make it work with:
my ($bandwidth, $latency) = speedtest($host)
Perl's feature of returning a different type depending on the caller's context is definitely confusing, but
my @lines = `fortune`
returning lines makes perfect sense for the use case: you call an external command in order to parse its output, and if you do that you generally want the output in lines, because then you can just do
foreach my $line (`fortune`) {}
and it "just works".
Now you might ask, "why make such shortcuts?" Well, one of the big mistakes when making Perl was that it was also aimed as a replacement for sed/awk in one-liners, so the language is peppered with "clever short ways to do stuff", and it's a pleasure to use in quick ad-hoc one-liners on the CLI... but then people try to use the same cleverness in actual code, and it ends up as the unreadable mess Perl is known for.
the fact you can do
my ($first_line, $second_line, ...) = `fortune`
is just the feature being... consistent: when you give it an array, it fills it with the lines from the executed command. You gave it an array, and it did what it does with arrays.
Then don't use the low level interfaces. In Perl, language features are plug and play. Everything's in a module. Use the core module List::Util instead.
That's not super subtle any more than it's super subtle that "*" performs multiplication and "+" performs addition. Sometimes you just need to learn the language.
This is not a general defense of Perl, which is many times absolutely unreadable, but this example is perfectly comprehensible if you are actually trying to write Perl and not superimpose some other language on it.
There is no fair comparison to be made here with how + and * work in most languages, precisely because + and * work the same in most languages, while whatever Perl is doing here is just idiosyncratic.
Even C gets its fair share of flak for how it overloads * to mean three different things (multiplication, pointer declaration, and dereference)!
It's just very non-obvious what the code does when you're skimming it.
Especially in a dynamic language like Perl, you wouldn't know that you're passing down an integer instead of a function until the code blows up in a completely unrelated function.
You can't do that if you gave up at the very first sigil puzzle.
I'm fine with that: to program in Perl you need to be able to follow manuals, man pages, expert answers, and even Perl cookbooks, or CPAN or web searches. It's a technical tool. The swiss army chainsaw. It's worth it.
Seems like you and a few other posters are making the article's point – that Perl's culture is hermetic and that new programmers would rather learn Python, Ruby or Javascript rather than figure out which sigil means what.
I wouldn't call it hermetic in that the many forms of documentation are insanely thorough and accessible - if not well advertised. There is no gate-keeping (from my point of view). New users are welcome. It's easy to learn (for the people for whom reading is not an obstacle).
But yes, no contest that the world has been on a simplicity binge. Python won by pushing simplicity and by having giant software corporations choose it (and not complaining about the line-noise nonsense). If you want to go into programming professionally, you have needed Python for many years now.
I don't know that I would put Javascript in the same bag. I mean, it's the other way: it looks simple and it isn't.
But Python, yes: Python won because it looks simple and Google pushed it.
Many other languages now have to reckon with the python supremacy. This is not specific to perl / raku. It will take work for anything to replace python.
I think it's unlikely to change because most likely the content was not available for legal reasons, not technical. That's why for example when they re-release some shows they have to switch out to completely different music – the rights were not cleared in the first place and it'd be a huge hassle to go back and negotiate with every rightholder
If your system is already running malware, why wouldn't the malware use a privilege escalation exploit (which are relatively numerous on linux) to access your data rather than some X11 flaw which depends on their code getting started by the user?
Because it's not an x11 "flaw" or exploit, it's just how X works. I also just don't buy the whole "well other stuff has exploits too" mentality.
I mean, yeah, it does, maybe. So why bother creating a password to a service if their database is probably running Linux anyway and the rdbms is probably compromised and yadda yadda yadda. It's the kind of argument you can make for anything.
Also, no: privilege escalation exploits are not "numerous" on Linux. They are very difficult to pull off in practice. It's only really a problem on systems built on old kernels that refuse to update, but those will always be insecure, just like running Windows 7 will be insecure.
It's a cultural difference – French culture prefers correctness over politeness, whereas in the US people prefer to "keep the peace" by not emphasizing mistakes.
It shows up a lot in engineering discussions if you have French colleagues, too.
The problem is that many of these clean-room reimplementations require contributors to not have seen any of the proprietary source. You can't guarantee that with AI, because who knows which training data was used.
> You can't guarantee that with ai because who knows which training data was used
There are no guarantees in life, but with macOS you can know it is rather unlikely any AI was trained on (recent) Apple proprietary source code – because very little of it has been leaked to the general public – and if it hasn't leaked to the general public, the odds are low any mainstream AI would have been trained on it. Now, significant portions of macOS have been open-sourced – but presumably it is okay for you to use that under its open source license – and if not, you can just compare the AI-generated code to that open source code to evaluate similarity.
It is different for Windows, because there have been numerous public leaks of Windows source code, splattered all over GitHub and other places, and so odds are high a mainstream AI has ingested that code during training (even if only by accident).
But, even for Windows – there are tools you can use to compare two code bases for evidence of copying – so you can compare the AI-generated reimplementation of Windows to the leaked Windows source code, and reject it if it looks too similar. (Is it legal to use the leaked Windows source code in that way? Ask a lawyer: is someone violating your copyright if they use your code to do due diligence to ensure they're not violating your copyright? It could be "fair use" in jurisdictions which have such a concept, although again, ask a lawyer to be sure. And see A.V. ex rel. Vanderhye v. iParadigms, L.L.C., 562 F.3d 630 (4th Cir. 2009).)
In fact, I'm pretty sure there are SaaS services you can subscribe to which will do this sort of thing for you, and hence they can run the legal risk of actually possessing leaked code for comparison purposes rather than you having to do it directly. But this is another expense which an open source project might not be able to sustain.
Even for Windows – the vast majority of the leaked Windows code is >20 years old now – so if you are implementing some brand-new API, the odds of accidentally reusing leaked Windows code are significantly reduced.
Other options: decompile the binary, and compare the decompiled source to the AI-generated source. Or compile the AI-generated source and compare it to the Windows binary (this works best if you can use the exact same compiler, version and options as Microsoft did, or as close to the same as is manageable.)
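As a toy illustration of that comparison step (real code-provenance tools normalize identifiers and compare token streams rather than raw text), a similarity gate could look like this:

```python
import difflib

# Toy similarity gate: flag a candidate file whose text is suspiciously
# close to a reference (e.g. leaked) source file. Threshold is arbitrary.
def similarity(a: str, b: str) -> float:
    """Return a 0..1 ratio of how similar two source strings are."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def looks_copied(candidate: str, reference: str, threshold: float = 0.9) -> bool:
    return similarity(candidate, reference) >= threshold
```

Anything flagged by such a gate would then go to a human reviewer (or a lawyer) rather than being rejected automatically.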
Are those OSes actually that strict about contributors? That's got to be impossible to verify, and I've only seen clean-room stuff when a competitor is straight up copying another and doesn't want to get sued.
ReactOS froze development to audit their code.[1] Circumstantial evidence was enough to deem code not clean. WINE is strict as well. It is impossible to verify beyond all doubt, of course.
I've been thinking for a long time about using AI to do binary decompilation for this exact purpose. Needless to say, we're a fundamental leap forward short of being able to do that.
It's interesting, because you can still find interviews with Larry online about how Perl is the first postmodern programming language, but for Perl 6 they came up with a very top-down, modernist project.