I remember trying out docker sometime back in late 2013. Something about it never fully stuck with me. I always felt like the final boundary for a piece of software should be the process, not the computer. Plopping an entire VM into a zipfile and saying "here's your software" felt like lazy engineering to many of us at the time (and still today).
For our current stack, the answer has been to make the entire business application run as a single process. We also use a single (mono) repository because it is a natural fit with the grain of the software.
As far as I am aware, there is no reason a single process cannot exploit the full resources of any computer. Modern x86 servers are ridiculously fast, as long as you can get at them directly. AspNetCore + SQLite (properly tuned) running on a 64 core Epyc serving web clients using Kestrel will probably be sufficient for 99.9% of business applications today. You can handle millions of simultaneous clients without blinking. Who even has that many total customers right now?
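For context, "properly tuned" for SQLite under a concurrent web workload mostly comes down to a handful of pragmas; a minimal sketch (the specific values here are my own assumptions, not a prescription):

    # Hedged sketch: common SQLite tuning for a web workload; values are illustrative.
    sqlite3 app.db <<'SQL'
    PRAGMA journal_mode=WAL;       -- readers no longer block the single writer
    PRAGMA synchronous=NORMAL;     -- fewer fsyncs, still safe under WAL
    PRAGMA busy_timeout=5000;      -- wait up to 5s on a locked database instead of erroring
    PRAGMA cache_size=-64000;      -- ~64 MB page cache (negative value = KiB)
    SQL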
Horizontal scalability is simply a band-aid for poor engineering in most (not all) applications. The poor engineering, in my experience, is typically caused by underestimating how fast a single x86 thread is and exploring the concurrent & distributed computing rabbit hole from there. It is a rabbit hole that should go unexplored, if at all possible.
Here's a quick trick if none of the above sticks: If one of your consultants or developers tells you they can make your application faster by adding a bunch of additional computers, you are almost certainly getting taken for a ride.
I agree with you. However, I spent years and years trying to compile software that came with cryptic install instructions,
or having the author insist that since it works on their machine, I'm just doing something stupid. Docker was largely able to fix that.
It's a somewhat odd solution for an all-too-common problem, but any solution is still better than dealing with such an annoying problem. (source: made docker the de facto cross-team communication standard in my company. "I'll just give you a docker container, no need to fight trying to get the correct version of nvidia-smi to work on your machine" type of thing)
It probably depends on the space and types of software you're working on. If it's frontend applications, for example, then it's overkill. But if somebody wants you to, say, install multiple elasticsearch versions + some global binaries for some reason + a bunch of different gpu drivers on your machine (you get the idea), then docker is a big net positive. Both for getting something to compile without drama and for not polluting your host OS (or VM) with conflicting software packages.
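To make the elasticsearch case concrete, this is roughly what "two conflicting versions on one machine" looks like with Docker; the image tags and host ports are just illustrative:

    # Hedged sketch: two Elasticsearch versions side by side, each in its own container.
    # Tags and host ports are illustrative.
    docker run -d --name es6 -p 9200:9200 -e discovery.type=single-node \
        docker.elastic.co/elasticsearch/elasticsearch:6.8.23
    docker run -d --name es7 -p 9201:9200 -e discovery.type=single-node \
        docker.elastic.co/elasticsearch/elasticsearch:7.17.10
    # Neither install touches the host OS, and neither can conflict with the other.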
Completely agree. The whole compile chain for most software, the reliance on linked libraries, implicit dependencies like locale settings changing behavior, and basically decades of weird accidents and hacks to get around memory and disk size limits can be a nightmare to deal with. If you're using slow dynamic languages, or modern frontend bundlers, all the implicit C extension compilations and dependencies can still be a pain.
The list goes on and on, it’s bizarre to me to think of this as the true, good way of doing software and think of docker as lazy. Docker certainly has its own problems too, but does a decent job at encapsulating the decades of craziness we’ve come to heavily rely on. And it lets you test these things alongside your own software when updating versions and be sure you run the same thing in production.
If docker isn’t your preferred solution to these problems that’s fine, but I don’t get why it’s so popular on HN to pretend that docker is literally useless and nobody in their right mind would ever use it except to pad their resume with buzzwords.
When library versions start to cause problems, like V3.5 having a bug so you need to roll back to V3.4... that's when ./configure && make starts to have issues.
Yeah, it happens with .so files, .dlls ("DLL hell"), package managers and more. But that's where things like containers come in to help: "I tested Library Foo version V3.4 and that's what you get in the Docker image". No surprises from Foo V3.5 or V3.6... you get exactly what the developer tested on their box.
Be it a .dll, a .so, a #include library, some version of Python (2.7 with import six), some crazy version of a Ruby Gem that just won't work on Debian for some reason (but works on Red Hat)... etc. etc.
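In practice that usually just means baking the known-good version into the image at build time; a hedged sketch with a made-up "foo" library (the name, URL, and version are placeholders):

    # Hedged sketch: pin exactly the library version that was tested.
    # "foo" and its URL/version are placeholders; these lines would typically
    # sit behind RUN steps in a Dockerfile.
    git clone --branch v3.4 --depth 1 https://example.com/foo/foo.git
    cd foo
    ./configure && make && make install   # always Foo V3.4, never a surprise V3.5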
There are basically two options: maintain up-to-date dependencies carefully (engineer around DLL hell with lots of automated testing and be well-versed in the changelogs of dependencies) or compile a bunch of CVEs into production software.
There really isn't any middle ground (except to not use third-party libraries at all).
That assumes you are using free software without a support contract, where the vendor has no incentive to maintain long-term support for libs by applying only security patches without adding any features to old versions. I understand this goes against the culture of using only the latest, or of "fixing" vulns by upgrading to a more recent version (which may have a different API or untested changes to existing APIs).
That makes sense for a hobbyist community but not so much for production.
In a former job we needed to fork and maintain patches ourselves, keeping an eye on the CVE databases and mailing lists and applying only security patches as needed rather than upgrading versions. We managed to be proactive and avoid 90% of the patches by turning stuff off or ripping it out of the build entirely. For example, with OpenSSH we ripped out PAM, built it without LDAP support, no Kerberos support, etc. And kept patching it when vulns came out. You'd be amazed at how many vulns don't affect you if you turn off 90% of the functionality and only use what you need.
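As a rough illustration of the build-it-without-the-features approach (upstream OpenSSH only compiles in PAM or Kerberos support if you ask for it; exact flags vary by version, so treat this as a sketch):

    # Hedged sketch: a stripped-down OpenSSH build. Omitting --with-pam and
    # --with-kerberos5 leaves those subsystems out entirely (flags vary by version).
    ./configure --prefix=/opt/openssh
    make && make install
    ldd /opt/openssh/sbin/sshd   # fewer linked libraries means fewer applicable CVEs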
We needed to do this as we were selling embedded software that had stability requirements and was supported (by us).
It drove people nuts as they would run a Nessus scan and do a version check, then look in a database and conclude our software was vulnerable. To shut up the scanners we changed the banners but still people would do fingerprinting, at which point we started putting messages like X-custom-build into our banners and explained to pentesters that they need to actually pentest to verify vulns rather than fingerprinting and doing vuln db lookups.
Point being, at some point you need to maintain stuff and have stable APIs if you want long-lasting code that runs well and addresses known vulns. You don't do that by constantly changing your dependencies; you do it by removing complexity, assigning long-term owners, and spending money to maintain your dependencies.
So either you pay the library vendor to make LTS versions, or you pay in house staff to do that, or you push the risk onto the customer.
Aren't we conflating compile complexities with runtime complexities here? There are plenty of open-source applications that offer pre-compiled binaries.
That difference isn't as black and white as you're making it out to be, sometimes it's just a design decision whether certain work is done at compile time or runtime. And both kinds of issues, runtime or compile-time, can be caused by the kinds of problems I'm talking about like unspecified dependencies.
This is why I wish GitHub actually allowed automated compilation. That way we could all see exactly how binaries are compiled and wouldn't need to set up a build environment for each open source project we want to build ourselves.
I am totally on board with the idea of improving productivity. The issue I see is that this is avoiding a deeper problem - namely that the software stack requires a max-level wizard to set up from scratch each time.
Refactoring your application so that it can be cloned, built, and run within 2-3 keypresses is something that should be strongly considered. For us, these are the steps required to stand up an entirely new stack from source:
0. Create new Windows Server VM, and install git + .NET Core SDK.
1. Clone our repository's main branch.
2. Run dotnet build to produce a Self-Contained Deployment.
3. Run the application with --console argument or install as a service.
This is literally all that is required. The application will create & migrate its internal SQLite databases automatically. There is no other software or 3rd party services which must be set up as a prerequisite. The development experience is the same; you just attach the debugger via VS rather than starting the console or service.
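In command form, the whole thing looks roughly like this; the repository URL, runtime identifier, and service name are placeholders, and the exact dotnet invocation depends on how the project is set up:

    # Hedged sketch (run from PowerShell); repo URL, RID, and service name are placeholders.
    git clone https://example.com/ourcompany/our-app.git
    cd our-app
    dotnet publish -c Release -r win-x64 --self-contained true -o C:\app
    C:\app\OurApp.exe --console
    # ...or register it as a Windows service instead:
    sc.exe create OurApp binPath= "C:\app\OurApp.exe"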
We also role play putting certain types of operational intelligence into our software. We ask questions like "Can our application understand its environment regarding XYZ and respond automatically?"
The issue Docker solves for me is not the complexity or number of steps but the compatibility.
I built a service that is installed in 10 lines that could be run through a makefile, but I assume specific versions of each library on the system and don’t intend to test against the hundreds of possible system dependency combinations, or to assume it will surely be compatible anyway.
The dev running the container won’t be building their own Debian install with the specific versions required in my doc just to run the install script from there; they just instantiate the container and run with it.
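Which is to say, the consumer's side of it collapses to roughly this (the image name and tag are made up):

    # Hedged sketch: what the consuming dev runs instead of reproducing my exact
    # Debian + library versions by hand. Image name and tag are placeholders.
    docker pull example.org/myteam/myservice:1.4.2
    docker run -d --name myservice -p 8080:8080 example.org/myteam/myservice:1.4.2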
Linux containers and equivalent technologies are virtualisation (specifically OS virtualisation[1]), just not a VM. Hardware virtualisation (VMs) isn't the only kind of virtualisation that exists.
It's true that Docker isn't a first-class abstraction at the level of the Linux kernel, but BSD has jails, and Solaris has Zones. This is important in some respects, but I don't see that it informs things here. Containers are still 'a thing' regardless of how they're implemented.
Curious to learn more about how jails + zones are implemented. In Linux land, I find the notion that containers are a coherent abstraction really hinders developers from understanding how their application is deployed.
OCI “docker” containers are, at this point, a description of a process. How it’s realized is up to the implementor. runc realizes the container with kernel namespacing and runv realizes the container with hardware virtualization.
Both if implemented to spec will be logically equivalent and drop-in replacements for one another.
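For the curious, that "description of a process" is literally a JSON config plus a root filesystem, and runc will run one directly; a hedged sketch (paths and names are illustrative):

    # Hedged sketch: running an OCI bundle directly with runc; paths/names illustrative.
    mkdir -p bundle/rootfs
    docker export $(docker create busybox) | tar -C bundle/rootfs -xf -
    cd bundle
    runc spec            # generates the config.json describing the process
    sudo runc run demo   # runc realizes it with namespaces; runv would use a lightweight VM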
I don't know if that is the most important reason it is not equivalent. It also doesn't have any system processes; there is no systemd, sshd, no crond, etc. It doesn't need its own firewall rules configured or its own security managed. I could go on but I think you already get the point.
I agree with the other commenter, this is an important distinction. It has no hypervisor and is really just a normal process using standard kernel features: cgroups (resource limiting) and namespaces (resource isolation). It's really not so different to chroot.
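You can get surprisingly close to a "container" with nothing but those standard kernel features; a minimal sketch, assuming an extracted root filesystem already sits at /srv/rootfs:

    # Hedged sketch: a bare-bones "container" from namespaces + chroot.
    # Assumes an extracted root filesystem at /srv/rootfs.
    sudo unshare --pid --fork --mount --uts --ipc --net \
        chroot /srv/rootfs /bin/sh
    # Resource limits come from cgroups, the same mechanism Docker uses, e.g.:
    sudo systemd-run --scope -p MemoryMax=256M -p CPUQuota=50% -- sleep 600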
Containers solve the dependency problem by simply pretending it doesn't exist.
I used to do UNIX integration work in the late 1990's early 2000's and containers weren't really a thing. So you had to make sure libs from one program didn't crap on another program. And developers had to be conscious of what dependencies they included in their code. Nowadays they don't have to care as much because of containers. Every program can have its own dependencies. Thereby solving the integration problem.
A better solution would be to actually integrate programs and their dependencies into working systems, but no one has time for that. Software bloat is fine. Computers are cheap and fast. And actually understanding what we're doing would be too expensive. So just wrap all your half-finished crapware up in a giant black box and dump it on a server.
I'm very interested in understanding what I'm doing and what I'm bundling into my software.
What I'm not interested in, is this kind of walking uphill in the snow both ways:
> So you had to make sure libs from one program didn't crap on another program. And developers had to be conscious of what dependencies they included in their code.
ie. having to understand what everybody else is doing in order for my software to run properly. No thanks. That's not why I'm here.
I'll put the exact dependencies I want, in the versions which work best for my software, into a Docker image or whatever tool offers a similar level of isolation, and I'll be working on my code while everybody else spends their time fighting over the ABI compatibility of C system libraries.
> Horizontal scalability is simply a band-aid for poor engineering
And don't even get me started on having instances labeled "large" that have less memory and CPU capacity than my personal backup laptop (currently on loan to my 8yo for COVID reasons)...
But that doesn’t make any sense. We’re not talking about physical hardware we’re talking about tiny tiny slices of it. When VMs are the logical isolation boundary in your infra they get really small — 512 MB is a lot of memory for a single purpose server.
> 512 MB is a lot of memory for a single purpose server.
Maybe. But when the time comes 512 MB doesn't seem like much anymore, what do you do? Do you pick the next larger instance or do you split the load across more 512 MB slices of a computer?
Wrt horizontal scaling: do recall that the motivation for this strategy at Google was cheap-as-possible servers that failed constantly. Back when they built racks using Legos, the problem was hardware reliability. They had to 'scale' horizontally for reliability (given cost constraints) as much as for load.
People have since bought into the marketing reasons for 'being in the cloud' and having 'infinite scalability' but that largely misses the point (and the pain) that caused many of these technologies and patterns to be developed in the first place.
> Horizontal scalability is simply a band-aid for poor engineering in most (not all) applications.
Maybe if your app handles < 10k concurrent connections. Otherwise it is the most cost efficient solution and exists because it solves the scaling problem in the best way as of today.
When you approach the limits of what your kernel can handle, then it may be time to split your workload across boxes or to carve smaller boxes out of your metal (and probably directly attach NICs to the VMs so the host OS doesn't have to deal with them). Making your workload horizontally scalable is always a sound engineering choice.
But...
By splitting a horizontally scalable workload across a dozen virtual servers that are barely larger than the smallest laptop you can get from Best Buy, you are just creating self-inflicted pain. Chances are the smallest box you can get from Dell can comfortably host your whole application.
The fact remains the odds of you needing to support more than 10K simultaneous connections are vanishingly small.
> The point being that masses of software is developed every day on a cargo-cult adoption of solutions they do not require.
This is certainly true, but there is a possible benefit: standardization. Having a standard skillset allows employees greater flexibility since they can jump employers and still expect to be rapidly useful. Similarly, if your company uses a standard toolkit, there's going to be less training overhead for new hires. Now, the devil is in the details, and I'm inclined to agree that you'd be better off hiring someone that can think outside the box and keep the tooling simpler. But using the standard toolkit will work reasonably well across several orders of magnitude in scale.
10k? It's not 1999 anymore. Look at Netflix to see the state of the art in saturating NICs with commodity hardware and off the shelf NGINX plus FreeBSD.
10k concurrent idle connections is no problem, but 10k rps is a decent amount of traffic. What does a typical CRUD app see? It really depends on what software you're using too. You can do >10k rps with a DB read/write on every request with PHP on a low-end server, but if you throw a heavy framework into the mix then that would not be possible.
Agree with you. When you say cost efficient, it means that we can scale out our poorly written, slow software to more servers to handle increased traffic, instead of hiring more expensive engineers and rewriting the software properly so that it could handle the increased load from a single instance.
I get where you're coming from, but it turns out that plopping the entire VM into a zipfile ends up being a good way to have the kind of reproducibility that makes your operations sane. Do you pin specific versions of dependencies for your install scripts? Might as well pin the whole image. It's like 88% of the benefit of reproducible builds, at 3% of the cost, and that's not nothing.
Mind you, that's still Docker, though, not Kubernetes.
> If one of your consultants or developers tells you they can make your application faster by adding a bunch of additional computers, you are almost certainly getting taken for a ride.
Eh. There's a redundancy play in there somewhere too, if you know how to pull it off. (Big if.)
Horizontal scaling brings down total compute use if you have many distinct uses.
Imagine if your only unit of compute is a single bulky machine. You don't fully saturate it, but you need a second machine to avoid downtime anyway. Now you spin up a second or third service and suddenly you need five or ten machines and your compute utilization is 20%. You can pack things in tighter, but then you have a knapsack problem, and that's easier to solve efficiently with many small blocks, even if it costs you a 1% overhead or whatever.
> Plopping an entire VM into a zipfile and saying "here's your software" felt like lazy engineering to many of us at the time (and still today).
I've seen this (a long time ago) in the education market. Very small school with a STEM program. They had specific scientific software they wanted undergrads to use (some of it was pretty proprietary and used to interface with lab equipment) + a pre-configured IDE.
Instead of going through the compatibility matrix of OSes and their versions, they just gave all students a VM image that would "just work". Everyone could bring in their own devices, and as long as you could run a hypervisor everything would "just work".
Horizontal scaling is not a _performance_ tactic, but rather about availability and cost. Having higher availability means trading off consistency, or in other words, using distributed systems.
Also, you cannot elastically scale vertically without also scaling horizontally.
In other words, horizontal systems are cheaper if you have fluctuations in traffic.
Stack Overflow has always run on a couple of IIS instances. If you’re not bigger than them, don’t worry.
You can pretend to be Netflix or Google, and build your tech-stack like they do. Or you can stop wasting your resources setting up a tech-stack that you’re never going to get a return on investment from.
Why a few though? Couldn't they do with just one? That would make people on HN even happier, it seems.
Stack Overflow is not a unit of measurement that anyone would be able to take seriously or find useful. How many Stack Overflows is one Asana? Or how many Stack Overflows is one Trello?
Horizontal scaling, docker, and K8s have many benefits that are obvious to the industry. You don't need to be Google to deploy and use them. If you deploy one server for each app and each team vs deploying a common K8s cluster, where is the higher investment? You claim more ROI with more physical hardware and more servers?
> Horizontal scaling, docker, and K8s have many benefits that are obvious to the industry.
Which is why SO runs on more than one IIS...
You don’t need a tech-stack that is apparently too complex even for Google (considering the article) just to scale horizontally.
> If you deploy one server for each app and each team vs deploying a common K8s cluster where is the higher investment?
The investment comes from the complexity. We’ve seen numerous proofs of concept in my country, and in my sector of work, where different IT departments spent 2-5 full years’ worth of man-hours trying to adopt a perfect devops tech-stack.
Maybe that’s because they were incompetent, and you’re free, and possibly right, to claim so, but that’s still professional teams expending real-world resources and failing.
From a management perspective, and this is where I’m coming from much more than a technical perspective mind you, the most expensive resource you have is your employees. If software is so complex that I need one or two full-time operators to run it, well, let’s just say I could run more than a million Azure web apps and have our regular Microsoft-certified operators handle it.
> You claim more ROI with more physical hardware and more servers?
I haven’t owned my own iron since 2010. All our on-prem servers, and we still do have those, are virtual and running on rented iron.
I think we may be speaking past each other though. My point is financial and yours appear to be mostly technical. If you can set up and run your K8s without expending resources, then good for you, a lot of companies and organisations have proven to be unable to do that though, and in those cases, I think they would’ve been better off not doing it, until they needed to.
Kubernetes is not too complex. There are things to learn no doubt, but it's easy to reason about once you cross the initial learning curve.
Of course transitions can fail. People can think "yeah, let's do this small thing" and end up biting off a much bigger problem than they thought they were getting into. But that problem exists across the whole of tech. "Let's just use our present people and switch from all proprietary to all open source in 3 months"... yeah, best of luck with that. You need a solid team; going all in on K8s is hard, and you need technical talent and leadership to drive it.
Agreed, it may not be for everyone. The benefits are both technical and financial: fewer compute resources used, more reliable deploys, more resilient services. The problems being solved by this are not trivial. There are tangible benefits. Is it a risk? Of course it is. The risk is not in the technology, the risk is in the competence of the team deploying it. If a team can't change and adapt, maybe a lot more fundamental things need to change in that organization than just deploying a new orchestration layer.
My only point is, this shift from dedicated servers to VMs and now to containers is a fundamental shift in how things are done. People can hate on it all they like, but it's a better way of doing things and everyone will catch up eventually.
I shall not express any opinion on this topic other than to say that Trello is not a good example to bring up. The entire customer base of Trello is not using a single shared board and thus they could scale in any direction they wanted to maximize ROI.
"You can handle millions of simultaneous clients without blinking. Who even has that many total customers right now?"
Apparently Netflix has that many customers. Then again, if you split Netflix into regions and separated the account logic from the streaming, the recommendations engine, and the movie content, you could perhaps run the account logic for one region on one server.
Of course, everything runs on servers. The question is who owns and maintains the hardware.
The link you shared just says they manage their OS layer, and of course they do. Everyone running on AWS VMs is responsible for their own OS layer. Whether they want precise control over their OS doesn't change their preference for who owns and manages the hardware.
I suppose your apps ran on Windows. It isn't a problem (or at least it's a smaller problem) in the Windows ecosystem, especially the enterprise one, since usually you're on the same version and the installation steps are handled by infra with AD. Even without AD, the installer and Windows version are usually the same, and Microsoft is usually great at backwards compatibility.
But that's not the case on Linux, or at least non-enterprise / non-LDAP Linux. Installing mysql/redis/elastic/dotnet core/etc. may be different on each machine, with different installers available.
With docker I just need to instruct them to install docker, set up docker compose, and everything is handled via containerization.
Agreed to an extent. Even outside RHEL, you have yum for RPM-based distros, Aptitude for deb-based distros, pacman for Arch, etc.
You need to start with the same base operating system, and you need to make sure that you pull in the same versions of packages in case there are backwards-incompatible bugs, or version bumps such that the dynamically-loaded library is no longer detected (hence the common albeit dangerous workaround of "add a symlink").
(If you're using rpm directly, then you need to bundle the actual packages that you're installing, or point to specific packages that you're confident won't change. And at that point, what's the difference between your approach and Docker?)
The challenge that I believe Docker solves (or, at least, attempts to) is environment reproducibility: without it, you have dependency hell.
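For comparison, the non-Docker version of "pull in the same versions of packages" looks something like the following (the package and version string are illustrative), and it has to be repeated and kept in sync on every host:

    # Hedged sketch: pinning a package version without Docker; the exact version
    # string is illustrative and distro/release specific.
    apt-get install -y mysql-server=8.0.36-1debian12
    apt-mark hold mysql-server   # keep apt from "helpfully" upgrading it later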