Weird editorialization. I included a TL;DR at the top, you could have just copy-pasted it.

"Docker-compose is a tool for working with Docker containers. It solves very real problems with deploying complex applications. By itself it is not enough to make self-hosting applications simple enough for the mass-market."


It looks like https://dockerswarm.rocks now says the site is deprecated.

https://dockerswarm.rocks/swarm-or-kubernetes/ says "it's not sensible to build a new product using Docker Swarm Mode"


That's indeed the opinion of the author. Note, however, that at this time all the elements used in the setup described on dockerswarm.rocks are still maintained. I started using Swarm in 2022 and documented my decision [1], and my reasoning, for my kind of needs, has not changed. The investment is very low, as is the risk. Migrating away from Swarm should not be very problematic for me, and in the meantime I enjoy an easy-to-maintain setup. I still think it's better than tweaking a maybe-working solution with Compose.

I'm not expecting to convince anyone, but I wanted to share an alternative approach (only applicable to certain setups).

[1] https://www.yvesdennels.com/posts/docker-swarm-in-2022/


Interesting, I wasn't aware that Traefik could do this without significant modification to the docker-compose configuration provided by the application developer. I also thought that Traefik required some sort of higher-level container orchestration like Docker Swarm or Kubernetes.

I'll have to look into that.


These are great points, and probably worth their own blog post to answer.

> First of all, why is this a concern? Idle threads have basically no impact on a system, so this generally isn't a concern

Idle threads have very low impact on CPU utilization, probably, if the application is well-behaved (and I expect most databases and caching layers to be well-behaved in this way). The application itself, however, still needs memory, and the way containers are built prevents the usual de-duplication of system libraries.

> but generally it won't even make the list of top-100 performance bottlenecks for most applications

True, but it makes the short list of "how much stuff can I run on a $100 computer", and it's one of the relatively few concerns an application operator has when they are not the application developer.

> Even if you wanted to enforce your applications sharing a postgres instance under the hood, why would you want that to be black-magic performed by the container orchestrator?

To make self-hosting much simpler. If the container orchestrator doesn't do it, what do you think should do it?

> Other stuff like DB backups just don't seem like issues docker compose users have. If you need to orchestrate across multiple nodes in order to meet your SLOs, then don't use docker compose.

The DB backups are meant for disaster recovery rather than supporting multiple nodes. I guess that's multiple nodes through time... But, yeah, I agree, docker-compose is not a good fit.

> Finally, it seems like the actual solution is significantly under-discussed. I both have tons of questions about how it's supposed to work, and I see lots of shortcomings with the parts that I do understand.

Yeah, agreed, I'll be writing other things to discuss what I think the correct solution should be. I'm curious to find out if other people have existing solutions to the problems I outlined. If it's a solved problem and I just don't know about it, that'd be better.

> I'd be interested to read an article which tried to articulate why such an opinionated API would improve SDLC-considerations over docker-compose, but I don't think that's the article I just read.

It is not, and you're right, it needs discussion.


Ty, all good points


It's a bit of a hot take, but you're not wrong. We need something that's less scalability-focused than Kubernetes/Mesos/Docker Swarm but that doesn't put too much burden on application developers. Something that focuses on being secure, reliable, and understandable, in that order. I'm not aware of anything going for that niche. That means a new tool is in order.


I think we need a composable system, but I'm not sure the current frame in which this problem and these tools exist is good enough. We might need to rethink how we handle access and usage patterns. I only have wrong answers. Docker Compose is amazing for a local dev environment; k8s is terrible for production. Those are my experiences with this domain.


> The way they frame that pihole example ("Whew! That’s a lot of stuff!") is just silly.

Yeah, you're probably right. Originally that line was in there when I had a breakdown of what each line in the docker-compose was doing. My editor thought that was unnecessary - it's unlikely people reading the post would need that kind of breakdown. So I rewrote parts to assume more baseline knowledge. I should have noticed that line and taken it out.

You're right about what we're trying to do, and I agree that the post doesn't really help someone be successful today deploying things. The post is more meant to gauge whether or not I'm alone in having pain deploying a couple dozen services with docker compose on a single box.

I want more people to have the power to host their own services. I think we can do that, but we have to figure out the right thing to build to do it.


> I would say if the author was actually interested in solving this problem in a productive way they should first try to see if docker itself is amenable to altering their constructs to provide optional higher abstractions over common concepts via the compose interface natively.

docker-compose is a lot of things to a lot of people. When it was created I doubt anyone realized it would eventually be the de facto standard for deploying to homelabs. It's an amazing tool, but it could be better for that specific use. I don't think that segment is important enough to the team that maintains it to warrant the change you're suggesting.


Yeah, we're very early building this, the blog post is just a way for me to organize my thoughts and start fights online. It's, uh, embarrassingly useful to yell semi-coherent thoughts into the void and have experts yell back with a decade or more of experience and information about tools I haven't heard of.

> I'm quite skeptical that adding a layer of abstraction and switching to TOML instead of YAML will suddenly enable those scared away by compose to start self-hosting, but kubernetes and docker swarm were never in the cards.

Yes, this is an excellent point. I did not articulate it well anywhere, but the goal is for users to have something more like Sandstorm, with a UI to install things. The TOML is for application developers, not end users. It'll either go in a separate database or, ideally, in the source code of the applications to be installed, similar to a Dockerfile. I haven't started yet, but eventually we need to work with application developers to support the things they want and to make it easier to treat Tealok as the "easy option" rather than docker-compose.


Oh, that makes way more sense! Yeah, that actually sounds like it could work well if you can get buy-in from application devs.

The trickiest thing about doing it this late in the game is that docker compose has truly become the standard at this point. I self-host a ton of different apps and I almost never have to write my own docker compose file because there's always one provided for me. At this point, even if your file format is objectively better for the purpose, it's going to be hard to overcome the inertia.


Yeah, I agree, we're going to need a really compelling use-case not just for end users that run the application, but for the application developers as well. Nobody wants to maintain 3+ extra deployment files for the various also-rans competing with docker-compose.

What do you use to manage all those compose files? Do you have off-site backups? I'm constantly reading and re-writing docker-compose files and bash-scripting everything to fit in with the rest of my infrastructure, so it'd be good to hear from someone with a better way.


I have a single GitHub repo that contains all the compose files for my main server. Each application gets a folder with the compose file and any version-controllable configuration (which gets bound to volumes in the docker containers).

I periodically run Renovate [0], which submits PRs against the infrastructure repo on my local Forgejo to update all my applications. I have a script in the repo which pulls the git changes onto the server and pulls and restarts the updated apps.

Data is all stored in volumes that are mapped to subfolders of a ~/data directory. Each application has a Borgmatic [1] config that tells Borgmatic which folder to back up for that app, and tells it to stop the compose stack before the backup and resume it afterwards. They all go to the same BorgBase repository, but I give each app its own config (with its own retention/consistency prefix) because I don't want to have network-wide downtime during backups.
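
For illustration, here's a rough sketch of one of those per-app configs (the paths and repo URL are invented, and the exact keys vary between Borgmatic versions):

    # borgmatic config for a single app (sketch; keys vary by version)
    source_directories:
        - /home/me/data/myapp

    repositories:
        - path: ssh://abc123@abc123.repo.borgbase.com/./repo
          label: myapp

    # Stop the stack for a consistent snapshot, restart it afterwards
    before_backup:
        - docker compose -f /home/me/apps/myapp/docker-compose.yml stop
    after_backup:
        - docker compose -f /home/me/apps/myapp/docker-compose.yml start

    # Per-app retention
    keep_daily: 7
    keep_weekly: 4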

At the moment I run the backup command by hand, with BorgBase configured to send me emails if I forget to do it for a week. Eventually that will be a cron job, but for now it takes less time to just do it myself, and I don't change my data often enough for a week of lost work to hurt much.

All the applications bind to ports which are firewalled, with Caddy and Pihole being the only applications that run on exposed ports (53, 80, 443). Caddy has a wildcard DNS cert from LetsEncrypt for HTTPS and directs traffic from a bunch of local domain names to the correct applications. I just use Pihole to define my local DNS names (custom.list, which is where Pihole keeps the local DNS definitions, is a volume that's committed to the repo).

[0] https://github.com/renovatebot/renovate

[1] https://torsion.org/borgmatic/


I'm actually a fan of Sandstorm, and think it got a lot of things right. I'd love to be able to talk to Kenton Varda about why he thinks adoption was weak. Personally I think it put a bit too much burden on application developers, since it required them to develop applications specifically for Sandstorm.

> I'm skeptical that what the non-technical self-hoster needs is a TOML DSL that abstracts away ports

I fully agree, the end user would not be writing TOML DSL files. The end user would get something much closer to an app store, or what Sandstorm did, with one- (or a few-) click installs. The TOML DSL would be written by developers familiar with the application and stored either in a separate database or, ideally, in the application's source control, like a Dockerfile.


> I'd love to be able to talk to Kenton Varda about why he thinks adoption on it was weak.

Oh hai.

Honestly I'm not sure I'm a reliable source for why we failed. It's tempting to convince myself of convenient excuses.

But I really don't think the problem was with the idea. We actually had a lot of user excitement around the product. I think we screwed up the business strategy. We were too eager to generate revenue too early on, and that led us to focus efforts in the wrong areas, away from the things that would have been best for long-term growth. And we were totally clueless about enterprise sales, but didn't realize how clueless we were until it was too late (a classic blunder). Investors really don't like it when you say you're going to try for revenue and then you don't, so we were pretty much dead at that point.


Oh, hey, holy shit! You're one of my heroes. I've read through your internal discussions on protobuf within Google, you did amazing work there, and held your own against a difficult political environment.

It sounds like you have no criticism of the technical approach, then, but rather just the business mechanics? That's eye-opening, given how much has changed in self-hosted deployment since Sandstorm started. If you started something similar today, ignoring business needs, would you build something technically similar?


Oh, well, thank you!

You should probably take me with a grain of salt: of course I, the technical architect of Sandstorm, still think the technical architecture is great. ;) Whereas I never claimed to be good at business, so I'm not averse to reporting that I am bad at business.

But yeah I still think it's the right architecture. Still believe in what's written here: https://sandstorm.io/how-it-works

But I do think there is a lot of work needed to get to a baseline of functionality that people will use. Bootstrapping a new platform is hard and needs a long runway. Could you find investors willing to back it? I don't know. I hated fundraising, though.

Of course, there are many lower-level details I would change, like:

* Consider using V8 isolates instead of containers. Sandstorm had a big problem with slow cold starts and high resource usage, which V8 isolates would handle much better. It would, of course, make porting existing apps much harder, but ports to Sandstorm were always pretty janky... maybe it would have been better to focus on building new apps that target Sandstorm from the start.

* Should never have used Mongo to store platform metadata... should have been SQLite!

* The shell UI should have been an app itself.

* We should never have built Blackrock (the "scalable" version of Sandstorm that was the basis for the Oasis hosting service). Or at least, it should have come much later. Should have focused instead on making it really easy to deploy to many different VPSes and federate per-user instances.


Hey, blog post author here.

I'm curious how that last sentence was going to end.

Let's say I agree with you and that TLS termination is not a container orchestration responsibility. Where does the responsibility of container orchestration start and TLS termination end? Many applications need to create URLs that point to themselves, so they have to have a notion of the domain they are being served under. There has to be a mapping between whatever load balancer or reverse proxy you're using and the internal address of the application container. You'll likely need service discovery inside the orchestration system, so you could put TLS termination inside it as well and leverage the same mechanisms for routing traffic. It seems like any distinction you make is going to be arbitrary and basically boil down to "no true container orchestration system should care about..."

In the end we all build systems to make people's lives better. I happen to think that leaving backups and port management as an exercise for the deployment team raises the barrier for people who could be hosting their own services.

I could be totally wrong. This may be a terrible idea. But I think it'll be interesting to try.

> If he would have done so, in the very least he would have eventually stumbled upon Traefik which in Docker solves absolutely everything he's complaining about

I'm aware of Traefik, I ran it for a little while in a home lab Kubernetes cluster, and later on a stack of Odroids using k3s. This was years ago, so it may have changed a lot since then, but it seemed at the time that I needed an advanced degree in container orchestration studies to properly configure it. It felt like Kubernetes was designed to solve problems you only get above 100 nodes, then k3s tried to bang that into a shape small enough to fit in a home lab, but couldn't reduce the cognitive load on the operator because it was using the same conceptual primitives and APIs. Traefik, reasonably, can't hide that level of complexity, and so was extremely hard to configure.

I'm impressed by what both Kubernetes and k3s have done. I think no home lab should run them unless you have an express goal of learning how to run Kubernetes. If Traefik is as it was years ago, deeply tied to that level of complexity, then I think small deployments can do better. Maybe Caddy is a superior solution, but I haven't tried to deploy it myself.


If you want an HTTPS ingress controller that's simple, opinionated, but still flexible enough to handle most use cases, I've enjoyed this one: https://github.com/SteveLTN/https-portal
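
A minimal sketch of how that's wired up, assuming an app listening on port 3000 (the domain and app image are invented for illustration):

    services:
      https-portal:
        image: steveltn/https-portal:1
        ports:
          - '80:80'
          - '443:443'
        environment:
          # Map a public hostname to an internal service
          DOMAINS: 'app.example.com -> http://app:3000'
          # Use 'staging' while testing to avoid Let's Encrypt rate limits
          STAGE: 'production'
        volumes:
          # Persist certificates across container restarts
          - https-portal-data:/var/lib/https-portal

      app:
        image: my/app  # hypothetical application image

    volumes:
      https-portal-data:

Certificates are obtained and renewed automatically; the DOMAINS line is essentially the whole configuration.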


> Let's say I agree with you and that TLS termination is not a container orchestration responsibility.

It isn't. It's not a problem, either. That's my point: your comments were in the "not even wrong" field.

> (...) It seems like any distinction you make is going to be arbitrary and basically boil-down to "no true container orchestration system should care about..."

No. My point is that you should invest some time into learning the basics of deploying a service, review your requirements, and then take a moment to realize that they are all solved problems, especially in containerized applications.

> I'm aware of Traefik, I ran it for a little while in a home lab Kubernetes (...)

I recommend you read up on Traefik. None of the scenarios you mentioned are relevant to the discussion.

The whole point of bringing up Traefik is that its main selling point is that it provides support for route configuration through container labels. It's the flagship feature of Traefik. That's the main reason why people use it.

Your non sequitur on Traefik and Kubernetes also suggests you're talking about things that haven't really clicked with you. Traefik can indeed be used as an ingress controller in Kubernetes, but once deployed you do not interact with it. You just define Kubernetes services, and that's it. You do interact directly with Traefik if you use it as an ingress controller in Docker swarm mode or even docker-compose, which makes your remark even more baffling.
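
To make that concrete, here is a minimal compose sketch of label-based routing (the service names and hostname are invented; label keys are per Traefik v2+, so double-check them against the version you deploy):

    services:
      traefik:
        image: traefik:v3.0
        command:
          # Read routing configuration from Docker container labels
          - --providers.docker=true
          # Only route to containers that opt in via a label
          - --providers.docker.exposedbydefault=false
          - --entrypoints.web.address=:80
        ports:
          - '80:80'
        volumes:
          # Traefik watches the Docker socket for containers and their labels
          - /var/run/docker.sock:/var/run/docker.sock:ro

      whoami:
        image: traefik/whoami
        labels:
          - traefik.enable=true
          # A single label routes a hostname to this container
          - traefik.http.routers.whoami.rule=Host(`whoami.example.com`)

The application's service definition is untouched apart from the labels block.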

> I'm impressed at both what Kubernetes and k3s have done. (...) If Traefik is as it was years ago,(...)

Kubernetes represents the interface as well as the reference implementation. k3s is just another Kubernetes distribution. Traefik is a reverse proxy/load balancer used as an ingress controller in container orchestration systems such as Kubernetes or Docker swarm. The "level of complexity" is adding a label to a container.

Frankly, your comment sounds like you tried to play buzzword bingo without having a clue whether the buzzwords would fit together. If anything, you just validated my previous comment.

My advice: invest some time reading up on the topic to get through the basics before you feel you need to write a blog post about it.

