> then just set Delegate=yes in the unit file of your manager
In runc we don't have a dedicated manager or long-running daemon. Yes, Docker and cri-o use Delegate=yes (so I am quite aware of this option), but that really doesn't help people who are using runc in their own user sessions, or who wrote their own wrapper and aren't aware of Delegate=yes.
I get that we are quite odd, and don't fit into a system-service model. After all of the back-and-forth with both you and Tejun (especially when it comes to "rootless" delegation -- which systemd only offers if you get a privileged user to delegate for you), I'm not sure that there's much I can do on this topic. I get that what I care about is not something you care about, but I would hope you accept that I'm not just being obstinate for the sake of it.
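To make the "rootless" delegation point a bit more concrete, here is a rough sketch of how an unprivileged tool might check whether someone has actually delegated a cgroupv2 subtree to it. It assumes the kernel's documented model, where delegation means write access to the directory plus its cgroup.procs, cgroup.threads and cgroup.subtree_control files. The helper name `looks_delegated` is made up for illustration (it is not runc code), and this is only a heuristic.

```python
import os

# Files the kernel cgroup-v2 documentation lists as needing write access
# for a subtree to count as delegated (assumes a kernel new enough to
# have cgroup.threads).
DELEGATION_FILES = ("cgroup.procs", "cgroup.threads", "cgroup.subtree_control")


def looks_delegated(cgroup_dir: str) -> bool:
    """Heuristic: can the current user manage cgroup_dir as its own subtree?

    Hypothetical helper for illustration only.
    """
    # We need to be able to create child cgroups in the directory itself.
    if not os.access(cgroup_dir, os.W_OK | os.X_OK):
        return False
    # And we need to be able to write the control files listed above.
    return all(
        os.access(os.path.join(cgroup_dir, name), os.W_OK)
        for name in DELEGATION_FILES
    )
```

Without a privileged user (or a manager running with Delegate=yes) setting this up first, a check like this is going to fail for a plain user session, which is exactly the case I'm complaining about.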
> Not a single API call, no C call, no D-Bus call, nothing.
Right, unless you need to set this up for someone else. And we have code that does this too -- I don't really recommend people use it, but it is necessary (and I'm pretty sure some folks at Red Hat use it based on how many bug reports they submit related to it).
Since systemd is managing the entire cgroupv2 tree (and the fact we can get around that for cgroupv1 appears to be seen as a design flaw by both you and Tejun), obviously we have to talk to systemd to do this type of thing. I just wish this wasn't the way it was done (and if cgroupv2 had a named cgroup concept -- which is what systemd needs for tracking services -- I would think that this wouldn't be such a pain-point).
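For anyone not familiar with what I mean by a "named cgroup": on cgroupv1 a hierarchy can be mounted with no controllers and just a name (systemd's own tracking hierarchy shows up as `name=systemd`), while cgroupv2 only has the single unified hierarchy. You can see the difference in /proc/self/cgroup. Here's a minimal sketch of parsing that file. The sample paths are invented, and `parse_proc_cgroup` and `CgroupEntry` are hypothetical names, not anything from runc.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str

    @property
    def is_unified(self) -> bool:
        # The cgroupv2 entry has hierarchy ID 0 and an empty controller field.
        return self.hierarchy_id == 0 and not self.controllers

    @property
    def is_named(self) -> bool:
        # cgroupv1 named hierarchies appear as "name=<something>".
        return any(c.startswith("name=") for c in self.controllers)


def parse_proc_cgroup(text: str) -> List[CgroupEntry]:
    """Parse 'hierarchy-ID:controller-list:cgroup-path' lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The path may itself contain ':', so split at most twice.
        hid, controllers, path = line.split(":", 2)
        entries.append(
            CgroupEntry(int(hid), [c for c in controllers.split(",") if c], path)
        )
    return entries


# Invented sample of a hybrid (v1 + v2) system, for illustration only.
SAMPLE = """\
12:pids:/user.slice/user-1000.slice
1:name=systemd:/user.slice/user-1000.slice/session-2.scope
0::/user.slice/user-1000.slice/session-2.scope
"""

if __name__ == "__main__":
    for entry in parse_proc_cgroup(SAMPLE):
        kind = "unified" if entry.is_unified else "named" if entry.is_named else "v1"
        print(kind, entry.path)
```

The `name=systemd` line is the kind of thing I wish had an equivalent on cgroupv2: a hierarchy systemd could use purely for tracking, without owning the tree everyone else needs for resource control.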
I guess I'm just annoyed that we can't use "better rlimits" with "rootless" container runtimes because of all of this.
> I wished you'd actually listen to what I keep repeating to you.
I am listening, and I am aware of Delegate=yes and all of that history. But as I outlined above, I don't necessarily agree with it entirely. And unlike a lot of people around here, I don't think any of these pain-points are coming up because of malice or something stupid like that -- I just think we disagree on our priorities.
> We tried to be really nice to container managers, knowing that they dislike systemd APIs, so we put a lot of work in making the delegation boundary clean
Don't get me wrong -- I do appreciate that we have Delegate now (there was a period of several years where "systemd decided to reorganise the cgroup tree, un-containing my containers" happened on several occasions -- and Delegate solved those issues).
And from what I've heard from the LXC folks, you were quite reasonable about getting systemd to work inside LXC. Which is good to hear.
> The LXC/LXD people btw did get this right: they manage their own cgroup subtree now, and systemd doesn't interfere, and they don't link to or do dbus calls into systemd either.
We do basically the same thing. We just don't support cgroupv2.