> It's also kind of the wrong tool for the job anyways. I'd love to see tools that are more built to purpose for what you are _actually doing with the servers_ instead of giving people a REPL and telling them to go nuts.
On the other hand, if you knew what I'd need to do with the servers, I probably wouldn't need to do it. It's really hard to investigate problems without tools. Nobody is going to make a graph for how much CPU poorly configured docker/datadog are wasting rereading the same 1GB of JSON logs once or twice a minute, because nobody thought it would be meaningful. But watching top in one window, running lsof at the right time to see what's being read, and confirming with whatever performance tools are hanging around will find it.
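For what it's worth, here's roughly the kind of throwaway script I mean, as a minimal sketch and not anything polished. It assumes Linux, where `/proc/<pid>/io` exposes an `rchar` counter (bytes passed through read calls, page cache included, which is what matters when something keeps rereading the same file). The function names are made up for illustration.

```python
import os


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def snapshot(proc_root="/proc"):
    """Map pid -> rchar for every process whose io file we can read."""
    result = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, name, "io")) as f:
                result[int(name)] = parse_proc_io(f.read()).get("rchar", 0)
        except OSError:
            continue  # process exited, or we lack permission to read it
    return result


def read_deltas(before, after):
    """Bytes read per pid between two snapshots, largest first."""
    deltas = {pid: after[pid] - before[pid] for pid in after if pid in before}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
```

Call `snapshot()` twice, a few seconds apart, and hand both results to `read_deltas`; whatever lands at the top is the pid to point lsof at. Without root you will only see your own processes.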
Ah, the dream of easy-to-administer, perfectly adapted, fine-grained access permissions that can predict exactly what an authorized user will need to run and do.
It never works, because nobody has precog, and what you get instead is ticket walls, reviews every time you need to escalate beyond policy, and other bureaucracy.
The things they end up replacing ssh with (teleport and aws ssm are the two I've been forced to use) are slow and spotty, crash frequently, and have who knows what vulnerabilities.
If all these companies trying to get rid of sshd want to slap on their own daemon to replace it, why not keep sshd and simply have a daemon that manages the keys, instead of throwing out the entire toolchain?
The most fun thing with teleport and aws-ssm: to use it you had to bring up a web page to authenticate. There goes any hope of automation at scale. Because in a world where you're supposed to treat servers like cattle, let's impose (and I mean impose) a regime that forces you to access them one at a time manually, especially in critical situations.