The simple spoiler is that the GPU machines use Cloud Hypervisor, not Firecracke...

yencabulator · on Feb 16, 2024

There has been weirdly little discussion on HN about Cloud Hypervisor. I guess because it's such a horribly bland non-descriptive Enterprise Naming name?

It looks pretty sweet. It's written in Rust and shares libraries with Firecracker and ChromeOS's crosvm, with more emphasis on long-running stateful services than Firecracker has.

https://github.com/cloud-hypervisor/cloud-hypervisor

https://github.com/rust-vmm

nolist_policy · on Feb 16, 2024

Unfortunately, Cloud Hypervisor does not use strong sandboxing/privilege separation like crosvm does.

yencabulator · on Feb 16, 2024

For anyone else wanting to check on the status of this: it seems they're looking at a combination of seccomp, landlock and a systemd service instance per VM, with systemd doing DynamicUser, namespacing, and initial seccomp. Work seems to be happening right now, but of course it's telling and sad that it wasn't part of the original design.

https://github.com/cloud-hypervisor/cloud-hypervisor/issues/...

niz4ts · on Feb 14, 2024

Way simpler than what I was expecting! Any notes to share about Cloud Hypervisor vs Firecracker operationally? I'm assuming Cloud Hypervisor's extra bulk doesn't matter much compared to the latency of most GPU workloads.

tptacek · on Feb 14, 2024

They are operationally pretty much identical. In both cases, we drive them through a wrapper API server that's part of our orchestrator. Building the cloud-hypervisor wrapper took me all of about 2 hours.
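To illustrate why such a wrapper can be small: both VMMs expose an HTTP API over a Unix domain socket, so a wrapper mostly translates "boot this VM" into a short list of requests. What follows is a minimal sketch and not Fly.io's actual wrapper. The names `boot_requests` and `send` are hypothetical, the endpoint paths reflect the publicly documented APIs as I understand them, and the config fields (especially Cloud Hypervisor's `payload` layout) are assumptions that vary by version.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def boot_requests(backend, kernel, rootfs, boot_args):
    """Return the (method, path, body) calls that define and start one VM.

    Hypothetical helper: field names are assumptions, check them against
    the API version you actually run.
    """
    if backend == "cloud-hypervisor":
        config = {
            "payload": {"kernel": kernel, "cmdline": boot_args},
            "disks": [{"path": rootfs}],
        }
        return [
            ("PUT", "/api/v1/vm.create", config),
            ("PUT", "/api/v1/vm.boot", None),
        ]
    if backend == "firecracker":
        return [
            ("PUT", "/boot-source",
             {"kernel_image_path": kernel, "boot_args": boot_args}),
            ("PUT", "/drives/rootfs",
             {"drive_id": "rootfs", "path_on_host": rootfs,
              "is_root_device": True, "is_read_only": False}),
            ("PUT", "/actions", {"action_type": "InstanceStart"}),
        ]
    raise ValueError(f"unknown backend: {backend}")


def send(socket_path, method, path, body=None):
    """Send one request to the VMM's API socket; return (status, text)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        payload = None if body is None else json.dumps(body)
        headers = {"Content-Type": "application/json"} if body is not None else {}
        conn.request(method, path, body=payload, headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.read().decode()
    finally:
        conn.close()
```

With a VMM already listening on its API socket, an orchestrator could loop over `boot_requests(...)` and pass each tuple to `send`. Only the request list differs between the two backends, which fits the "operationally pretty much identical" description.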