Don’t the hyperscaled cloud providers run totally segmented networks? What’s stopping them from using something proprietary internally and just exposing TCP at the end for termination of client connections?
I’m not aware of them using anything other than TCP internally (they may well have migrated to QUIC by now, but I’m not sure QUIC solves the scaling challenges or optimizes for gRPC and low latency).
Google is using remote memory accesses rather than TCP for at least some classes of traffic (e.g. a caching system). They've been publishing details about how it all works too.
Also, they have a transport (Pony Express) developed specifically for RPCs, rather than for byte streams or datagrams.
I could be wrong, but I believe they have a unified address space, with dedicated hardware that owns a given memory range. On an access, that hardware fetches the data on demand from the remote node matching the address and stores it in real memory allocated to it, presumably evicting entries when there’s insufficient space. Once the data is brought over, either a virtual address range is remapped to point to main memory, or the ASIC has a TLB of its own.
This is pure speculation based on seeing the word ASIC in one of the summaries but it seems like it could be reasonable.
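To make the speculation above concrete, here is a toy sketch of that design: a unified address space where each address range is owned by one node, remote accesses are fetched on demand, and fetched values are cached locally with LRU eviction. All class and method names here are invented for illustration; this is not Google's actual design or API.

```python
from collections import OrderedDict

class RemoteMemory:
    """Stand-in for another node's memory, keyed by address (hypothetical)."""
    def __init__(self, data):
        self.data = data  # address -> value

    def fetch(self, addr):
        return self.data[addr]

class UnifiedAddressSpace:
    """Routes loads either to local memory or to the remote node owning
    that address range, caching remote values with LRU eviction
    (loosely analogous to a TLB/cache on the ASIC)."""
    def __init__(self, cache_capacity=2):
        self.local = {}
        self.remotes = []           # list of ((lo, hi), RemoteMemory)
        self.cache = OrderedDict()  # LRU cache of fetched remote values
        self.capacity = cache_capacity

    def register_remote(self, lo, hi, remote):
        self.remotes.append(((lo, hi), remote))

    def load(self, addr):
        if addr in self.local:
            return self.local[addr]
        if addr in self.cache:
            self.cache.move_to_end(addr)  # mark recently used
            return self.cache[addr]
        for (lo, hi), remote in self.remotes:
            if lo <= addr < hi:
                value = remote.fetch(addr)  # on-demand remote fetch
                self.cache[addr] = value
                if len(self.cache) > self.capacity:
                    self.cache.popitem(last=False)  # evict LRU entry
                return value
        raise KeyError(f"unmapped address {addr:#x}")
```

So a load to an address the node doesn't own transparently turns into a remote fetch, and repeated accesses hit the local cache instead of the network.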
I don't think Google-internal communications happen over gRPC. Maybe the protocol was designed with an ambition to replace their internal RPC system, but it probably failed at that.
They have a new system called Snap, although judging from the paper I don't think it can completely replace TCP: https://research.google/pubs/pub48630/ My understanding is that Snap enables new use cases, including moving functionality previously done via RPCs to RDMA-like one-sided operations. I think it complements RPCs rather than replacing them.
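The distinction between the two styles can be shown with a toy model: in a two-sided RPC the server's CPU runs a handler for every request, while an RDMA-like one-sided read lets the caller pull data straight out of the server's memory without involving a handler. This is purely illustrative and not Snap's actual API; the names are made up.

```python
class Server:
    def __init__(self):
        self.memory = {"counter": 7}
        self.handler_calls = 0  # counts how often the server CPU ran code

    # Two-sided RPC: every request executes a handler on the server.
    def rpc_get(self, key):
        self.handler_calls += 1
        return self.memory[key]

# One-sided read: the "NIC" reads server memory directly; no handler
# runs, so the server's CPU is not involved in the data path.
def one_sided_read(server, key):
    return server.memory[key]  # server.handler_calls stays unchanged
```

The one-sided path is why this model can offload work from server CPUs, but it only fits operations expressible as raw reads/writes, which is consistent with it complementing RPCs rather than replacing them.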