
Forgive my ignorance but why isn't SCTP more frequently used in DCs? I know it misbehaves with home routers etc. but shouldn't be a factor here.


I suspect there are a couple of contributors.

TCP is prevalent on the internet, so you need fairly strong motivation and benefits to adopt a second protocol. A lot of engineers also don't understand the underlying networking, so one of the successes of TCP is that it's a file descriptor you either write to or read from, and magic makes the data come out the other side. I've seen tech leadership on networking-centric products who knew nothing more than that. Even among implementations that use SCTP, I've seen products that use only a single stream and mark every message as requiring in-order delivery, so they were effectively getting what TCP offers, just over SCTP.
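To make the single-stream point concrete, here is a toy model (not a real SCTP stack) of per-stream ordered delivery. The function name `deliverable` and the `(stream_id, seq, payload)` tuple layout are made up for illustration. It shows that when one message is missing, a single ordered stream holds back everything behind it, while separate streams only hold back the affected flow.

```python
def deliverable(arrived, ordered=True):
    """Return the payloads an application could read, in order.

    arrived: list of (stream_id, seq, payload) tuples in arrival order.
    Each stream numbers its messages from 0. With ordered=True a message
    is held until every earlier message on the SAME stream has arrived.
    """
    next_seq = {}   # stream_id -> next sequence number we may deliver
    buffered = {}   # stream_id -> {seq: payload} waiting for a gap to fill
    out = []
    for stream, seq, payload in arrived:
        if not ordered:
            # Unordered delivery: hand it to the application immediately.
            out.append(payload)
            continue
        buffered.setdefault(stream, {})[seq] = payload
        expected = next_seq.get(stream, 0)
        while expected in buffered[stream]:
            out.append(buffered[stream].pop(expected))
            expected += 1
        next_seq[stream] = expected
    return out


# Three logical flows A, B, C; message A0 was lost and has not arrived yet.
# Everything on one stream: A0 would have been seq 0, so all else waits.
single_stream = [(0, 1, "B0"), (0, 2, "C0"), (0, 3, "A1"), (0, 4, "B1"), (0, 5, "C1")]
# One stream per flow: only flow A (stream 0) is stuck behind the gap.
multi_stream = [(1, 0, "B0"), (2, 0, "C0"), (0, 1, "A1"), (1, 1, "B1"), (2, 1, "C1")]

print(deliverable(single_stream))  # nothing deliverable until A0 is retransmitted
print(deliverable(multi_stream))   # B and C flows keep moving
```

The single-stream case is the TCP-like behaviour described above: one gap stalls every message behind it.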

At the time TCP was also far higher performance than SCTP. This wasn't so much a protocol thing, but because TCP was getting more engineering attention, it got a lot more scheduler optimization, kernel optimizations, and hardware offload support. So in many ways I think TCP scaled better due to these optimizations, which work both on the internet and internally. And then for multi-path, most data centers didn't get truly isolated networks. So if I'm running a mixture of TCP and SCTP, I still need L2 failover everywhere, which means my multi-homed SCTP connection isn't actually path diverse. And then where beneficial over the internet, there are a few success cases of using multipath TCP extensions.

SCTP is still used quite a bit in the telco networks, but due to the above, it was quite a waste of time.


How does your theory that the failure of SCTP is because a) people don’t understand networking and b) tcp eats up all the development oxygen explain QUIC?

I’m also not sure what you mean, but AFAIK the majority of DCs within a major cloud provider run truly isolated networks interconnected directly with fiber.

If you haven’t yet, I would recommend reading the very original QUIC paper. It was extremely astute and showed quite a deep understanding of what the problems were with TCP, written by network engineers who really knew their shit (I got to interact with some of them when I was at Google). They talk about the failures of SCTP on technical levels and about non-technical headwinds that weren’t accounted for, like ossification. To my knowledge QUIC is SCTP 2.0 - it provides many of the same features, and in a way that could actually leave the lab.


> How does your theory that the failure of SCTP is because a) people don’t understand networking and b) tcp eats up all the development oxygen explain QUIC?

I think this is the motivation side of the argument. SCTP doesn't provide any advantage internally for most use cases, for the reasons I outlined above. QUIC, on the other hand, is an attempt to solve a completely different set of problems, and it is getting the engineering dollars to deploy because, where latency and the internet come into play, there is a strong motivation to be faster. It also offers more of an upgrade path.

> I’m also not sure what you mean but DCs within a major cloud provider are majority AFAIK running truly isolated networks interconnected directly with fiber.

Sorry about being unclear, I typed that out pretty quickly. One of the main factors that drove telecoms to create and adopt SCTP is the way they like to interconnect with each other. For signaling traffic (messages like "I want to set up a new phone call"), the telcos like to set up multiple independent connections. So with SCTP, they want multi-path support, where each server advertises a list of IP addresses for the connection. So between two telcos, you have a dedicated non-internet connection A, and a diverse network B. Equipment that communicates on these networks is then physically plugged into both networks. This creates a need for a protocol that understands this, so that when a failure occurs in transmitting on the A network, retransmission occurs on the B network. The idea is that these are diverse networks, and nothing can really interact with both at the same time (that's the theory; in practice there be stories).
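A rough sketch of that A/B failover idea, as a toy model rather than a real SCTP implementation. The class `ToyAssociation`, the `transmit` callback, and the example addresses are all hypothetical. The `path_max_retrans` threshold is loosely modelled on SCTP's per-path retransmission limit, and heartbeats and path recovery are deliberately left out.

```python
class ToyAssociation:
    """Toy model of SCTP-style multi-homing; not a real SCTP stack."""

    def __init__(self, peer_addresses, path_max_retrans=2):
        if not peer_addresses:
            raise ValueError("need at least one peer address")
        self.paths = list(peer_addresses)          # addresses the peer advertised
        self.errors = {a: 0 for a in self.paths}   # consecutive failures per path
        self.active = {a: True for a in self.paths}
        self.primary = self.paths[0]
        self.path_max_retrans = path_max_retrans

    def _candidates(self):
        # Primary path first (if still active), then the other active paths.
        first = [self.primary] if self.active[self.primary] else []
        rest = [a for a in self.paths if a != self.primary and self.active[a]]
        return first + rest

    def send(self, message, transmit):
        """Try each active path in turn and return the address that worked.

        transmit(address, message) -> bool is a hypothetical hook standing in
        for "put it on the wire and wait for an acknowledgement".
        """
        for addr in self._candidates():
            if transmit(addr, message):
                self.errors[addr] = 0
                return addr
            self.errors[addr] += 1
            if self.errors[addr] > self.path_max_retrans:
                self.active[addr] = False  # stop trying a path that keeps failing
        raise ConnectionError("no path delivered the message")


# Network A (10.1.x.x) is down; network B (10.2.x.x) still works.
def a_network_down(addr, msg):
    return addr.startswith("10.2.")

assoc = ToyAssociation(["10.1.0.5", "10.2.0.5"], path_max_retrans=1)
print(assoc.send("call setup", a_network_down))  # retransmitted on the B address
```

The application only ever sees one association; the choice of which network carries a retransmission is made underneath it.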

Where this maps to data center networks: to my knowledge, most data center networks are not designed as separate A and B networks for diversity, which is where you would need multipath TCP or SCTP. And if you want to run TCP alongside SCTP, you're going to design the network to support all the failovers and redundancy needed to deliver TCP anyway.

So that's what I was trying to get at: a big adoption driver, and much of the protocol's complexity, is the multi-path support, and fully utilizing it requires additional engineering effort in the data center.


Same reason Homa isn't used: software isn't written for it.

With SCTP there's also a significant performance impact because many drivers for the protocol are far from optimised, because very few applications use it, because of its performance implications, because very few programs use it, etc. etc.

There are also firewall issues: big firewall vendors just don't play nice with anything that's not a variant of HTTP(S). You still need some kind of firewall in a datacenter and it'd be foolish to set up two different ones for internal and external networking. Protocol ossification is real and if you use any external piece of firewall kit, you're sure to run into problems if you try to use "novel" protocols like SCTP. Hell, you'll be lucky to get good IPv6 support.

You can write your own access control if you want but that's often perceived as more expensive than buying a box, especially if the box companies find their way into a meeting with management.

Lastly, there's education. A shocking number of developers have no idea how networking works. They probably know there are protocols like UDP and TCP, but their role and inner workings are often glossed over in my experience. Practical networking courses seem to treat the network as some kind of black box where bytes and IP addresses go in and response data comes out. If developers do know their basic networking, that information is often out of date; people don't seem to realise how often TCP gets tweaked to behave slightly differently to improve performance. Ask your average dev something about IPv6 and I doubt they'll know much more than "it's IPv4 with more bits" because networking simply doesn't come up that often.

In the end, it comes down to tradeoffs, experience, and decisions. Feel free to write SCTP code for your server products where you can; the protocol definitely solves many issues people run into with TCP, but you'll probably have to defend your use of something unfamiliar to many developers every step along the way. The same is true for protocols like QUIC (outside the HTTP(S) environment), which tries to solve a whole lot of layer 3 to layer 5 problems in a single protocol that's designed to play nicely with shitty middleboxes by building on UDP.


Software -- legacy software, which is always all software currently in use, which is an enormous code base.

It would be easier to have a drop-in replacement for TCP that, whenever it can work, connect() will use it, and which listen()/accept() will work with as well as TCP. Then all apps that can use TCP could use the new transport transparently.
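As a very rough user-space approximation of that idea, here is a sketch of a connect helper that tries a newer transport first and quietly falls back to TCP. The function name `connect_with_fallback` is made up, and SCTP (one-to-one style) is used only as a stand-in for "the new transport"; a truly transparent replacement would have to live below the sockets API, not in a wrapper like this. It assumes an IPv4 address and a platform where Python exposes `socket.IPPROTO_SCTP`, and it skips SCTP where it is missing.

```python
import socket


def connect_with_fallback(host, port, timeout=2.0):
    """Return (transport_name, connected_socket), preferring SCTP over TCP.

    Illustrative only: tries the transports one after another, so a dead
    first choice costs up to `timeout` seconds before TCP is attempted.
    """
    candidates = []
    sctp = getattr(socket, "IPPROTO_SCTP", None)  # not defined on every platform
    if sctp is not None:
        candidates.append(("sctp", sctp))
    candidates.append(("tcp", socket.IPPROTO_TCP))

    last_err = None
    for name, proto in candidates:
        try:
            # SOCK_STREAM keeps the familiar connect()/send()/recv() shape.
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
        except OSError as err:
            last_err = err  # e.g. the kernel has no SCTP support
            continue
        try:
            sock.settimeout(timeout)
            sock.connect((host, port))
            return name, sock
        except OSError as err:
            last_err = err
            sock.close()
    raise last_err
```

The caller gets back an ordinary socket object either way, which is the point: existing code that only reads and writes does not need to know which transport it ended up on.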

Basically, we need a TCP++ that works with existing APIs but which can also provide new functionality via new APIs.

Of course, backwards-compatibility is very limiting, which sucks.

We can also have new transports that have new APIs, but we need a better TCP for backwards compatibility because legacy is forever.

Also, the focus on RPC is cool because any protocol where you typically have a library doing the I/O -and not too many such libraries- is amenable to using the new thing, and that includes HTTP (which isn't an RPC). But TFA really needs to mention HTTP in the same breath as RPC, because -sadly- way too many readers will just close the tab as soon as they see "RPC" and not "HTTP".



