Hacker News | hazz99's comments

I’m sure this work is very impressive, but these QPS numbers don’t seem particularly high to me, at least compared to existing horizontally scalable service patterns. Why is it hard for the kube control plane to hit these numbers?

For instance, postgres can hit this sort of QPS easily, afaik. It’s not distributed, but I’m sure Vitess could do something similar. The query patterns don’t seem particularly complex either.

Not trying to be reductive - I’m sure there’s some complexity here I’m missing!


I am extremely Not A Database Person, but I understand that the rationale for Kubernetes adopting etcd as its preferred data store was more about its distributed consistency features and less about query throughput. etcd is slower because it's doing Raft things and flushing stuff to disk.

Projects like kine allow K8s users to swap sqlite or postgres in place of etcd, which (I assume, please correct me otherwise) would deliver better throughput since those backends don't need to perform consensus operations.

https://github.com/k3s-io/kine
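
For a sense of where etcd's per-write cost comes from, here's a minimal Go sketch against etcd's clientv3 (the endpoint and key are illustrative, not taken from the article). Each Put is acknowledged only after the Raft leader has replicated the entry to a quorum and fsync'd it, which is exactly the overhead a single-node SQL backend avoids:

    package main

    import (
        "context"
        "fmt"
        "log"
        "time"

        clientv3 "go.etcd.io/etcd/client/v3"
    )

    func main() {
        // Connect to a local etcd member (endpoint is illustrative).
        cli, err := clientv3.New(clientv3.Config{
            Endpoints:   []string{"localhost:2379"},
            DialTimeout: 5 * time.Second,
        })
        if err != nil {
            log.Fatal(err)
        }
        defer cli.Close()

        ctx, cancel := context.WithTimeout(context.Background(), time.Second)
        defer cancel()

        // The write is acked only after Raft replication to a quorum
        // plus a WAL fsync; this is the per-write consensus cost that
        // a non-consensus backend like sqlite/postgres doesn't pay.
        start := time.Now()
        if _, err := cli.Put(ctx, "/registry/pods/default/example", "spec"); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("consensus write took %v\n", time.Since(start))
    }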


You might not be a database person, but you’re spot on.

A well-managed HA PostgreSQL setup (active/passive) is going to run circles around etcd for kube control plane operations.

The caveat here is increased risk of downtime and a much higher management overhead, which is why it's not the default.


There are also distributed databases that use Raft and can still scale; delivering distributed consensus is not a challenge that can't be solved. For example, TiDB handles millions of QPS while delivering ACID transactions: https://vivekbansal.substack.com/p/system-design-study-how-f...


GKE uses Spanner as an etcd replacement.


But, and I'm honestly asking: as a GKE user you don't have to manage that Spanner instance, right? So you should in theory be able to just throw higher loads at it and Spanner should autoscale?


Yes, from the article:

> To support the cluster’s massive scale, we relied on a proprietary key-value store based on Google’s Spanner distributed database... We didn’t witness any bottlenecks with respect to the new storage system and it showed no signs of it not being able to support higher scales.


Yeah, I guess my question was a bit more nuanced. What I was curious about was whether they were fully relying on the normal autoscaling that any customer would get, or manually scaling the Spanner instance in anticipation of the load. I guess it's unlikely we're going to get that level of detailed info from this article, though.


It's not really bottlenecked by the store but by the calculations performed on each pod schedule/creation.

It's basically "take the global state of node load and capacity, pick where to schedule it", and I'd imagine it's probably not running in parallel because that would be far harder to manage.


Not a k8s dev, but I feel like this is the answer. K8s isn't usually just scheduling pods round-robin or at random. There's a lot of state to evaluate, and scheduling pods becomes an NP-hard problem similar to the bin packing problem. I doubt the implementation tries to be optimal here, but it feels like a computationally heavy problem.


In what way is it NP-hard? From what I can gather, it just eliminates nodes where the pod wouldn't be allowed to run, calculates a score for each remaining node, and then randomly selects one of the nodes with the highest score, so it's trivially parallelizable.
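
To make that concrete, here's a toy Go sketch of the filter-then-score shape (simplified stand-in types, not the real kube-scheduler plugin API; the actual scheduler scores through a pipeline of plugins and breaks ties among the top-scoring nodes at random):

    package main

    import (
        "fmt"
        "math/rand"
    )

    // Simplified stand-ins; the real scheduler uses framework.NodeInfo
    // and a configurable pipeline of filter/score plugins.
    type Node struct {
        Name         string
        FreeCPUMilli int64
        FreeMemBytes int64
    }

    type Pod struct {
        CPUMilli int64
        MemBytes int64
    }

    // Filter: drop nodes where the pod can't run at all.
    func feasible(n Node, p Pod) bool {
        return n.FreeCPUMilli >= p.CPUMilli && n.FreeMemBytes >= p.MemBytes
    }

    // Score: a toy "most free resources left" heuristic; higher is better.
    func score(n Node, p Pod) int64 {
        return (n.FreeCPUMilli - p.CPUMilli) + (n.FreeMemBytes-p.MemBytes)/(1<<20)
    }

    func schedule(nodes []Node, p Pod) (string, bool) {
        best, bestScore := []string{}, int64(-1)
        for _, n := range nodes { // each iteration is independent: parallelizable
            if !feasible(n, p) {
                continue
            }
            if s := score(n, p); s > bestScore {
                best, bestScore = []string{n.Name}, s
            } else if s == bestScore {
                best = append(best, n.Name)
            }
        }
        if len(best) == 0 {
            return "", false // unschedulable
        }
        // Random tie-break among the top-scoring nodes.
        return best[rand.Intn(len(best))], true
    }

    func main() {
        nodes := []Node{
            {"node-a", 4000, 8 << 30},
            {"node-b", 500, 16 << 30},
        }
        name, ok := schedule(nodes, Pod{CPUMilli: 1000, MemBytes: 2 << 30})
        fmt.Println(name, ok)
    }

This is greedy, one pod at a time, which is why it's fast but not globally optimal in the bin-packing sense.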


I think filtering and scoring fall under a heuristics-based approach to addressing NP-hardness?

Bin packing is a well-defined NP-hard problem: https://en.wikipedia.org/wiki/Bin_packing_problem


That's greedy


The k8s scheduler lets you tweak how many nodes to look at when scheduling a pod (the percentageOfNodesToScore setting), so you can change how big "global state" is according to the scheduling algorithm.


The blog says they require 13,000 queries per second just to update lease objects, not that 13,000 is the total across all queries. I don't know why they cite that instead of the total, but etcd's own performance testing indicates it can handle at least 50,000 writes per second and 180,000 reads: https://etcd.io/docs/v3.6/op-guide/performance/. So, without them saying what the real number is, I'm going to guess their reads and writes outside of lease updates are much larger than those numbers.
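
For context, a node heartbeat is just a small write to a coordination.k8s.io Lease object in the kube-node-lease namespace. Here's a hedged client-go sketch of what each kubelet effectively does (node name and kubeconfig path are made up; with the default ~10s renew interval, ~130k nodes works out to roughly that 13,000 writes/s):

    package main

    import (
        "context"
        "log"
        "time"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        // Build a client from a local kubeconfig (path is illustrative).
        cfg, err := clientcmd.BuildConfigFromFlags("", "/home/user/.kube/config")
        if err != nil {
            log.Fatal(err)
        }
        cs, err := kubernetes.NewForConfig(cfg)
        if err != nil {
            log.Fatal(err)
        }

        ctx := context.Background()
        leases := cs.CoordinationV1().Leases("kube-node-lease")

        // Each node renews its lease roughly every 10s; the blog's
        // 13k QPS figure is this write, multiplied across the fleet.
        lease, err := leases.Get(ctx, "node-1", metav1.GetOptions{})
        if err != nil {
            log.Fatal(err)
        }
        now := metav1.NewMicroTime(time.Now())
        lease.Spec.RenewTime = &now
        if _, err := leases.Update(ctx, lease, metav1.UpdateOptions{}); err != nil {
            log.Fatal(err)
        }
    }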


Once you're past the fundamentals, if you find yourself interested in high-performance networking, I recommend looking into userspace networking and NIC device drivers. The Intel 82599ES has a freely available (and readable!) data sheet, DPDK has a great book, fd.io has published absolutely insane benchmarks, and ixy [1] has a wonderful paper and repo. It's a great way to go beyond the basics of networking and CPU performance. It's even more approachable today with XDP: you don't need to write device-specific code.

[1] https://github.com/emmericp/ixy


And on the other end of the advanced spectrum, any recommendations for learning datacenter networking? E.g. topics similar to those in Cisco's datacenter certifications: https://learningnetwork.cisco.com/s/ccie-data-center-exam-to...


Unfortunately I don’t have any recommendations to give, my experience starts and stops at application development. Though I would love to spend some time in a datacenter!

Potentially have a look at Infiniband and Clos/fat tree networks?

My more generic recommendation would be to explore Semantic Scholar for impactful/cited papers, look for some meta-analyses, and just dig through multiple layers of references till you hit the fundamentals (typically things published in the 80s, for a lot of CS topics).


Do you have any career advice for someone deeply interested in breaking into high-performance GPU programming? I find resources like these, and projects like OpenAI's Triton compiler or MIMD-on-GPU, so incredibly interesting.

But I have no idea who employs those skills! Beyond scientific HPC groups or ML research teams anyway - I doubt they’d accept someone without a PhD.

My current game plan is getting through "Professional CUDA C Programming" and various computer architecture textbooks, and seeing if that's enough.


Given that CUDA's main focus has been C++ since CUDA 3.0 (ignoring the other PTX sources for now), I'm not sure that 2014 book is the right approach to learning CUDA.


Can you elaborate a bit on how C++ affects the programming model? Isn't CUDA just a variant of C? I presume the goal is not to run standard C++? Also, as I understand it, PTX is an IR, so I'm not sure why it can be compared to C/C++.


Not at all, unless we are speaking of CUDA prior to version 3.0.

CUDA is a polyglot programming model for NVidia GPUs, with first-party support for C, C++, and Fortran, and for anything else that can target PTX bytecode.

PTX also allows many other languages with their own toolchains to target CUDA in some form, with .NET, Java, Haskell, Julia, and Python having some kind of NVidia-sponsored implementation.

https://developer.nvidia.com/language-solutions

While originally CUDA had its own hardware memory model, NVidia decided to make it follow C++11 memory semantics and went through a decade of hardware redesign to make it possible.

- CppCon 2017: Olivier Giroux, "Designing (New) C++ Hardware"

https://www.youtube.com/watch?v=86seb-iZCnI

- The CUDA C++ Standard Library

https://www.youtube.com/watch?v=g78qaeBrPl8

It is also driving many of the use cases in parallel programming for C++:

- Future of Standard and CUDA C++

https://www.youtube.com/watch?v=wtsnoUDFmWw

You will only find brief mentions of C here:

https://developer.nvidia.com/hpc-compilers

This is why OpenCL kind of lost the race: it focused too much on its C dialect, only going polyglot when it was too late for the research community to care.


I disagree: "req/res body cookies that's it" is not the underlying system. It's an abstraction over the TCP/IP and HTTP stack.

What happens when you need to disable Nagle's algorithm, avoid copies of the request body, use websockets, gRPC, etc.? You'd need to pray that the framework gives you an escape hatch.
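
For instance, in Go the Nagle toggle lives on the concrete *net.TCPConn rather than the net.Conn interface, so a framework has to hand you the raw connection before you can even reach it. A minimal sketch (target address is illustrative):

    package main

    import (
        "log"
        "net"
    )

    func main() {
        conn, err := net.Dial("tcp", "example.com:80")
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        // The Nagle toggle is on the concrete *net.TCPConn, not the
        // net.Conn interface, hence the type assertion.
        tcp, ok := conn.(*net.TCPConn)
        if !ok {
            log.Fatal("not a TCP connection")
        }
        // Note: Go disables Nagle by default; SetNoDelay(false)
        // re-enables it, SetNoDelay(true) turns it off again.
        if err := tcp.SetNoDelay(true); err != nil {
            log.Fatal(err)
        }
    }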


I might be doing it wrong, but I just approach those as new pipes to be put onto the system? They're, after all, basically different ways of transmitting data from point A to point B.

Why you would choose a framework that didn't provide an escape hatch for your use case is the real question, though.

TIL: Nagle's algorithm / TCP


Is there a way for me to subscribe to updates on this? I’d love that!


It's very early stage, so I expect I'll finish v1.0 in about half a year; I have to work on my MSc project as well. I'll post it on LinkedIn, so you can follow me there: https://www.linkedin.com/in/petr-kube%C5%A1-a9a48212a/


Can’t you just swipe down from the top of the screen and press the Bluetooth icon, or am I missing something?


That does not do what you think it does: that’s the button to disconnect from currently connected devices, not to turn off radios. Similarly, the airplane mode button above it will maintain Bluetooth.


Turning Airplane Mode on and then tapping the Bluetooth icon will fully disable Bluetooth (gray background, not white). You can also just disable it in Settings. Apple’s argument, as I understand it, is that Bluetooth and WiFi don’t actually consume much battery, but non-technical users would habitually disable them to save battery and then complain that Location Services didn’t work well. Hiding the setting but keeping an essentially useless toggle prevented those non-technical users from making their device function worse while still letting them feel like they were saving battery life.


Yeah, but they forgot about users who simply want to disable yet another tracking vector on their device.

To add insult to injury, when you automate WiFi and Bluetooth to turn off when you leave your home, you need to manually confirm the action via a notification every time.

Super annoying.


I created a Shortcut for that: essentially a button on my home screen I can press which actually disables Bluetooth and WiFi. And when my phone is connected to WiFi, mobile data gets disabled.

It's not perfect, but at least I don't have to open the settings app each time.


Why do you do that though?


If I turn off bluetooth and wifi I want them to be off.


You should share the design principles!


And make a generic framework /s


What translation is this? I love the writing!


They sell an API product, and documentation is an absolutely core, user-facing part of that product. It doesn't make sense not to own something that differentiates your product's value proposition.


Is that not accurate? There are only so many hours in the day, and I have to split them between competing priorities.

If I don’t have time for something, I don’t have the remaining hours to spend on it, given the time I’ll spend doing other things.


You have no choice but to spend all 24 hours of your day. You can't just skip over 7-9 AM and then have two hours in your back pocket for when you are short on time.

So no, you never just have time you're not using anyway. You're always using all of your time. On what is a matter of priorities.

However, if you would spend 19-21 watching a TV series, or twiddling your thumbs, you can choose to take that time to do something else.


"take" is weird wording too. it's your time, you already have it. it's just how you use it.

