
When it comes to distributed services, unwrap is likely the behavior you do want, because the service is run by a scheduler that detects failures, and a panic would be tied to a metric and page an SRE. My read of the situation is that they were overwhelmed by their dashboards, which made it harder to identify the root cause, but that is a common situation for these kinds of events. I'm pretty sure this exact sequence of events won't catch them out again, because they will adjust their observability so the cause is clearer should it recur.

For libraries and CLI tools, you don't want panics except for things that are broken beyond any recovery. A CLI tool should not panic directly; it should emit human-readable errors instead.
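
To make the CLI point concrete, here is a minimal sketch of returning an error and formatting it for a human rather than panicking. The port-parsing task, `CliError`, `parse_port`, and `exit_code_and_message` are all made up for illustration; only the standard library is used.

```rust
use std::fmt;

// Hypothetical error type for an illustrative CLI tool.
#[derive(Debug, PartialEq)]
enum CliError {
    MissingArgument,
    NotANumber(String),
}

impl fmt::Display for CliError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CliError::MissingArgument => write!(f, "expected one argument: a port number"),
            CliError::NotANumber(raw) => write!(f, "'{}' is not a valid port number", raw),
        }
    }
}

// Parse the first argument as a port, returning an error instead of panicking.
fn parse_port(args: &[String]) -> Result<u16, CliError> {
    let raw = args.first().ok_or(CliError::MissingArgument)?;
    raw.parse::<u16>()
        .map_err(|_| CliError::NotANumber(raw.clone()))
}

// Map the result to an exit code plus a readable message for the user.
fn exit_code_and_message(args: &[String]) -> (i32, String) {
    match parse_port(args) {
        Ok(port) => (0, format!("listening on port {}", port)),
        Err(err) => (2, format!("error: {}", err)),
    }
}
```

A real `main` would print the message to stderr and pass the code to `std::process::exit`. The user sees one readable line instead of a panic message and backtrace.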



Yeah. It's also weird that people online are fixating on unwrap, as if changing or fixing unwrap's semantics would have helped Cloudflare here.

The best way to think about unwrap is that it's like an assert. If an assert trips, the answer isn't to remove the assert and just hope for the best. Asserts are almost never the problem. The problem is whatever happened right before the assert, i.e., the buggy codepath that generated the erroneous state in the first place.
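
A rough sketch of the unwrap-as-assert comparison, using a toy lookup (`find_user` and the names in it are invented for this example). Both functions panic on exactly the same inputs; the second just spells the assertion out.

```rust
// Hypothetical lookup: returns None when the id is unknown.
fn find_user(id: u32) -> Option<&'static str> {
    match id {
        1 => Some("alice"),
        2 => Some("bob"),
        _ => None,
    }
}

// unwrap panics if the invariant "this id exists" does not hold.
fn name_with_unwrap(id: u32) -> &'static str {
    find_user(id).unwrap()
}

// The same check written as an explicit assertion.
fn name_with_assert(id: u32) -> &'static str {
    let found = find_user(id);
    assert!(found.is_some(), "user {} should exist", id);
    found.unwrap()
}
```

Deleting either check would not make an unknown id valid. The interesting bug is in whatever caller passed that id in the first place.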

In Cloudflare's case, the bug was that their code required a database query to return fewer than 200 results, but the database returned more than that. This isn't a problem with unwrap. It was two mutually incompatible pieces of behaviour colliding in code. I don't see what Rust has to do with this at all. Sloppy programmers can make a mess in any language. Rust is no exception.
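
For illustration only, here is a toy version of those two assumptions meeting. This is not Cloudflare's code: the 200 figure comes from the description above, and `ROW_LIMIT`, `TooManyRows`, `load_rows`, and `refresh` are invented names. The point is that the mismatch exists at the boundary whether the caller unwraps, panics, or falls back.

```rust
// Limit as described in the comment above; everything else is hypothetical.
const ROW_LIMIT: usize = 200;

#[derive(Debug, PartialEq)]
struct TooManyRows {
    got: usize,
    limit: usize,
}

// The consumer's assumption, checked where the data comes in:
// the query must return fewer than ROW_LIMIT rows.
fn load_rows(rows: Vec<String>) -> Result<Vec<String>, TooManyRows> {
    if rows.len() >= ROW_LIMIT {
        return Err(TooManyRows { got: rows.len(), limit: ROW_LIMIT });
    }
    Ok(rows)
}

// One possible policy: keep the previous good set when the new one is rejected.
// Calling load_rows(incoming).unwrap() instead would be the crash-and-page policy.
fn refresh(current: Vec<String>, incoming: Vec<String>) -> Vec<String> {
    match load_rows(incoming) {
        Ok(rows) => rows,
        Err(_) => current,
    }
}
```

Either policy can be reasonable depending on the service. Neither one fixes the producer that started returning more rows than the consumer was built for.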



