
I'm an SRE and ran into this recently. To guard against DDoS, the kernel has a buffer setting (a few settings, actually) that limits the number of pings it will answer. So if a group of machines all ping a single destination at once, it's very possible that some of them won't get a reply.
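
For anyone who wants to look at these knobs on their own box, here is a rough sketch. The comment above doesn't say which settings were involved, so this is an assumption: it reads the ICMP rate-limit sysctls documented for Linux IPv4 (`icmp_ratelimit`, `icmp_ratemask`, `icmp_msgs_per_sec`, `icmp_msgs_burst`) and reports whether echo replies (ICMP type 0) are included in the rate mask. The function names are made up for illustration.

```python
from pathlib import Path

# Documented Linux IPv4 ICMP rate-limit sysctls. Assumption: the thread does
# not say which settings were actually responsible for the dropped pings.
ICMP_SYSCTLS = (
    "icmp_ratelimit",
    "icmp_ratemask",
    "icmp_msgs_per_sec",
    "icmp_msgs_burst",
)

ICMP_ECHO_REPLY = 0  # ICMP type number of an echo reply


def read_icmp_limits(base="/proc/sys/net/ipv4"):
    """Return {name: int} for every listed sysctl that can be read under base."""
    limits = {}
    for name in ICMP_SYSCTLS:
        try:
            limits[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Missing on this kernel, unreadable, or not a plain integer.
            continue
    return limits


def echo_reply_in_ratemask(limits):
    """True if the echo-reply bit (bit 0) is set in icmp_ratemask."""
    mask = limits.get("icmp_ratemask")
    if mask is None:
        return False
    return bool(mask & (1 << ICMP_ECHO_REPLY))


if __name__ == "__main__":
    current = read_icmp_limits()
    for name, value in current.items():
        print(f"{name} = {value}")
    print("echo replies covered by rate mask:", echo_reply_in_ratemask(current))
```

If the mask does cover echo replies, a burst of simultaneous pings from many hosts could plausibly exceed the limit, which would match the symptom described.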


It's for reasons like this that ping is one of the worst protocols to use for aliveness.

Even worse, I've had completely dead Linux boxes that would gladly respond to ping and nothing else.
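
A minimal sketch of the alternative, assuming the service you care about answers on a TCP port: instead of pinging, connect, send a small probe, and only call the host alive if the application actually sends something back. The name `service_alive`, the `PING` probe, and the retry defaults are hypothetical; a real check would speak whatever protocol the service uses.

```python
import socket
import time


def service_alive(host, port, probe=b"PING\n", timeout=2.0, retries=3, delay=0.5):
    """Return True if the service at host:port answers the probe with any data.

    A bare TCP connect can succeed while the application is hung, because the
    kernel completes the handshake. Requiring a reply checks the process too.
    """
    for attempt in range(retries):
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                conn.sendall(probe)
                if conn.recv(1):  # empty bytes means the peer closed on us
                    return True
        except OSError:
            # Refused, unreachable, or timed out; fall through and retry.
            pass
        if attempt < retries - 1:
            time.sleep(delay)
    return False
```

Usage would be something like `service_alive("chopper.example", 22, probe=b"")` for a service that sends a banner on connect (the hostname is a placeholder). The retries are there for the same reason as with ping: one lost packet shouldn't page anyone.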


Oh, that's nasty. How long did it take you to troubleshoot that?


Relatively speaking, it wasn't that bad. It took a few weeks of getting trouble tickets with no root cause, and a bit of googling. But management wasn't okay with fixing the root cause; instead, they just increased the timeout/retry window.


Wow. That's a classic. We were quite motivated because we were the ones who got the automated alerts. I still see them in my nightmares: "chopper is down". The machine was called chopper. I'll never forget it, even though it's been close to 30 years. My buddy Jasper and I spent multiple nights trying to track it down, and when we finally found it we still couldn't believe that that was it. But a simple swap was proof.


Did someone yell for you to "Get to the choppa! Do it! Now!!"?? Please say that's not been wasted!


I think we were past the point of humor during that particular episode but there was a reason it was named like it was.


if it wasn't an Arnie reference, could it have been a Stand By Me reference, "Chopper, sick balls"? or Eric Bana's Chopper: "if you keep stabbing me, I'm gonna die"?

clearly, i'm the type where everything is a movie reference, or it's a missed opportunity


whose chopper is this?


It's Zed's



