It's hard to tell how meaningful the reviews are. I have used AWS, GCP, DigitalOcean, and Linode throughout my career. Every single one of these, through no fault of my own or my team's, messed up and caused downtime. Like, you can get most SRE types in a room to laugh if you blurt out "us-east-1", because it's known to be so unreliable. And yet, it's where every Fortune 500 puts every service; we laugh about the reliability and it's literally powering the economy just fine.
So yes, a lot of people on HN complain about fly's reliability. fly posts to HN a lot and gives them the opportunity. Is it actually meaningful compared to the alternatives? It's very hard to tell.
First: this is 100% a "live by the sword, die by the sword" situation for us. We're as aware as anybody about our weird HN darling status (this is a post from two months ago, about an announcement from many months ago, that spent like 12 hours plastered to the front page; we have no idea why it hit today, and it actually stepped on another thing we wanted to post today so don't think we secretly orchestrated any of this!). We've allowed ourselves to be ultra-visible here, and threads like this are a natural consequence.
Moreover: a lot of this criticism is well warranted! I can cough up a litany of mitigating factors (the guy who stored his database in ephemeral instance storage instead of a volume, for instance), but I mean, come on. The single most highly upvoted and trafficked thing we've ever written was a post a year ago owning up to reliability issues on the platform. People have definitely had issues!
A fun cop-out answer here is to note all the times people compare us to AWS or Cloudflare, as if we were a hyperscaler public cloud. More fun still is to search HN for stories about us-east-1. We certainly do that to self-soothe internally! And: also? If your only consideration for picking a place to host an application is platform reliability? You're hosting on AWS anyways. But it's still a cop-out.
So I guess I'd sum all this up as: we've picked a hard problem to work on. Things are mathematically guaranteed to go wrong even if we're perfect, and we are not that. People should take criticisms of us on these threads seriously. We do. This is a tough crowd (the threads, if not the vote scores on our blog post) and there's value in that. Over the last year, and through this upcoming year, staffing for infra reliability has been the single biggest driver of hiring at Fly.io. I think that's the right call, and I think the fact that we occasionally get mauled on threads is part of what enabled us to make that call.
(Ordinarily I'd shut up about this stuff and let the thread die out by itself, but some dearly loved user of ours took a stand and said they'd never had any problems on us, which: you can imagine the "ohhhhh nooooooo" montage that took place in my brain when I read that someone had essentially dared the thread to come up with times when we'd sucked for some user, so I guess all bets are off. Go easy on Xe, though: they really are just an ultra-helpful uncynical person, and kind of walked into a buzzsaw here.)
I also don't know why HN is so upset about people willing to help out in the threads. The way I see it is, if you talk about your product on HN, inevitably someone will remember they have a support inquiry while HN is open, and ask it there instead of over email. Since employees are probably reading HN, they are naturally going to want to answer or say they escalated there. I don't think it's some sort of scam, just what any reasonable person would do.
It's become a YC cliche, that the way to get support for any issue is to get a complaint upvoted to the top of a thread. People used to talk about "Collison installs", which are real-use product demos that are so slick your company founder (in this case Stripe's 'pc) can just wander around installing your product for people to evangelize it; there should be another Collison term for decisively resolving customer support issues by having the founder drop into a thread, and I think that's the vibe people are reacting to here.
ok possibly not alone, maybe the issues happened before I started using them extensively. I've had ~no downtime that affects me in 7 months.
I do wish they had some features I need, but their support and responses are top notch. And I've lost much less hair and time than I would going full-blown AWS or another cloud provider.
To be fair, most hosting providers come with plenty of public complaints about downtime. The big ones do way better: the best one is AWS, then GCP, and last Azure. They cost stupid money though.
DigitalOcean has been terrible for me: some regions just go down every month and I lose thousands of requests, increasing my churn rate.
Fly.io had tons of weird issues but it got better in the last few months. It's still very incomplete in terms of functionality, and figuring out how to deploy the first time is a massive pain.
My plan is to add Hetzner and load balance with bunnycdn across DO and H
Actually, here is a good example: Cloudflare. Sure, people complain a ton about privacy, but I haven't seen a single complaint about the reliability of Cloudflare Workers or similar products in the dozens of threads I've seen on HN.
This is what I thought, until I once spent two days trying to publish a new, trivial code change to my Fly.io-hosted API: it just wouldn't update! And every time I tried to re-publish, it'd give me a slightly different error.
When it works, it's brilliant. The problem is that it hasn't worked too well in the last few months.