On a quick read of this, the move seems to have a pretty considerable impact on me; while I appreciate the free service, it sounds like it's been a trap. I've had GSuite for a while, primarily using it to host a custom domain for my family's email. Effectively I view my use of these services as that of a free (Gmail) user, just with the added benefit of a custom domain. I'm not sure $6/month/user is really worth it just for a custom domain, but I don't really think I have a choice, as the accounts are also Google accounts with other services hanging off them (digital purchases, other subscriptions such as YouTube Premium, additional storage, etc.).
Either I continue to pay so as not to lose everything I've accumulated in the Google world on this account, or, after this experience, I migrate my whole family somewhere else and deal with whatever complexities that entails, cementing my final departure from Google's products, including finding alternatives for the other products I currently subscribe to from Google.
While it's nice to get paged and look at every 5xx error, it doesn't really scale all that well once you get past a certain point, particularly if your application is gracefully degrading.
That said, I love the wisdom in your comment: you find all sorts of super-rare bugs, or conditions that could seriously affect performance or availability if they become more common (which they often do). Past a certain point, I've found that an approach which works well is to encourage engineers/operators to drive by metrics and to pay close attention over time to p100s (the max), as you've suggested with your 500 errors. Lots of goodies can be hiding behind them, just like you've found with the 500s.
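To make the "drive by metrics" idea concrete, here's a minimal Ruby sketch (the latency numbers are made up) that computes a few percentiles over a batch of request timings. p100 is just the max, and a single outlier dominates it, which is exactly where the rare, nasty cases hide:

    # Nearest-rank percentile over an already-sorted array of latencies (ms).
    def percentile(sorted_samples, p)
      rank = ((p / 100.0) * sorted_samples.length).ceil - 1
      sorted_samples[[rank, 0].max]
    end

    # Illustrative sample: one slow outlier among otherwise healthy requests.
    latencies_ms = [12, 14, 15, 16, 18, 21, 25, 30, 45, 2300].sort

    { 50 => "p50", 95 => "p95", 99 => "p99", 100 => "p100 (max)" }.each do |p, label|
      puts "#{label}: #{percentile(latencies_ms, p)} ms"
    end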
Just a thought... if your storage layer supports taking a consistent snapshot of your file system, then you might be able to use that to get a backup.
You'd get a copy of your database that you'd need to run log-replay recovery on, but after that it should be all good.
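A rough sketch of what that could look like, assuming LVM-backed storage and a database (PostgreSQL-style WAL, for example) that does log-replay crash recovery on startup. The volume group, device names, and paths below are placeholders, not anything from the original comment:

    # Snapshot the data volume, copy the frozen image, then clean up.
    def run!(cmd)
      puts "+ #{cmd}"
      system(cmd) or raise "command failed: #{cmd}"
    end

    snapshot   = "/dev/vg0/dbsnap"
    mountpoint = "/mnt/dbsnap"
    backup_dir = "/backups/db-#{Time.now.strftime('%Y%m%d')}"

    run! "lvcreate --snapshot --size 5G --name dbsnap /dev/vg0/dbdata"
    begin
      run! "mount -o ro #{snapshot} #{mountpoint}"
      # Restoring this copy and starting the database triggers its normal
      # crash/log-replay recovery, as described above.
      run! "rsync -a #{mountpoint}/ #{backup_dir}/"
    ensure
      system("umount #{mountpoint}")   # best-effort cleanup
      run! "lvremove -f #{snapshot}"
    end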
Plenty of other machines have amazing IO these days. The notion that mainframes are much better at IO than other machines is a fairly old one now (it /was/ true once).
That said, they do deal with IO pretty well. They have dedicated offload processors for transferring data, which means that for the same (IO-intensive) workload, the CP utilisation of a zSeries machine would be much lower than that of an x86, pSeries (or other) machine. And as for the disk being a bottleneck... if you're pushing to a disk array with 192GB of write-back cache (pretty standard), then no, the disks are not a large concern.
But... at the end of the day, does this really stack up against the cost of Big Iron, or, for that matter, the additional licensing of the software on top of it? Not really, is my answer.
Try dyno-blitzer (https://github.com/pcapr/dyno-blitzer), which scales up dynos, runs a load test, and tells you how many dynos you'll need to support x concurrent users.
That's not what I mean. That will make it serve requests on concurrent threads. However, I'm fairly certain it isn't threadsafe.
Somewhere around Rails 3.0.6 I made a simple JRuby Rails app and a Sinatra app. They were both a thin REST layer over data in MongoDB (no AR). I did load testing; the Sinatra app ran fine, while the Rails app would fail on about 7% of requests with seemingly random stack traces. When I reran the tests with config.threadsafe! turned OFF, it never failed.
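For anyone wanting to reproduce this kind of test, a minimal sketch is to hammer one endpoint from many threads and count non-2xx responses. The URL and concurrency figures below are made up for illustration, not taken from the original test:

    require 'net/http'
    require 'uri'

    uri = URI("http://localhost:3000/widgets.json")
    thread_count = 20
    requests_per_thread = 100
    failures = Queue.new   # thread-safe collector

    workers = thread_count.times.map do
      Thread.new do
        requests_per_thread.times do
          begin
            res = Net::HTTP.get_response(uri)
            failures << res.code unless res.is_a?(Net::HTTPSuccess)
          rescue StandardError => e
            failures << e.class.name
          end
        end
      end
    end
    workers.each(&:join)

    total = thread_count * requests_per_thread
    puts format("%d/%d requests failed (%.1f%%)", failures.size, total, 100.0 * failures.size / total)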
Also, in Bob Lee's OSCON talk he mentioned some non-threadsafe code he'd found in AR.
I've had good experiences with several JRuby on Rails apps under Tomcat using Warbler, and they all seem to perform fine with config.threadsafe! enabled. You should probably investigate those stack traces; maybe they were caused by non-threadsafe gems or application code?
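For reference, in Rails 3.x the flag is a one-liner in the environment config; "MyApp" below is just a placeholder application name:

    # config/environments/production.rb (only the relevant line shown)
    MyApp::Application.configure do
      # Threaded mode: drops the global Rack::Lock middleware and disables
      # runtime dependency loading, so requests are served concurrently.
      # Everything the app touches (gems included) then has to be threadsafe.
      config.threadsafe!
    end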
Yeah, there are certainly a few more tricks to scaling an RDBMS than the author covered. Depending on what your demands are, there are different techniques and protocols that best suit you and can go a long way towards solving your problem. But with that in mind, there may be better ways to solve your problem, and we shouldn't forget about those.
If you start sharding you may get write gains, but you have to work very hard to keep things consistent (depending on your workload), and you may have to duplicate shards to make them highly available ($$$). Oh - and later down the track your schema might change in ways your sharding scheme is just not flexible enough to deal with, and depending on who you are that may be too much of a risk.
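To make the inflexibility point concrete, here's a toy sketch of the kind of sharding scheme being described; the shard names and the choice of key are purely illustrative:

    # Route each row to a shard by its user_id.
    SHARDS = %w[db-shard-0 db-shard-1 db-shard-2 db-shard-3]

    def shard_for(user_id)
      SHARDS[user_id % SHARDS.size]
    end

    shard_for(42)   # => "db-shard-2"
    # The catch: add a fifth shard, or decide to shard by account rather
    # than user, and existing rows no longer live where this function says
    # they should - that's the rebalancing/consistency work referred to above.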
Besides, the real issue is with availability, consistency, and performance. It is very hard to scale all three of these together, and even your cash-cow solutions will hit their limits (although some of those limits are quite high :))