It's a brand new newsletter from two community members who also happen to be the authors of two O'Reilly monitoring books, the creator of Monitorama, and a long-time maintainer of the Graphite project. You don't have to take my word for it, but we're going to do our best to release a top-notch collection of monitoring-related stories. We don't have an archive yet, but we intend to add one.
What? I'm not saying they are jerks. I'm pointing out that it's harder for existing projects to track changes in the marketplace, and that it's hard for a new open source project to launch and build up enough of a community to be viable.
Except that while Graphite is a "real web application", it's generally internal-facing only. As such, I never looked at it as needing a true "production-ready" RDBMS with real concurrency. And to be honest, we got by just fine with it in production at Heroku and GitHub, at heavy volume, running on a single server with SQLite. It never bit us, and all things considered, it was a reasonable choice.
That said, knowing what I know now, I would've never done that. ;-)
I disagree precisely because it's internal-facing lol. If the software is capable of running on a true production-ready DBMS, there's no reason it can't be done already.
If SQLite is cheap to use because it's a single file, well, a sudo apt-get install postgresql and then a few commands to set up the database and user password won't take you longer than 10 minutes. That's what I'm arguing. There's no reason to use SQLite for a write-heavy application such as logging. And I think we can now agree that's indeed true. I also think we should encourage people to use a production-ready DBMS even during dev and testing, because that's the one you're going to use in production :)
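For what it's worth, the "few commands" I have in mind look roughly like this (a sketch for Debian/Ubuntu; the `graphite` role and database names are just placeholders, not anything official):

```shell
# Install PostgreSQL from the distro repos (package name is lowercase)
sudo apt-get install postgresql

# Create a dedicated role (prompts for a password) and a database it owns.
# "graphite" is a placeholder name for whatever app you're pointing at it.
sudo -u postgres createuser --pwprompt graphite
sudo -u postgres createdb --owner=graphite graphite

# Sanity check: connect as the new role
psql --host=localhost --username=graphite --dbname=graphite --command='SELECT 1;'
```

That really is about ten minutes, which is the whole point.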
Nobody ever said it /can't/ be done. I would never have argued against using a real RDBMS. But I've often been guilty of telling folks that SQLite was "good enough".
Installing PostgreSQL as you described is not "good enough". If you're going to bother with that, you need to understand the database well enough to tune it properly, set up backups, test restores, etc. There are more considerations than simply installing it from apt-get and then forgetting about it.
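To make that concrete, the bare minimum I mean by "backups and tested restores" is something like this (a sketch only; the database name, backup path, and scratch-database name are placeholders):

```shell
# Nightly logical backup in pg_dump's custom format, which pg_restore
# can selectively restore from. "graphite" is a placeholder db name.
pg_dump --format=custom --file=/var/backups/graphite.dump graphite

# Actually exercise the restore into a throwaway database --
# an untested backup is not a backup.
createdb graphite_restore_test
pg_restore --dbname=graphite_restore_test /var/backups/graphite.dump
dropdb graphite_restore_test
```

And that's before you've touched shared_buffers or any of the other tuning knobs, which is exactly why "apt-get and forget" isn't good enough.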
That's not at all a dumb comment. As I alluded to in the post, I think it's important that we understand how these systems determine what is - or isn't - an abnormality or fault. Unfortunately, that often means revealing their "secret sauce" and risking the exposure of their product differentiation. It's going to be interesting to see how these products earn our trust.
Absolutely - this is one of the reasons we open sourced Kale, so that people can see what we consider an anomaly and adapt it for their own use cases if needed. If your anomaly detection contains secret sauce, it'll be very hard for people to have confidence in it.
Yes, it certainly /can/ scale. In fact, a properly modularized monitoring solution /should/ be more capable of scaling, provided it was designed with that in mind. This is certainly the approach we've taken with our monitoring and trending components at Heroku. We don't have anywhere near 500K nodes, but the principles scale.
@josephruscio - I was >this< close to going back and giving a quick mention to Librato Metrics. You guys definitely "get" what I'm talking about. I like that you provide well-defined interfaces for easily getting data into and out of your application. You focus on trending and let other (better?) software handle the other stuff.
https://github.com/obfuscurity/synthesize/releases/tag/v3.0....