I've never really understood this argument. Yes, it's a disadvantage to "have no...

paulhauggis · on Dec 18, 2011

"You can even log the SQL queries it generates, and drop down to custom SQL queries if necessary. If you're paying attention to what's going on "underneath", what is the problem with using abstractions that remove some boilerplate?"

In most of the frameworks I've seen, to do so means to hack around the base framework. It's also much faster to just write the correct query in the first place rather than logging every query from the framework and trying to change them for each situation that requires optimization.

murz · on Dec 18, 2011

> In most of the frameworks I've seen, to do so means to hack around the base framework.

ActiveRecord [1][2] and Django [3][4] both support logging SQL and dropping down to custom SQL out of the box. I'm not sure which frameworks you had to hack around to accomplish that, but those are the two examples you used in your OP.

> It's also much faster to just write the correct query in the first place rather than logging every query from the framework and trying to change them for each situation that requires optimization.

In my experience, 90% or more of the generated queries are fine; there are just a few that need rewriting. And it's not like it takes a lot of cognitive overhead to glance at a console to see what queries are being executed while you're testing that new feature you just added. For me, it's certainly less overhead than writing all of them myself.

[1] http://weblog.jamisbuck.org/2007/1/8/watching-activerecord-d...

[2] http://api.rubyonrails.org/classes/ActiveRecord/Base.html#me...

[3] https://docs.djangoproject.com/en/dev/topics/logging/#django...

[4] https://docs.djangoproject.com/en/dev/topics/db/sql/#django....
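To make the "log everything, hand-write the few that matter" workflow concrete, here is a minimal sketch using Python's standard `sqlite3` module as a stand-in for an ORM's query log. It is not ActiveRecord or Django code; the `posts` table and the `top_posts` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Every statement SQLite runs is appended here, much like an ORM's SQL log.
executed = []
conn.set_trace_callback(executed.append)

conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO posts (title, score) VALUES (?, ?)",
    [("orm", 3), ("sql", 7), ("nosql", 5)],
)


def top_posts(conn, limit):
    """Hypothetical helper: the one query worth writing by hand."""
    sql = "SELECT title FROM posts ORDER BY score DESC LIMIT ?"
    return conn.execute(sql, (limit,)).fetchall()


titles = [row[0] for row in top_posts(conn, 2)]

# Glance at the log to see what actually hit the database.
for statement in executed:
    print(statement)
```

The point is only that watching the log and replacing one query are both cheap; the rest of the statements can stay generated.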

flomo · on Dec 18, 2011

Well, RoR is living on MySQL defaults turd planet and therefore thinks Foreign Keys are a DRY thing and not a database optimization. So I'm pretty sure anybody with more than a trivial Rails app is manually indexing those columns. Either that or they are spending way more on their database than they should be.
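A rough illustration of what manually indexing a foreign key column buys you, sketched with Python's `sqlite3` rather than MySQL or Rails. The `posts`/`comments` schema and the index name are assumptions, and other engines handle foreign key indexes differently; the exact plan wording also varies by SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES posts(id),  -- declared FK, no index yet
        body TEXT
    );
""")


def plan(conn, sql):
    """Return SQLite's query plan as one string (detail is the last column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT body FROM comments WHERE post_id = 1"

before = plan(conn, query)  # full table scan of comments
conn.execute("CREATE INDEX idx_comments_post_id ON comments (post_id)")
after = plan(conn, query)   # lookup through the new index

print(before)
print(after)
```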

einhverfr · on Dec 18, 2011

I think the fundamental problem is that Rails is based on active hostility towards database engineering. It's the same thing that MySQL did early on. It works very well in some cases but has a serious cost attached that few people want to talk about, namely the fact that things suddenly become problematic when you want to use the RDBMS to effectively share data between diverse applications. The thing is that once you make this tradeoff, I wonder whether it is really that helpful to stick with an RDBMS at all, or whether NoSQL is a better option.

dasil003 · on Dec 18, 2011

Good point; however, at least ActiveRecord doesn't make the mistake of trying to utterly hide SQL from the user. Since moving to Arel in Rails 3.0 things are very clean, logical, and not trying to do too much. It's not particularly hostile to serious SQL developers even if it doesn't have proper language-level tooling for advanced SQL.

I think it's still a clear win over NoSQL where a one-size-fits-all solution is simply impossible and would send Rails off into the weeds.

einhverfr · on Dec 18, 2011

But you have an inherent tradeoff in ORM-land. You can either essentially build your database as an object store around your ORM, or you can build your database and do real SQL coding. But at that point why use an ORM like Active Record at all?

masklinn · on Dec 18, 2011

> But at that point why use an ORM like Active Record at all?

Automatic type conversion and packaging of "rows" into hashes or objects, so you don't have to handle date parsing or integer conversions by hand and can call your utility methods or return your instances to whoever needs them.

That's basically all Rails's `ActiveRecord::Base.find_by_sql` or Django's `Manager.raw` (linked to by murz) do.

That's also the core/most basic layer of SQLAlchemy, and it's perfectly possible to use only that: SQLAlchemy has an Expression Language which is a direct translation of SQL into Python expressions[0] and an ORM built on top of that[1], and using just the expression language is perfectly OK if that's what you want.

[0] http://www.sqlalchemy.org/docs/core/tutorial.html

[1] http://www.sqlalchemy.org/docs/orm/tutorial.html
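As a sketch of that thin layer (hand-written SQL in, typed objects with methods out), here is a version using only Python's `sqlite3`. It is not the Rails, Django, or SQLAlchemy implementation; the `Signup` class, the `signups` table, and this `find_by_sql` function are hypothetical and only borrow the Rails method's name.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date


@dataclass
class Signup:
    user_id: int
    joined: date

    def is_veteran(self, today):
        """Utility method that callers get for free on every row."""
        return (today - self.joined).days >= 365


def find_by_sql(conn, sql, params=()):
    """Run raw SQL and package each row into a Signup with converted types."""
    rows = conn.execute(sql, params).fetchall()
    return [
        Signup(int(row["user_id"]), date.fromisoformat(row["joined"]))
        for row in rows
    ]


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be read by column name
conn.execute("CREATE TABLE signups (user_id INTEGER, joined TEXT)")
conn.execute("INSERT INTO signups VALUES (1, '2010-06-01'), (2, '2011-11-30')")

signups = find_by_sql(conn, "SELECT user_id, joined FROM signups ORDER BY user_id")
veterans = [s.user_id for s in signups if s.is_veteran(date(2011, 12, 18))]
```

The SQL stays entirely under your control; the layer only saves the date parsing and the hash-to-object shuffling.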

einhverfr · on Dec 18, 2011

Replying to Masklinn below:

> But at that point why use an ORM like Active Record at all?

> Automatic type conversion and packaging of "rows" into hashes or objects, so you don't have to handle date parsing or integer conversions by hand and can call your utility methods or return your instances to whoever needs them.

But this isn't really what I am getting at.

Once you go down the ORM route and that means sacrificing database engineering for use of the ORM (and thus largely making your database into a single application part of the stack), then what benefit do you get from using an RDBMS? Wouldn't NoSQL be better?

And if all you are getting is type conversion and hash<->relation conversion, why not use NoSQL?

The issue is a specific one about a design tradeoff. You can do real db engineering (as high normalization of data as possible based on inherent functional dependencies of the data itself) or you can build your db schema around the object model of your app. If you do the latter, I am not sure you have a compelling reason to use an RDBMS, and if you don't have a compelling reason to use an RDBMS, you don't have a compelling reason to use an ORM at all. After all if I use MongoDB and simply serialize my objects into JSON or the like, then I get the same benefit but with better performance and less opacity.

It seems to me that for an ORM to work the relational structure must be designed around what the ORM expects. Applying an ORM to something like a 5NF database designed for a very different application strikes me as fundamentally problematic.

Is it just that you have the ability to hopefully put in some ad hoc reporting functionality later? What is the benefit?

dasil003 · on Dec 19, 2011

Note: If you click the link button you can reply directly even to new posts.

> Once you go down the ORM route and that means sacrificing database engineering for use of the ORM

This is where you're wrong. This is an all-or-nothing argument that isn't based in reality. The reality is you make choices at every level, and an ORM like ActiveRecord doesn't do much to prevent you from using the full database feature set at your disposal. Granted, if you do everything with stored procedures and foreign keys then you won't be able to use all of ActiveRecord's features, but you really can do things any way you want.

> then what benefit do you get from using an RDBMS? Wouldn't NoSQL be better?

Now assuming you lean more in the ActiveRecord direction and don't do much heavy database engineering, why not use NoSQL? Well, take your pick from the usual reasons. RDBMS systems are mature, performant, extremely rich for ad-hoc querying, easy to administer, etc, etc. Just because ActiveRecord only explicitly uses 20% of the features of the typical RDBMS doesn't mean those features aren't extremely useful per se.

The way I look at it is that using an RDBMS for almost any new project is a smart hedge, because an RDBMS is the most flexible type of data store. It will allow you to pivot more easily compared to a NoSQL store where things may be extremely scalable in the dimension that you foresee, but if you need to pivot your data model you are in for a massive rewrite. People complain about adding columns and indices to SQL databases vs the "agility" of just adding keys to Mongo or whatever, but honestly it's just a little busy work most of the time; it doesn't really hurt until your data gets huge, at which point you are going to be running into unanticipated bottlenecks anyway.

einhverfr · on Dec 19, 2011

I guess the way I look at it is this:

The relational model excels at exactly one thing: presenting your data in multiple ways for multiple uses. As you start to sacrifice this, the RDBMS starts becoming less useful, fast. A good case in point is something like an LDAP server. Not really a good use case for relational database, though you can export the data into a relational system or even store your master copy there and push out changes. But here you have a protocol that lends itself reasonably well to integration anyway and ad hoc reporting really isn't likely to be an issue.

> The way I look at it is that using an RDBMS for almost any new project is a smart hedge, because an RDBMS is the most flexible type of data store. It will allow you to pivot more easily compared to a NoSQL store where things may be extremely scalable in the dimension that you foresee, but if you need to pivot your data model you are in for a massive rewrite. People complain about adding columns and indices to SQL databases vs the "agility" of just adding keys to Mongo or whatever, but honestly it's just a little busy work most of the time; it doesn't really hurt until your data gets huge, at which point you are going to be running into unanticipated bottlenecks anyway.

I am not sure the cost of this is as great as people suggest either; adding/dropping columns on large tables in PostgreSQL is fast. Rather, I think the fundamental issue is that a relational model requires more engineering to leverage than agile folks are usually comfortable with. I am referring of course to the N word (Normalization). The thing about normalization is that the goal is to organize your data based on functional dependencies (relational algebra-wise) within the data model, not functional dependencies (execution-wise) within the application. This means stopping and looking at the data and asking "which values are dependent on which values?" and normalizing that way. The thing is that this means larger numbers of joins and that tables are unlikely to represent the data model of the application directly.

The advantage is that you can then take your data and use it in a few other applications. The disadvantage is that it is very hard to do this in an agile way.
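A toy example of what I mean, sketched with Python's `sqlite3`. The schema is invented for illustration: the assumed dependencies are `customer_id -> city` and `order_id -> customer_id, total`, so the city is stored once per customer rather than once per order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- customer_id -> city: the city belongs to the customer, not to each order
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        city TEXT NOT NULL
    );
    -- order_id -> customer_id, total
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Oslo'), (2, 'Lima');
    INSERT INTO orders VALUES (10, 1, 40), (11, 1, 60), (12, 2, 25);
""")

# Application A (an order screen) needs a join to show the city per order.
order_rows = conn.execute("""
    SELECT o.order_id, c.city, o.total
    FROM orders o JOIN customers c USING (customer_id)
    ORDER BY o.order_id
""").fetchall()

# Application B (a report) reads the same tables a completely different way.
revenue_by_city = dict(conn.execute("""
    SELECT c.city, SUM(o.total)
    FROM orders o JOIN customers c USING (customer_id)
    GROUP BY c.city
""").fetchall())

# A customer moves: one row changes, and both applications see it.
conn.execute("UPDATE customers SET city = 'Bergen' WHERE customer_id = 1")
cities_after_move = {row[0] for row in conn.execute("""
    SELECT c.city FROM orders o JOIN customers c USING (customer_id)
    WHERE o.customer_id = 1
""")}
```

Neither table looks like the object either application would want to hold in memory, which is exactly the cost; the benefit is that neither application owns the schema.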

So it seems to me that where RDBMS's are concerned, the smart thing to do is put engineering effort into the database and then try to ensure that agile applications can be built on top of it.

> The way I look at it is that using an RDBMS for almost any new project is a smart hedge, because an RDBMS is the most flexible type of data store.

But this takes a willingness to ensure your data is modelled in the db in an application-neutral way (or at least an approximation of application neutrality).

dasil003 · on Dec 21, 2011

I would argue that you're painting "agile" with one brush, except I hate how it's become yet another meaningless buzzword wielded by incompetent management, so I won't.

rimantas · on Dec 18, 2011

Also, the next version of RoR is likely to have auto EXPLAIN for slow queries.