Hibernate vs. Rails: The Persistence Showdown
And the wires were all abuzz about Rails...
Like a few other Java folks, such as Bruce Tate and David Geary, I have been taking a look at a new web framework, Rails. Of particular interest to me is its ORM (Object Relational Mapping) tool, ActiveRecord. Since choosing a technology always involves opportunity costs of some kind, I have written this article to compare and contrast it with another popular ORM tool, Hibernate. It summarizes what I've learned about Rails, mainly by stacking it up against Hibernate, a technology I'm very familiar with.
The Back Story
One of the hotter topics among developers of late has been a new framework, Ruby on Rails. Rails is an MVC web framework, conceptually similar to Struts or WebWork, but written in the scripting language Ruby instead of Java. Beyond just a web framework, it offers a number of integrated technologies, like built-in code generation and its own ORM (Object Relational Mapping) tool, ActiveRecord. Combining a number of tools into a single elegant, integrated framework has led to claims that you can develop "at least ten times faster with Rails than you could with a typical Java framework".
Now the claims and counterclaims might lead one to believe that Rails is either the reincarnated Savior itself or, alternatively, a foul creature from the 66th layer of the Abyss. Which one you believe probably depends on whether your favorite programming language starts with a J or an R.
What this isn't...
This is not a "Hibernate is l33t, Rails suxxor" article. Rails appears to have some interesting concepts, but I'm very suspicious when people start throwing around 10x productivity figures. After a first pass, the framework appears to be very fast out of the gate, especially if you are building a pure CRUD web front end for a database. However, it's hard to tell whether the development velocity remains that high later on.
Overview
Based on the research I've done so far, Rails seems very grounded in a single table/single object mindset. This makes it very well suited for simple models with a few associations. The assumptions it makes about the database are reasonable and simple to deal with for a greenfield project. Hibernate shines as the object model gets more complicated, or when working against existing databases. Having been around a while longer, it is more mature and has a lot more features. But let's get into the weeds and look at some of the differences. Here is a list of concepts and questions I hope to answer in this article.
- Basic Architecture Patterns - Hibernate and Rails each use completely different ORM patterns. What does that imply?
- Explicitness - Hibernate and Rails define mappings differently, Hibernate explicitly and Rails implicitly. What does that mean, and how important is it?
- Associations - Both support associations. What do they look like in each?
- Transitive Persistence Models - How does each tool deal with persisting objects?
- Query Languages - How do you find objects?
- Performance Tuning - What options do I have to tune them?
Basic Architecture Differences
The core difference between Rails' ActiveRecord and Hibernate is the architectural patterns on which the two are based. Rails, obviously, uses the Active Record pattern, whereas Hibernate uses the Data Mapper/Identity Map/Unit of Work patterns. Just knowing these two facts gives us some insight into potential differences.
Active Record Pattern
ActiveRecord is "an object that wraps a row in a database table or view, encapsulates database access and adds domain logic on that data"[Fowler, 2003]. This means the ActiveRecord has "class" methods for finding instances, and each instance is responsible for saving, updating and deleting itself in the database. It's pretty well suited for simpler domain models, those where the tables closely resemble the domain model. It is also generally simpler than the more powerful, but more complex, Data Mapper pattern.
Data Mapper Pattern
The Data Mapper is "a layer of mappers that moves data between objects and a database while keeping them independent of each other and the mapper itself"[Fowler, 2003]. It moves the responsibility for persistence out of the domain object, and generally uses an identity map to maintain the relationship between the domain objects and the database. In addition, a Data Mapper often uses (and Hibernate does use) a Unit of Work (the Session) to keep track of changed objects and make sure they are persisted correctly.
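To make the contrast between the two patterns concrete, here is a minimal plain-Ruby sketch. It uses neither Rails nor Hibernate: MINERS_TABLE, ActiveMiner, PlainMiner and MinerMapper are hypothetical names invented for this illustration, and an in-memory hash stands in for the database table.

```ruby
# In-memory stand-in for a "miners" table: id => row hash (an assumption
# for this sketch; no real database is involved).
MINERS_TABLE = {}

# Active Record style: the domain object finds and saves itself.
class ActiveMiner
  attr_accessor :id, :first_name

  def initialize(id, first_name)
    @id = id
    @first_name = first_name
  end

  # Class-level finder, as the pattern describes.
  def self.find(id)
    row = MINERS_TABLE[id]
    row && new(id, row[:first_name])
  end

  # Each instance is responsible for writing its own row.
  def save
    MINERS_TABLE[id] = { first_name: first_name }
    self
  end
end

# Data Mapper style: the domain object knows nothing about persistence...
PlainMiner = Struct.new(:id, :first_name)

# ...and a separate mapper moves data between objects and the table.
class MinerMapper
  def initialize(table)
    @table = table
    @identity_map = {} # id => object already loaded by this mapper
  end

  # Returns the same object for the same id (identity map).
  def find(id)
    @identity_map[id] ||= begin
      row = @table[id]
      row && PlainMiner.new(id, row[:first_name])
    end
  end

  def save(miner)
    @table[miner.id] = { first_name: miner.first_name }
    @identity_map[miner.id] = miner
  end
end

# Usage: both styles read the same row, but only the mapper hands back
# one shared object per id.
ActiveMiner.new(1, "Brom").save
mapper = MinerMapper.new(MINERS_TABLE)
same_object = mapper.find(1).equal?(mapper.find(1))              # true
fresh_each_time = !ActiveMiner.find(1).equal?(ActiveMiner.find(1)) # true
```

The point of the sketch is only where the persistence code lives: inside the domain class for Active Record, in a separate layer for Data Mapper.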
General Pattern Implications
So that covers the basic differences, the general implication of which should be fairly obvious. ActiveRecord (Rails) will likely be easier to understand and work with, but past a certain point more advanced or complex usages will likely be difficult or just not possible. The question is, of course, when or if many projects will cross this line. Let's look at some specifics of the two frameworks. To illustrate these differences, we will be using code from my "Project Deadwood" sample app. (Guess what I've been watching lately.) :)
The Value of Explicitness
So what is explicitness worth? One of the key "features" of Rails' ActiveRecord is that you don't need to specify the fields on the class; instead it dynamically determines its fields from the columns in the database. So suppose you had a table "miners" that looked like this.
    create table miners (
        id BIGINT NOT NULL AUTO_INCREMENT,
        first_name VARCHAR(255),
        last_name VARCHAR(255),
        primary key (id)
    )
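The corresponding Ruby class, with a sample usage, needs no field declarations at all.
    class Miner < ActiveRecord::Base
    end

    miner = Miner.new
    miner.first_name = "Brom"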
On the other hand, your Hibernate class (Miner.java), which specifies the fields, getters/setters and XDoclet tags, looks like so.
    package deadwood;

    /**
     * @hibernate.class table="miners"
     */
    public class Miner {
        private Long id;
        private String firstName;
        private String lastName;

        /**
         * @hibernate.id generator-class="native"
         */
        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }

        /**
         * @hibernate.property column="first_name"
         */
        public String getFirstName() { return firstName; }
        public void setFirstName(String firstName) { this.firstName = firstName; }

        /**
         * @hibernate.property column="last_name"
         */
        public String getLastName() { return lastName; }
        public void setLastName(String lastName) { this.lastName = lastName; }
    }

    // Sample usage
    Miner miner = new Miner();
    miner.setFirstName("Brom");
Now, as I have mentioned before, both Rails and Hibernate really only need the field names specified once. With Hibernate the details are in the code; with ActiveRecord they are in the database. So when you are browsing through your Ruby code and you see this...
    class GoldClaim < ActiveRecord::Base
    end
The question is, what fields does this object have? You have to fire up MySQL Front (or whatever) and look at the database schema. When the number of domain models is small, this isn't a big deal. But when your project has 40+ tables/50ish domain objects (like a Hibernate project I'm currently working on), this would seem pretty painful. Ultimately, this might come down to preference, but having the details in the code makes it easier to understand and alter.
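The following plain-Ruby sketch illustrates the trade-off. It is not how Rails is implemented: SCHEMA is a hypothetical stand-in for what the database would report, and SchemaBackedRecord and its table method are names invented here. The accessors are generated from the "schema" when the class is loaded, so nothing in the class body tells a reader which fields exist.

```ruby
# Hypothetical stand-in for the column lists a database would report.
SCHEMA = {
  "miners"      => %w[id first_name last_name],
  "gold_claims" => %w[id name square_area]
}

class SchemaBackedRecord
  # Subclasses call this once; the accessors come from SCHEMA, not from
  # anything written in the class body.
  def self.table(name)
    columns = SCHEMA.fetch(name)
    columns.each { |column| attr_accessor column }
    define_singleton_method(:column_names) { columns }
  end
end

class Miner < SchemaBackedRecord
  table "miners"
end

miner = Miner.new
miner.first_name = "Brom"   # works, although Miner declares no fields
Miner.column_names          # asking the "schema" is the only way to list them
```

With explicit mappings the same information sits in the source file, at the cost of writing it out by hand.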
Associations
In the last section, the Miner class we looked at was single-table oriented, mapping to a single miners table. ORM solutions support ways to map associated tables to in-memory objects, and Hibernate and Rails are no different. Both handle most of the basic mapping strategies. Here's a non-exhaustive list of associations supported by both of them, with the corresponding Hibernate - Rails naming conventions where appropriate.
- Many to One/One to one - belongs_to/has_one
- One to Many (set) - has_many
- Many to Many (set) - has_and_belongs_to_many
- Single Table Inheritance
- Components (mapping > 1 object per table)
As a comparative example, let's look at the many-to-one relationship. We are going to expand the Deadwood example from the previous section by adding to the Miner a many-to-one association with a GoldClaim object. This means there is a foreign key, gold_claim_id, in the miners table, which links it to a row in the gold_claims table.
(Java)
    public class Miner {
        // Other fields/methods omitted
        private GoldClaim goldClaim;

        /**
         * @hibernate.many-to-one column="gold_claim_id"
         * cascade="save"
         */
        public GoldClaim getGoldClaim() { return goldClaim; }
        public void setGoldClaim(GoldClaim goldClaim) {
            this.goldClaim = goldClaim;
        }
    }
(Rails)
    class Miner < ActiveRecord::Base
      belongs_to :gold_claim
    end
Not much real difference here; both do functionally the same thing. Hibernate uses explicit mapping to specify the foreign key column, as well as the cascade behavior, which we will talk about next. Saving a Miner will save its associated GoldClaim, but updates and deletes to the Miner won't affect the associated object.
Transitive Persistence
Non-demo applications tend to work with big graphs of connected objects. It's important for an ORM solution to provide a way to detect and cascade changes from in-memory objects to the database, without the need to manually save() each one. Hibernate features a flexible and powerful version of this via declarative cascading persistence. Rails seems to offer a limited version of this, based on the type of association. For example, Rails seems to emulate Hibernate's cascade="save" behavior for the belongs_to association by default, as the following Rails listing demonstrates...
    miner = Miner.new("name" => "Brom Garrott")
    miner.gold_claim = GoldClaim.new("name" => "Western Slope")
    miner.save     # This saves both the Miner and GoldClaim objects
    miner.destroy  # Deletes only the miner row from the database
Deleting is where the two differ: with cascade="all" on the association, Hibernate deletes the GoldClaim along with its Miner.
    Miner miner = new Miner();
    miner.setGoldClaim(new GoldClaim());
    session.save(miner);   // Saves Miner and GoldClaim objects.
    session.delete(miner); // Deletes both of them.
Now consider updating a GoldClaim through its Miner in Rails.
    miner = Miner.find(@params['id'])
    miner.gold_claim.name = "Eastern Slope"
    miner.save
This does not update the gold_claim.name. From the opposite direction (has_one), this does work...
    class GoldClaim < ActiveRecord::Base
      has_one :miner
    end

    claim = GoldClaim.find(@params['id'])
    claim.miner.name = "Seth Bullock"
    claim.save  # Saves the miner's name
By using cascade="save-update", you could get this behavior on any association in Hibernate, regardless of which table the foreign key lives in. Hibernate doesn't base the transitive persistence behavior on the relationship type, but rather on the cascade style, which is much more fine-grained and powerful. Next, let's look at how each framework finds the objects we have persisted.
Query Languages
While the two frameworks have shown a number of similarities up to this point, when it comes to query languages their capabilities and usage start to differ rapidly. Rails essentially uses SQL, the well-known standard for getting data in and out of databases. In addition, via dynamic finder methods, it has what I think of as its own 'mini' language, which lets developers write simplified queries by basically inventing methods. But ultimately everything is expressed in terms of tables and columns.
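The "inventing methods" idea can be sketched in plain Ruby with method_missing. This is an illustration of the technique only, not Rails' actual implementation: Row and FinderSketch are hypothetical names, the rows live in an in-memory array instead of a table, and the sketch filters in Ruby and returns every match rather than generating SQL.

```ruby
# Hypothetical in-memory stand-in for rows of a "miners" table.
Row = Struct.new(:first_name, :last_name, :age)

class FinderSketch
  def initialize(rows)
    @rows = rows
  end

  # Turn a call like find_by_a_and_b(x, y) into a filter on a and b.
  def method_missing(name, *args)
    match = name.to_s.match(/\Afind_by_(.+)\z/)
    return super unless match

    attributes = match[1].split("_and_")
    @rows.select do |row|
      attributes.zip(args).all? { |attribute, value| row[attribute] == value }
    end
  end

  # Keep respond_to? consistent with the invented finder methods.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

miners = FinderSketch.new([
  Row.new("Elma", "Garrott", 28),
  Row.new("Brom", "Garrott", 35)
])

elmas    = miners.find_by_first_name_and_last_name("Elma", "Garrott")
garrotts = miners.find_by_last_name("Garrott")
```

Note that the method names are still spelled in terms of column names, which is the point made above: everything remains expressed in tables and columns.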
On the other hand, Hibernate has its own object-oriented query language (Hibernate Query Language - HQL), which is deliberately very similar to SQL. Where it differs is that it lets developers express their queries in terms of objects and fields instead of tables and columns. Hibernate translates the query into SQL optimized for your particular database. Obviously, inventing a new query language is a very substantial task, but its expressiveness and power are among Hibernate's selling points. Now let's take a look at some samples of each.
Rails Insta-Finders
For simple queries, like 'find by property x and y', Rails lets you add dynamic finder methods which it will translate into SQL for you. Suppose, for example, that we want to find Miners based on first name and last name. You would write something like this.
    @miners = Miner.find_by_first_name_and_last_name("Elma", "Garrott")
For anything more involved, you supply the where clause, or the complete SQL statement, yourself.
    # Returns only the first record
    @miner = Miner.find_first("first_name = ?", "Elma")

    # Finds up to 10 miners older than 30, ordered by age.
    @miners = Miner.find_all ["age > ?", 30], "age ASC", 10

    # Like find all, but need complete SQL
    @minersWithSqA = Miner.find_by_sql [
      "SELECT m.*, g.square_area FROM gold_claims g, miners m " +
      " WHERE g.square_area = ? and m.gold_claim_id = g.id", 1000]
The big thing to realize is that since Rails classes have dynamic fields, all columns returned by the result set are smashed on the Miner object. In the last query, the Miner gets a square_area field that it doesn't normally get. This means the view might have to change, like so...
    # Normal association traversing
    <%= miner.gold_claim.square_area %>
    # Altered query for @minersWithSqA
    <%= miner.square_area %>
Querying for Objects with HQL
As mentioned before, being able to express queries in terms of objects and fields is really powerful. While simple queries are definitely easier with Rails, once you have to start navigating across objects, HQL can be a very convenient alternative to SQL. Let's take a look at our sample queries in HQL.
    // Find first Miner by name
    Query byName = session.createQuery("from Miner m where m.firstName = :name");
    byName.setParameter("name", "Elma");
    Miner m = (Miner) byName.setMaxResults(1).uniqueResult();

    // Finds up to 10 miners older than 30, ordered by age.
    Integer age = new Integer(30);
    Query byAge = session.createQuery(
        "from Miner m where m.age > :age order by m.age asc");
    List miners = byAge.setParameter("age", age).setMaxResults(10).list();

    // Similar to join query above, but no need to manually join
    Query byArea = session.createQuery(
        "from Miner m where m.goldClaim.squareArea = :area");
    List minersWithSqA = byArea.setParameter("area", new Integer(1000)).list();
Because these queries return ordinary Miner objects, the view traverses the association as usual.
${miner.goldClaim.squareArea} <%-- Traverse fields normally --%>
Having covered some of the basics of fetching objects, let's turn our attention to how we can make fetching objects fast. The next section covers the means by which we can tune performance.
Performance Tuning
Beyond just mapping objects to tables, robust ORM solutions need to provide ways to tune the performance of queries. One of the risks of working with ORMs is that you often pull back too much data from the database. This tends to happen because it is very easy to pull back several thousand rows, with multiple SQL queries, with a simple statement like "from Miner". Common ORM strategies for dealing with this include lazy fetching, outer join fetching and caching.
Rails is very very Lazy
What I mean by lazy is that when you fetch an object, the ORM tool doesn't fetch data from other tables until you request the association. This prevents loading too much unneeded data. Both Rails and Hibernate support lazy loading of associations, but Hibernate allows you to choose which associations are lazy. For example, here's how it works with Rails...
    @miner = Miner.find(1)      # select * from miners where id = 1
    @claim = @miner.gold_claim  # select * from gold_claims where id = 1
This leads us to one of the great fallacies of ORM: that lazy loading is always good. In reality, lazy loading is only good if you didn't need the data. Otherwise, you are doing with 2-1000+ queries what you could have done with one. This is the dreaded N+1 select problem, where getting all the objects requires N selects plus the 1 original select. The problem gets much worse when you deal with collections.
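The arithmetic behind N+1 is easy to see with a counter. The sketch below uses no ORM at all: CountingDb is a hypothetical stand-in for a database connection that simply counts round trips, and the miners and claims are made-up in-memory rows.

```ruby
# Hypothetical stand-in for a database connection that counts round trips.
class CountingDb
  attr_reader :queries

  def initialize(miners, claims)
    @miners = miners   # array of { id:, gold_claim_id: }
    @claims = claims   # hash of id => { id:, name: }
    @queries = 0
  end

  # "select * from miners"
  def all_miners
    @queries += 1
    @miners
  end

  # "select * from gold_claims where id = ?"
  def claim_by_id(id)
    @queries += 1
    @claims[id]
  end

  # A single joined query returning [miner, claim] pairs.
  def miners_with_claims
    @queries += 1
    @miners.map { |miner| [miner, @claims[miner[:gold_claim_id]]] }
  end
end

claims = {
  1 => { id: 1, name: "Western Slope" },
  2 => { id: 2, name: "Eastern Slope" }
}
miners = [
  { id: 1, gold_claim_id: 1 },
  { id: 2, gold_claim_id: 2 },
  { id: 3, gold_claim_id: 1 }
]

# Lazy traversal: one select for the miners, then one per miner.
lazy = CountingDb.new(miners, claims)
lazy.all_miners.each { |miner| lazy.claim_by_id(miner[:gold_claim_id]) }
lazy_queries = lazy.queries       # 4, i.e. 1 + N with N = 3

# The same data fetched in one round trip.
joined = CountingDb.new(miners, claims)
joined.miners_with_claims
joined_queries = joined.queries   # 1
```

With three miners the difference is four queries against one; with a few thousand rows the same loop issues a few thousand selects.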
Outer Joins and Explicit Fetching
Generally, one of the best ways to improve performance is to limit the number of trips to the database: better one big query than many small ones. Hibernate has a number of ways to handle the N+1 issue. Associations can be explicitly flagged for outer join fetching (via outer-join="true"), and you can add outer join fetching to HQL statements. For example...
    /**
     * @hibernate.many-to-one column="gold_claim_id"
     * cascade="save-update" outer-join="true"
     */
    public GoldClaim getGoldClaim() { return goldClaim; }

    // This does one select and fetches both the Miner and GoldClaim
    // and maps them correctly.
    Miner m = (Miner) session.load(Miner.class, new Long(1));
In addition, when selecting lists or dealing with collection associations, you can use an explicit outer join fetch, like so...
    // Issues a single select, instead of 1 + N (where N is the # miners)
    List list = session.find("from Miner m left join fetch m.goldClaim");
The performance savings from this can be very significant. On the other hand, Rails suffers badly from N+1 issues and has limited means to solve the problem, other than writing explicit SQL joins, referred to as piggy-back queries. The trouble is that because Rails maps all the fields onto the Miner object, you lose the association objects, meaning you need to alter your views and how you work with the domain model. The @minersWithSqA query we did above is an example of a piggy-back query. Such a query is also fairly complicated to write, particularly if there is more than one association to be fetched, and it gets progressively worse as you add more associations. In addition, all the additional fields come back as strings, losing their original types.
Caching
While object caching isn't always going to be helpful, let alone a performance silver bullet, Hibernate has a huge potential advantage here. It provides several levels of caching, including a session (Unit of Work) level cache as well as an optional second-level cache. You always use the first-level cache, as it prevents circular references and multiple trips to the database for the same object. Using a second-level cache can allow much of the database state to stay resident in memory. This is especially useful for frequently read data and reference data. Rails essentially has no options for caching at the database level (though it does support caching for the web tier).
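The two cache levels can be sketched in a few lines of plain Ruby. This is a simplified illustration of the idea rather than Hibernate's design: CountingStore and SessionSketch are hypothetical names, the "database" is a hash that counts how often it is hit, and the second-level cache is just a hash shared between sessions (the sketch stores the rows themselves and ignores invalidation entirely).

```ruby
# Hypothetical database stand-in that counts how often it is actually hit.
class CountingStore
  attr_reader :hits

  def initialize(rows)
    @rows = rows
    @hits = 0
  end

  def fetch(id)
    @hits += 1
    @rows[id]
  end
end

# A unit-of-work "session" with its own first-level cache and an optional
# second-level cache shared between sessions.
class SessionSketch
  def initialize(store, second_level = nil)
    @store = store
    @second_level = second_level
    @first_level = {}
  end

  def load(id)
    # First level: the same session never asks twice for the same id.
    return @first_level[id] if @first_level.key?(id)

    # Second level: another session may already have loaded the row.
    row = if @second_level&.key?(id)
            @second_level[id]
          else
            @store.fetch(id)
          end
    @second_level[id] = row if @second_level
    @first_level[id] = row
  end
end

store  = CountingStore.new({ 1 => { first_name: "Brom" } })
shared = {}

first = SessionSketch.new(store, shared)
first.load(1)
first.load(1)
after_first_session = store.hits    # 1: the second load came from the session

second = SessionSketch.new(store, shared)
second.load(1)
after_second_session = store.hits   # still 1: served by the shared cache

uncached = SessionSketch.new(store)
uncached.load(1)
after_uncached_session = store.hits # 2: no shared cache, so the store is hit
```

Reference data that rarely changes benefits most, since every new session after the first is served from memory.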
Conclusion
While this is by no means complete coverage, we have looked at some of the high-level differences between the two frameworks. Hopefully you now have a basic understanding of what the opportunity costs of either framework are. I have covered the basic architectural patterns which underlie both Rails and Hibernate, as well as how explicitness applies to both frameworks' basic persistent classes. For associations, there are quite a few more mappings that are possible with an ORM, but what we covered is the basics that most developers use.
Beyond the basics, Hibernate adds quite a few more mapping types (I documented a complete catalog of examples with over 20 mappings in Hibernate Quickly), including different inheritance strategies, custom user types, and maps of entity or simple types. It's simpler to specify and use simple associations in Rails than it is in Hibernate, but there is less you can do with them. If you stick to the simple cases, like single tables with a few associations, and you name your tables and columns right, Rails will likely do just fine, but for more complicated object models, Hibernate will be a better choice.
Rails and Hibernate are very different when it comes to query languages. While it's not possible to do an exhaustive comparison of their query languages here, generally selects for single objects/tables will be quicker and easier in Rails, and anything join-related is better suited to Hibernate. Rails uses SQL, which is familiar to most developers, while Hibernate offers HQL, an OO query language that developers will need to learn. In addition, Hibernate offers quite a few more tuning opportunities, providing the necessary ORM mechanisms like outer join fetching, configurable lazy fetching and second-level caching. This further supports the theory that Rails is likely suitable for smaller projects, but that its ORM layer lacks a number of the essential features that would allow it to scale up to larger projects.
References
- The Art of .war - Patrick's Weblog
- Hibernate - Project Homepage
- Ruby on Rails - Project Homepage
- Patterns of Enterprise Application Architecture - By Martin Fowler
- Getting Rolling with Rails - By Curt Hibbs
About the Author
Patrick Peak is co-author of Hibernate Quickly
Though he feels very weird about referring to himself in the third person, Patrick Peak is currently the Chief Technology Officer of Browsermedia, a Web Development/Design firm in Bethesda, MD. His focus is on using open source as a building block for rapid web software development.