Rows duplication for certain NHibernate queries – workaround
At my company we are currently developing the web-application with quite complex business logic, which database access is almost 100% based on NHibernate. Having abstractions from NHibernate allowed us to design some pieces of its security model very nicely. However we’ve recently noticed that we are getting incorrect results from some queries (surprisingly only when using SQL Server and not SQLite).
Basically the problem are duplicated rows in the queries where duplication should not really be possible, i.e., the queries where it is explicitly avoided via DISTINCT projection. Moreover we observed that the problem affects only paged queries (starting from 2nd page on).
To understand the problem lets track it back to the database level. As we already found out that this needs to be somehow related to paged queries, lets investigate the underlying SQL for the following piece of NH code:
var results = session .CreateCriteria() .SetProjection(Projections.Distinct(Projections.ProjectionList() .Add(Projections.Property("Name"), "Name"))) .SetResultTransformer(Transformers.AliasToBean()) .SetFirstResult(2).SetMaxResults(2) .List();
The SQL generated for SQL Server for the above:
SELECT TOP 2 y0_ FROM (SELECT distinct this_.Name as y0_, ROW_NUMBER() OVER(ORDER BY CURRENT_TIMESTAMP) as __hibernate_sort_row FROM Test this_) as query WHERE query.__hibernate_sort_row > 2 ORDER BY query.__hibernate_sort_row
Try running the above SQL example against simple Test table with Id and Name columns, where some names are duplicated like this:
You will get the following result:
Instead of the expected one, which is:
Why does it happen? Well DISTINCT applies to the entire column set and ROW_NUMBER() is just another column on this projection. This ultimately makes each row from the original table unique. How to correct that? Obviously we need to get rid of one of those operators from the projection. I decided to go with a subquery here (any other ideas?):
SELECT TOP 2 y0_ FROM (SELECT *, ROW_NUMBER() OVER(ORDER BY CURRENT_TIMESTAMP) as __hibernate_sort_row FROM (SELECT distinct this_.Name as y0_ FROM Test this_) as q_) as query WHERE query.__hibernate_sort_row > 2 ORDER BY query.__hibernate_sort_row
You can check that the above SQL provides the correct results.
It is a little more challenging to fix this in NHibernate itself esp. I have never done any change to it before. On the side note, while NHibernate is great for many things, sometimes it is just PITA – my colleague spent a whole day on optimizing NHibernate query, while if it would be in SQL he would have just changed one line!
Initially our team looked for the patch on the NHibernate bug tracker. Although we found the open issue for this it was not very helpful (no patch/solution included). As the major demo for system target users was soon I decided to try fixing it on my own.
As paging is quite specific to the target database (as I noted before i.e. SQLite is not affected) the place to modify would be SQL Server dialect component. We are using SQL Server 2008 but its dialect is merely overriding some minor methods from MsSql2005Dialect class, where we should really be looking at. The method to be tweaked is GetLimitString. It gets the SQL statement produced so far and rewrites it a little bit to include paging piece (for SQLite it just appends LIMIT … OFFSET statement, which in this case is independent from projection and hence it works well).
The change to the single class file is sufficient and quite simple indeed. Except from magic I had to include to substitute column names with aliases (I must admit this is part I am not particularly proud of but in the end it proved to be working). As this is entry is getting a little bit long instead pasting this code I attach patch you can use to fix the current stable NHibernate version (which is, as the time of writing, 2.1.2 GA). The patch is also submitted to NHibernate JIRA alongside with failing test I was asked to provide (you can read on NHibernate bugs submission policy here). I hope that we will get the official fix for this soon, and till then that this post saves a day for someone.
Update – unfortunatelly it seems that the issue won’t be fix in 2.x NHibernate branch. Fortunately David (whose comments you can see below) found cleaner solution to the problem. It still uses the corrected dialect I created. However in his solution custom dialect extension point is being leveraged. As this does not require NHibernate custom rebuild you should consider this approach superior. Please refer to David’s blog post about this for further details.
Duplicate rows NHibernate issue work around patch.