Rows duplication for certain NHibernate queries – workaround
At my company we are currently developing the web-application with quite complex business logic, which database access is almost 100% based on NHibernate. Having abstractions from NHibernate allowed us to design some pieces of its security model very nicely. However we’ve recently noticed that we are getting incorrect results from some queries (surprisingly only when using SQL Server and not SQLite).
Basically the problem are duplicated rows in the queries where duplication should not really be possible, i.e., the queries where it is explicitly avoided via DISTINCT projection. Moreover we observed that the problem affects only paged queries (starting from 2nd page on).
To understand the problem lets track it back to the database level. As we already found out that this needs to be somehow related to paged queries, lets investigate the underlying SQL for the following piece of NH code:
var results = session
.CreateCriteria()
.SetProjection(Projections.Distinct(Projections.ProjectionList()
.Add(Projections.Property("Name"), "Name")))
.SetResultTransformer(Transformers.AliasToBean())
.SetFirstResult(2).SetMaxResults(2)
.List();
The SQL generated for SQL Server for the above:
SELECT TOP 2 y0_ FROM (SELECT distinct this_.Name as y0_, ROW_NUMBER() OVER(ORDER BY CURRENT_TIMESTAMP) as __hibernate_sort_row FROM Test this_) as query WHERE query.__hibernate_sort_row > 2 ORDER BY query.__hibernate_sort_row
Try running the above SQL example against simple Test table with Id and Name columns, where some names are duplicated like this:

You will get the following result:

Instead of the expected one, which is:

Why does it happen? Well DISTINCT applies to the entire column set and ROW_NUMBER() is just another column on this projection. This ultimately makes each row from the original table unique. How to correct that? Obviously we need to get rid of one of those operators from the projection. I decided to go with a subquery here (any other ideas?):
SELECT TOP 2 y0_ FROM (SELECT *, ROW_NUMBER() OVER(ORDER BY CURRENT_TIMESTAMP) as __hibernate_sort_row FROM (SELECT distinct this_.Name as y0_ FROM Test this_) as q_) as query WHERE query.__hibernate_sort_row > 2 ORDER BY query.__hibernate_sort_row
You can check that the above SQL provides the correct results.
It is a little more challenging to fix this in NHibernate itself esp. I have never done any change to it before. On the side note, while NHibernate is great for many things, sometimes it is just PITA – my colleague spent a whole day on optimizing NHibernate query, while if it would be in SQL he would have just changed one line!
Initially our team looked for the patch on the NHibernate bug tracker. Although we found the open issue for this it was not very helpful (no patch/solution included). As the major demo for system target users was soon I decided to try fixing it on my own.
As paging is quite specific to the target database (as I noted before i.e. SQLite is not affected) the place to modify would be SQL Server dialect component. We are using SQL Server 2008 but its dialect is merely overriding some minor methods from MsSql2005Dialect class, where we should really be looking at. The method to be tweaked is GetLimitString. It gets the SQL statement produced so far and rewrites it a little bit to include paging piece (for SQLite it just appends LIMIT … OFFSET statement, which in this case is independent from projection and hence it works well).
The change to the single class file is sufficient and quite simple indeed. Except from magic I had to include to substitute column names with aliases (I must admit this is part I am not particularly proud of but in the end it proved to be working). As this is entry is getting a little bit long instead pasting this code I attach patch you can use to fix the current stable NHibernate version (which is, as the time of writing, 2.1.2 GA). The patch is also submitted to NHibernate JIRA alongside with failing test I was asked to provide (you can read on NHibernate bugs submission policy here). I hope that we will get the official fix for this soon, and till then that this post saves a day for someone.
Update – unfortunatelly it seems that the issue won’t be fix in 2.x NHibernate branch. Fortunately David (whose comments you can see below) found cleaner solution to the problem. It still uses the corrected dialect I created. However in his solution custom dialect extension point is being leveraged. As this does not require NHibernate custom rebuild you should consider this approach superior. Please refer to David’s blog post about this for further details.
Attached Files:
NH-2214Duplicate rows NHibernate issue work around patch.
Great job, Marcin.
Ran into the same problem, and your solution saved me a lot of time.
Thanks.
I just signed up to your blogs rss feed. Will you post more on this subject?
The patch has been submitted and I have recently got an update that this unfortunately won’t be fixed in 2.x version. The problem will be fixed in 3.0 though. To be honest I am not particularly happy about this. Details of the discussion can be found on JIRA (link can be found in the post above).
hi Marcin,
I have just come across this problem, and i guess it effects a lot more of my applicatin that I thought. Basically any “slighlty” complex data grid of mine has duplicates!
You mention something about using a detached criteria as a work around? would you have an example of how to achieve that?
David,
do you use distinct projection in your paged queries? If yes then the problem I described fits and the only solution (up to my knowledge) is to fix Dialect on your own, i.e., you need to check out sources of NH version you are using, correcting dialect using patch I provided and building your “customized” NHibernate.dll.
If you do not use distinct together with paging then your problem may be related to joins. See the following Stack Overflow article for the solution: http://stackoverflow.com/questions/318157/get-distinct-result-set-from-nhibernate-using-criteria-api
Thanks Marcin,
Yes, I use Projections.Distinct(Projections.Property(“some.unique.Id”))
why haven’t they accepted your patch at nhibernate yet? its blatently a bug! Do you know if they have any suggested work arounds, if the don’t want to accept your patch?
To be honest I don’t get the reasonable NH dev provided but basically he said 2.x version is not being developed anymore and hence no more patches/changes are going to be applied there. I believe the suggestion is to use 3.x. I am not sure if this is already fixed there though but from the ticket communication between me, Ayende and Fabio (and the fact ticket is still open) I suspect it may not yet be. My main concern though is that I don’t want/can use non production ready version in my production systems…
Just to say thanks, for your help. I got things working. Might write a post about it if I get time, this is such asimple problem to fix.
I could not rebuild me own version of nhibernate, as it other projects had dependancies on it, so I just created a custom SQL dialect, based on your solution, so in my fluent config I have something like
…
Fluently.Configure()
.Database(MsSqlConfiguration
.MsSql2008
.Dialect()
…
…
Thanks again
Brilliant idea to write custom dialect! Please let me know about you blog post once it is written. Thanks.
Hi Marcin,
The post is here: http://www.webdevbros.net/
Marcin,
Is the MsSql2005Dialect.cs file submitted to nhibernate your “patched” version, or the original? I assume its the patched version, but wanted to make sure as I’ve encountered another issue that I’m not yet sure is related.
Thanks in advance and great find/fix!
Marc
Yes the patch as well as unit test reproducing the error were submitted to NHibernate maintainers. I don’t believe 2.x branch was updated though. I am not sure about 3.x as we are not using it yet.
I would suggest to follow instructions provided on David’s blog as they are less invasive and do not require you to create custom NHibernate build. If customized dialect resolves the problem you encounter this means you are using unpatched version.
It’s a nightmare trying to use this. It would be so much better if you’d put a unified diff patch onto the JIRA issue and done so without swapping tabs/spaces.
Richb, I would suggest to go to the David’s blog post I linked above. He took much cleaner approach of implementing custom dialect. I believe he provides copy of the dialect for download.
Hey guys,
I actually got the same original issue, but I’ve realized that I could actually switch the Distinct projection to a GroupBy projection instead. It also gets me the results that I wanted, but the ROW_NUMBER doesn’t get mixed with the group by clause, so the pagination thing works.
This solution might help someone eventually.
Great job with the new dialect tough! I guess this makes things easier, but I didn’t want to change the dialect to a specific implementation, so I’ll just stick with the GroupBy solution.