Home > Software > Rows duplication for certain NHibernate queries – workaround

Rows duplication for certain NHibernate queries – workaround

September 7th, 2010 Leave a comment Go to comments

At my company we are currently developing the web-application with quite complex business logic, which database access is almost 100% based on NHibernate. Having abstractions from NHibernate allowed us to design some pieces of its security model very nicely. However we’ve recently noticed that we are getting incorrect results from some queries (surprisingly only when using SQL Server and not SQLite).

Basically the problem are duplicated rows in the queries where duplication should not really be possible, i.e., the queries where it is explicitly avoided via DISTINCT projection. Moreover we observed that the problem affects only paged queries (starting from 2nd page on).

To understand the problem lets track it back to the database level. As we already found out that this needs to be somehow related to paged queries, lets investigate the underlying SQL for the following piece of NH code:

var results = session
    .CreateCriteria()
    .SetProjection(Projections.Distinct(Projections.ProjectionList()
    .Add(Projections.Property("Name"), "Name")))
    .SetResultTransformer(Transformers.AliasToBean())
    .SetFirstResult(2).SetMaxResults(2)
    .List();

The SQL generated for SQL Server for the above:

SELECT TOP 2 y0_
FROM (SELECT distinct this_.Name as y0_,
		ROW_NUMBER()
		OVER(ORDER BY CURRENT_TIMESTAMP) as __hibernate_sort_row
	FROM   Test this_) as query
WHERE    query.__hibernate_sort_row > 2
ORDER BY query.__hibernate_sort_row

Try running the above SQL example against simple Test table with Id and Name columns, where some names are duplicated like this:

Table with duplicated rows

You will get the following result:

Incorrect query result

Instead of the expected one, which is:

Correct query result

Why does it happen? Well DISTINCT applies to the entire column set and ROW_NUMBER() is just another column on this projection. This ultimately makes each row from the original table unique. How to correct that? Obviously we need to get rid of one of those operators from the projection. I decided to go with a subquery here (any other ideas?):

SELECT TOP 2 y0_
FROM (SELECT *,
			ROW_NUMBER()
			OVER(ORDER BY CURRENT_TIMESTAMP) as __hibernate_sort_row
		FROM (SELECT distinct this_.Name as y0_
				FROM   Test this_) as q_) as query
WHERE    query.__hibernate_sort_row > 2
ORDER BY query.__hibernate_sort_row

You can check that the above SQL provides the correct results.

It is a little more challenging to fix this in NHibernate itself esp. I have never done any change to it before. On the side note, while NHibernate is great for many things, sometimes it is just PITA – my colleague spent a whole day on optimizing NHibernate query, while if it would be in SQL he would have just changed one line!

Initially our team looked for the patch on the NHibernate bug tracker. Although we found the open issue for this it was not very helpful (no patch/solution included). As the major demo for system target users was soon I decided to try fixing it on my own.

As paging is quite specific to the target database (as I noted before i.e. SQLite is not affected) the place to modify would be SQL Server dialect component. We are using SQL Server 2008 but its dialect is merely overriding some minor methods from MsSql2005Dialect class, where we should really be looking at. The method to be tweaked is GetLimitString. It gets the SQL statement produced so far and rewrites it a little bit to include paging piece (for SQLite it just appends LIMIT … OFFSET statement, which in this case is independent from projection and hence it works well).

The change to the single class file is sufficient and quite simple indeed. Except from magic I had to include to substitute column names with aliases (I must admit this is part I am not particularly proud of but in the end it proved to be working). As this is entry is getting a little bit long instead pasting this code I attach patch you can use to fix the current stable NHibernate version (which is, as the time of writing, 2.1.2 GA). The patch is also submitted to NHibernate JIRA alongside with failing test I was asked to provide (you can read on NHibernate bugs submission policy here). I hope that we will get the official fix for this soon, and till then that this post saves a day for someone.

Update – unfortunatelly it seems that the issue won’t be fix in 2.x NHibernate branch. Fortunately David (whose comments you can see below) found cleaner solution to the problem. It still uses the corrected dialect I created. However in his solution custom dialect extension point is being leveraged. As this does not require NHibernate custom rebuild you should consider this approach superior. Please refer to David’s blog post about this for further details.

Attached Files:

  • zip NH-2214

    Duplicate rows NHibernate issue work around patch.

Categories: Software Tags: , ,
  1. Roman Melnyk
    October 6th, 2010 at 18:40 | #1

    Great job, Marcin.
    Ran into the same problem, and your solution saved me a lot of time.
    Thanks.

  2. October 23rd, 2010 at 07:23 | #2

    I just signed up to your blogs rss feed. Will you post more on this subject?

  3. October 23rd, 2010 at 20:39 | #3

    The patch has been submitted and I have recently got an update that this unfortunately won’t be fixed in 2.x version. The problem will be fixed in 3.0 though. To be honest I am not particularly happy about this. Details of the discussion can be found on JIRA (link can be found in the post above).

  4. david
    November 2nd, 2010 at 17:53 | #4

    hi Marcin,

    I have just come across this problem, and i guess it effects a lot more of my applicatin that I thought. Basically any “slighlty” complex data grid of mine has duplicates!

    You mention something about using a detached criteria as a work around? would you have an example of how to achieve that?

    • November 2nd, 2010 at 19:25 | #5

      David,
      do you use distinct projection in your paged queries? If yes then the problem I described fits and the only solution (up to my knowledge) is to fix Dialect on your own, i.e., you need to check out sources of NH version you are using, correcting dialect using patch I provided and building your “customized” NHibernate.dll.

      If you do not use distinct together with paging then your problem may be related to joins. See the following Stack Overflow article for the solution: http://stackoverflow.com/questions/318157/get-distinct-result-set-from-nhibernate-using-criteria-api

  5. david
    November 3rd, 2010 at 10:25 | #6

    Thanks Marcin,

    Yes, I use Projections.Distinct(Projections.Property(“some.unique.Id”))

    why haven’t they accepted your patch at nhibernate yet? its blatently a bug! Do you know if they have any suggested work arounds, if the don’t want to accept your patch?

  6. November 3rd, 2010 at 18:36 | #7

    To be honest I don’t get the reasonable NH dev provided but basically he said 2.x version is not being developed anymore and hence no more patches/changes are going to be applied there. I believe the suggestion is to use 3.x. I am not sure if this is already fixed there though but from the ticket communication between me, Ayende and Fabio (and the fact ticket is still open) I suspect it may not yet be. My main concern though is that I don’t want/can use non production ready version in my production systems…

  7. david
    November 4th, 2010 at 11:10 | #8

    Just to say thanks, for your help. I got things working. Might write a post about it if I get time, this is such asimple problem to fix.

    I could not rebuild me own version of nhibernate, as it other projects had dependancies on it, so I just created a custom SQL dialect, based on your solution, so in my fluent config I have something like

    Fluently.Configure()
    .Database(MsSqlConfiguration
    .MsSql2008
    .Dialect()


    Thanks again

  8. November 6th, 2010 at 15:39 | #9

    Brilliant idea to write custom dialect! Please let me know about you blog post once it is written. Thanks.

  9. david
    November 11th, 2010 at 13:04 | #10

    Hi Marcin,

    The post is here: http://www.webdevbros.net/

  10. Marc Villella
    January 13th, 2011 at 17:24 | #11

    Marcin,

    Is the MsSql2005Dialect.cs file submitted to nhibernate your “patched” version, or the original? I assume its the patched version, but wanted to make sure as I’ve encountered another issue that I’m not yet sure is related.

    Thanks in advance and great find/fix!

    Marc

  11. January 14th, 2011 at 22:13 | #12

    Yes the patch as well as unit test reproducing the error were submitted to NHibernate maintainers. I don’t believe 2.x branch was updated though. I am not sure about 3.x as we are not using it yet.

    I would suggest to follow instructions provided on David’s blog as they are less invasive and do not require you to create custom NHibernate build. If customized dialect resolves the problem you encounter this means you are using unpatched version.

  12. Richb
    April 5th, 2011 at 15:57 | #13

    It’s a nightmare trying to use this. It would be so much better if you’d put a unified diff patch onto the JIRA issue and done so without swapping tabs/spaces.

    • April 5th, 2011 at 19:17 | #14

      Richb, I would suggest to go to the David’s blog post I linked above. He took much cleaner approach of implementing custom dialect. I believe he provides copy of the dialect for download.

  13. November 16th, 2011 at 02:36 | #15

    Hey guys,

    I actually got the same original issue, but I’ve realized that I could actually switch the Distinct projection to a GroupBy projection instead. It also gets me the results that I wanted, but the ROW_NUMBER doesn’t get mixed with the group by clause, so the pagination thing works.

    This solution might help someone eventually.

    Great job with the new dialect tough! I guess this makes things easier, but I didn’t want to change the dialect to a specific implementation, so I’ll just stick with the GroupBy solution.

  1. November 11th, 2010 at 13:03 | #1