Skip to content

NH-1930 - Filter condition on nullable many to one should be on the join, not the where #1208

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
nhibernate-bot opened this issue Oct 12, 2017 · 4 comments

Comments

@nhibernate-bot
Copy link
Collaborator

ayenderahien created an issue — 13th August 2009, 23:47:42:

Using the code in NH-1919, get produces the following values for nullable property that has a filter on it:

SELECT invoice0*.id          AS id1_1*, 
       invoice0*.otherprop   AS otherprop1_1*, 
       invoice0*.categoryid  AS categoryid1_1*, 
       category1*.id         AS id0_0*, 
       category1*.validuntil AS validuntil0_0* 
FROM   invoice invoice0_ 
       LEFT OUTER JOIN category category1_ 
         ON invoice0*.categoryid = category1*.id 
WHERE  category1_.validuntil > @p0 
       AND invoice0_.id = @p1

The problem is that the filter should be applied on the join, not on the where, since it filter the parent as well as the child.


Dmitry Kustov added a comment — 29th October 2009, 3:04:19:

I made this in NHibernate.SqlCommand.ANSIJoinFragment. It works fine

public override bool AddCondition(string condition)
{
//DK return AddCondition(conditions, condition);
return AddCondition(buffer, condition);
}

  public override bool AddCondition(SqlString condition)
  {
  	//DK			return AddCondition(conditions, condition);
  	return AddCondition(buffer, condition);
  }

Ricardo Peres added a comment — 23rd May 2011, 6:29:40:

Same happens when specifying a where clause on a set using fetch mode join.
I think the restriction should be placed on the join on clause, not the global where.


Robert Snyder added a comment — 3rd July 2012, 19:54:08:

The filter condition being applied to the where instead of the on clause of the join does provide the wrong data in an outer join - the results are filtered to the same as if it were an inner join. This continues to be a problem for us in 3.3.1. In addition however, it can be a huge performance problem on an inner join - if the condition can filter rows out on the join the working set can be dramatically reduced. I think it very important to be able to specify where the condition is applied. In general it should always be applied in the join, but there are always exceptions - even to the point of wanting to put it in both places (admittedly not common).


Robert Snyder added a comment — 3rd July 2012, 20:46:04:

Attached is a project with database that demonstrates the issue.

-- The NHibernate filter generates
SELECT this*.Id as Id1_1*,
       this*.Data as Data1_1*,
       this*.TableBId as TableBId1_1*,
       this*.CompanyId as CompanyId1_1*,
       tableb1*.Id as Id2_0*,
       tableb1*.Data as Data2_0*,
       tableb1*.CompanyId as CompanyId2_0*
FROM <TableA> this_ 
   left outer join [TableB] tableb1* on this_.TableBId=tableb1*.Id
WHERE (tableb1*.CompanyId = @p0) and (this*.CompanyId = @p0)

-- But it should generate (or at least be able to be told to generate)
SELECT this*.Id as Id1_1*,
       this*.Data as Data1_1*,
       this*.TableBId as TableBId1_1*,
       this*.CompanyId as CompanyId1_1*,
       tableb1*.Id as Id2_0*,
       tableb1*.Data as Data2_0*,
       tableb1*.CompanyId as CompanyId2_0*
FROM <TableA> this_ 
   left outer join [TableB] tableb1* on this_.CompanyId = tableb1_.CompanyId and this_.TableBId=tableb1*.Id
WHERE (this_.CompanyId = @p0)

-- or even
SELECT this*.Id as Id1_1*,
       this*.Data as Data1_1*,
       this*.TableBId as TableBId1_1*,
       this*.CompanyId as CompanyId1_1*,
       tableb1*.Id as Id2_0*,
       tableb1*.Data as Data2_0*,
       tableb1*.CompanyId as CompanyId2_0*
FROM <TableA> this_ 
   left outer join [TableB] tableb1* on tableb1_.CompanyId = @p0 and this_.TableBId=tableb1*.Id
WHERE (this_.CompanyId = @p0)

Leaving columns playing the role that CompanyId plays here out of the join carries severe penalties - in a multi-tenant database especially.
1 - Wrong Data: for an outer join, putting the condition in the where clause filters the result to produce a set identical to that of an inner join.
1a - In order to get around that, we have to include an 'OR (tableb1_.CompanyId IS NULL)' clause, which defeats any index on CompanyId.
2 - Performance: In a multi-tenant db the most efficient index for data is one with the tenancy column(s) first. We do this in SQL Server by defining clustered indexes that start with the tenancy columns. For a single record-fetch by primary key, the penalty is in performance, not working set size, but especially for queries that return multiple records, searching a primary key across the entire space of all tenants is far more expensive than using a tenancy-based clustered index.
3 - Performance: In cases where sets are being searched based on criteria not including the primary key, not including the filter columns in the join (regardless of tenancy design) results in all records being joined into memory, then filtering off the ones that don't match - a huge penalty to both performance and scalability.
There should be a way to control whether such conditions are put on the join or the where clause (the join would be a better default in my opinion). Ideally, it should be possible to specify it in BOTH places, but that is a less common need.

Note that this example uses just two tables, but in practice there will often be a significant chain of tables, each requiring the tenancy columns in the join clause. The deeper the chain, the greater the issue, more or less geometrically.


Adrian T Wright added a comment — 11th September 2012, 20:00:09:

I am experiencing this issue as well. In a large data set, there will be a noticeable performance degradation. A fix would be very helpful for my team.


Kat Lim Ruiz added a comment — 16th September 2014, 16:31:12:

I have a fix, which seems to work:

                var filterDef = new FilterDefinition(
                    "f_CurrentTenant",
                    "TenantId = :f_tenantId",
                    new Dictionary<string, IType> { { "f_tenantId", NHibernateUtil.Guid } },
                    false);

Set the last parameter useManyToOne = False. This will make the filter be only applied at the WHERE for the main entity and not to the LEFT JOINs.

UPDATE NH-3461 also proposed this solution. I'd say this ticket should be closed then.


Alexander Zaytsev added a comment — 16th September 2014, 21:29:14:

<~katlim_ruiz> the issue is about applying filter on join instead on where clause.

@EnemyArea
Copy link
Contributor

any chance to get this fixed soon :) ? Have the same issue at the moment.

@fredericDelaporte
Copy link
Member

fredericDelaporte commented May 8, 2018

It seems this interests many people, but no one has yet proposed a PR.

@fredericDelaporte
Copy link
Member

This has maybe be fixed by #268. A test case for checking this would be welcome.

@hazzik
Copy link
Member

hazzik commented Dec 6, 2018

No, it is not. I have tests and a fix (for a loooooooong time), but I don’t like it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants