Skip to content

HHH-19497 Addresses major issue fallback implementation of "in list" predicate #10253

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

michaelfranz
Copy link

@michaelfranz michaelfranz commented May 31, 2025

When a NOT is applied to an IN, such as
cq.select(root).where(root.get("customerType").in(List.of(i1, i2)))
and when i1, i2 are represented by more than one column or formula (let's call these a and b), the previous implementation emitted:
(a <> ? or b <> ?) or (a <> ? or b <> ?) or ...
when the correct expression needs to be:
not ((a = ? and b = ?) or (a = ? and b = ?) or ...)

The previous implementation incorrectly distributes the negation to each item in the list. De Morgan 101. The predicate is logically incorrect and this can lead to unpredictable results, which can even include ALL the records being returned from a table. Very, very bad.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license
and can be relicensed under the terms of the LGPL v2.1 license in the future at the maintainers' discretion.
For more information on licensing, please check here.


https://hibernate.atlassian.net/browse/HHH-19497

michaelfranz and others added 5 commits May 28, 2025 17:11
When a NOT is applied to an IN, such as 
cq.select(root).where(root.get("customerType").in(List.of(i1, i2)))
and when i1, i2 are represented by more than one column or formula (let's call these a and b), the previous implementation emitted:
(a <> ? OR b <> ?) OR (a <> ? OR b <> ?) OR ...
when the correct expression needs to be:
NOT ((a = ? AND b = ?) OR (a = ? AND b = ?))
So previous implementation incorrectly distributes the negation to each item in the list. De Morgan 101. The predicate is logically incorrect and this can lead to unpredictable results, which can even include ALL the records being returned from a table; or ZERO records being returned.
When a NOT is applied to an IN, such as
cq.select(root).where(root.get("customerType").in(List.of(i1, i2)))
and when i1, i2 are represented by more than one column or formula (let's call these a and b), the previous implementation emitted:
(a <> ? OR b <> ?) OR (a <> ? OR b <> ?) OR ...
when the correct expression needs to be:
NOT ((a = ? AND b = ?) OR (a = ? AND b = ?))
So previous implementation incorrectly distributes the negation to each item in the list. De Morgan 101. The predicate is logically incorrect and this can lead to unpredictable results, which can even include ALL the records being returned from a table; or ZERO records being returned.
…n' into michaelfranz-de-morgan-correction

# Conflicts:
#	hibernate-core/src/main/java/org/hibernate/sql/ast/spi/AbstractSqlAstTranslator.java
@hibernate-github-bot
Copy link

hibernate-github-bot bot commented May 31, 2025

Thanks for your pull request!

This pull request does not follow the contribution rules. Could you have a look?

❌ All commit messages should start with a JIRA issue key matching pattern HHH-\d+
    ↳ Offending commits: [ffbd293, 3973f2e, ba5c40c, fa3a1c4, 14073b0]

› This message was automatically generated.

michaelfranz and others added 4 commits May 31, 2025 08:44
…predicate

When a NOT is applied to an IN, such as
cq.select(root).where(root.get("customerType").in(List.of(i1, i2)))
and when i1, i2 are represented by more than one column or formula (let's call these a and b), the previous implementation emitted:
(a <> ? OR b <> ?) OR (a <> ? OR b <> ?) OR ...
when the correct expression needs to be:
NOT ((a = ? AND b = ?) OR (a = ? AND b = ?))
So previous implementation incorrectly distributes the negation to each item in the list. De Morgan 101. The predicate is logically incorrect and this can lead to unpredictable results, which can even include ALL the records being returned from a table; or ZERO records being returned.
@gavinking
Copy link
Member

As I mentioned on the issue report, there's two ways to fix this problem:

  1. the way you've chosen, which is to just use not, or
  2. alternatively, by replacing the incorrect ors with ands.

I'm not sure why the code was written to distribute the not and using <> instead of =. But it seems strange that it would have been written that way without some good reason. So we need to understand why it was written like that. @beikov, do you know?

@gavinking
Copy link
Member

In the meantime, please:

  1. add a test, and
  2. squash the commits.

Mike Mannion and others added 4 commits May 31, 2025 12:02
…predicate

When a NOT is applied to an IN, such as
cq.select(root).where(root.get("customerType").in(List.of(i1, i2)))
and when i1, i2 are represented by more than one column or formula (let's call these a and b), the previous implementation emitted:
(a <> ? OR b <> ?) OR (a <> ? OR b <> ?) OR ...
when the correct expression needs to be:
NOT ((a = ? AND b = ?) OR (a = ? AND b = ?))
So previous implementation incorrectly distributes the negation to each item in the list. De Morgan 101. The predicate is logically incorrect and this can lead to unpredictable results, which can even include ALL the records being returned from a table; or ZERO records being returned.
@michaelfranz
Copy link
Author

michaelfranz commented May 31, 2025

In the meantime, please:

  1. add a test, and
  2. squash the commits.

I'm new to the project and I'm unsure about the right way to provide a test. My discovery of the issue came from a client's Hibernate 6 / DB2-environment. I verified the solution outside of this environment by overriding H2Dialect and H2SqlAstTranslator with implementations that force the issue to come to light. Therefore, though I appreciate it's not ideal, I attached all the relevant artefacts to the original JIRA ticket.

If someone would can point me to guidance on how to add a suitable test to the project, then I will do so if I find the time. In the meantime, I've provided everything I have.

As I mentioned on the issue report, there's two ways to fix this problem:

  1. the way you've chosen, which is to just use not, or
  2. alternatively, by replacing the incorrect ors with ands.

I'm not sure why the code was written to distribute the not and using <> instead of =. But it seems strange that it would have been written that way without some good reason. So we need to understand why it was written like that. @beikov, do you know?

The "good reason", I suppose, is that it is marginally more optimal to do it in that way. I went for the "not" because it's more intuitive. I prioritised correct logic over optimisation. I'll leave it to regular contributors to decide which way they want to take it from here.

@michaelfranz michaelfranz changed the title Addresses major issue fallback implementation of "in list" predicate HHH-19497 Addresses major issue fallback implementation of "in list" predicate May 31, 2025
@gavinking
Copy link
Member

The "good reason", I suppose, is that it is marginally more optimal to do it in that way.

If it's true that it's marginally more optimal, then we should always do things in the marginally more optimal way.

But I don't see an obvious reason why it would be, so I would prefer to wait for Christian to chime in.

If someone would can point me to guidance on how to add a suitable test to the project, then I will do so if I find the time. In the meantime, I've provided everything I have.

To be clear: a test is an absolute hard requirement. We never fix bugs in Hibernate without a test, and every bugfix must be accompanied by a test.

There is a template here: https://github.com/hibernate/hibernate-test-case-templates

Mike Mannion added 4 commits May 31, 2025 20:31
…-correction' into michaelfranz-de-morgan-correction
…-correction' into michaelfranz-de-morgan-correction
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants