Skip to content
/ server Public

MDEV-38649: Wrong result upon condition pushdown on table with DESC key#4687

Open
DaveGosselin-MariaDB wants to merge 2 commits into12.3from
12.3-MDEV-38649-wrong-result-icp
Open

MDEV-38649: Wrong result upon condition pushdown on table with DESC key#4687
DaveGosselin-MariaDB wants to merge 2 commits into12.3from
12.3-MDEV-38649-wrong-result-icp

Conversation

@DaveGosselin-MariaDB
Copy link
Member

MDEV-38649: Wrong result upon condition pushdown on table with DESC key

QUICK_SELECT_DESC::get_next() sets a stale end range on the storage engine.
When checking the first range, it sets last_range to the first range and
then, if it cannot avoid descending scan, it sets the minimum endpoing on
the storage engine.  However, get_next() may decide that this range is not
a match after updating the minimum endpoint on the storage engine.  In this
case, it will continue to the next range.  When continuing, it needs to
reset not only the state of 'last_range' to point to the next range that
it's checking, but it also needs to clear the now-stale end range set on
the storage engine.  While this explanation covers the first and second
ranges, the issue at hand extends to the general case of ranges 1...N and
N+1.

Before this change and when get_next() decided that a range was not a match, it
would clear last_range at each loop continuation point.  Rather than clearing
the minimum endpoint on the storage engine at this point, we move any loop
cleanup to a lambda that's invoked by the for loop directly.  This consolidates
any logic that needs to be evaluated on loop continuation to one place and
reduces code duplication.

MySQL fixed this same problem at sha b65ca959efd6ec5369165b1849407318b4886634
with a different implementation.

Initial test case that passes because bug is present.
QUICK_SELECT_DESC::get_next() sets a stale end range on the storage engine.
When checking the first range, it sets last_range to the first range and
then, if it cannot avoid descending scan, it sets the minimum endpoing on
the storage engine.  However, get_next() may decide that this range is not
a match after updating the minimum endpoint on the storage engine.  In this
case, it will continue to the next range.  When continuing, it needs to
reset not only the state of 'last_range' to point to the next range that
it's checking, but it also needs to clear the now-stale end range set on
the storage engine.  While this explanation covers the first and second
ranges, the issue at hand extends to the general case of ranges 1...N and
N+1.

Before this change and when get_next() decided that a range was not a match, it
would clear last_range at each loop continuation point.  Rather than clearing
the minimum endpoint on the storage engine at this point, we move any loop
cleanup to a lambda that's invoked by the for loop directly.  This consolidates
any logic that needs to be evaluated on loop continuation to one place and
reduces code duplication.

MySQL fixed this same problem at sha b65ca959efd6ec5369165b1849407318b4886634
with a different implementation.
Copy link
Member

@spetrunia spetrunia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the above and ok to push.

file->set_end_range(nullptr);
};

for (;; loop_reset())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lambda is only used in one place. Can the loop be:

  for (;; last_range= nullptr, file->set_end_range(nullptr))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I liked the lambda because it is like a signpost saying "add loop reset stuff here" but it's not really needed. Your suggestion is nice and short, so I'll use it. We can bring the lambda back later if the list grows and makes the for loop difficult to read.

--echo #
CREATE TABLE t1 (a INT, b INT, KEY(b DESC));
INSERT INTO t1 VALUES (2,8),(9,1),(4,6);
SELECT a, b FROM t1 GROUP BY b HAVING b != 0 ORDER BY a;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As was noted in the MDEV comments, condition_pushdown_from_having is not related to the issue.

But I would take testcase critique in a different direction :-)

Grouping operation is not necessary here at all. This will already show the problem:

SELECT a, b FROM t use index() WHERE b != 0 ORDER BY b;
SELECT a, b FROM t WHERE b != 0 ORDER BY b;

UPD: also, descending index is not needed.
Change table definition to use NOT NULL and ORDER BY use DESC:

CREATE TABLE t2 (a INT, b INT not null, KEY(b));
SELECT a, b FROM t2 WHERE b != 10 ORDER BY b DESC;

The only thing the GROUP BY cases highlight is that grouping code is scanning the index in reverse direction for no reason. It could have scanned it in the forward direction too and that would be slightly faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll update the test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants