Skip to content
/ server Public

MDEV-36832 UPDATE...IN(SELECT) acquires spurious gap locks on FK index#4682

Open
arcivanov wants to merge 1 commit intoMariaDB:10.11from
arcivanov:MDEV-36832
Open

MDEV-36832 UPDATE...IN(SELECT) acquires spurious gap locks on FK index#4682
arcivanov wants to merge 1 commit intoMariaDB:10.11from
arcivanov:MDEV-36832

Conversation

@arcivanov
Copy link
Contributor

UPDATE ticket SET is_valid=0 WHERE ticket_id IN (SELECT ticket_id FROM ticket WHERE booking_id=1) acquires S next-key and gap locks on the FK secondary index fk_ticket_2_booking, causing deadlocks between concurrent transactions operating on non-overlapping booking_id values. The expected behavior — matching UPDATE ... WHERE ticket_id IN (1,2) (explicit PK list) — is to only lock the PRIMARY key records being updated.

Root cause: the optimizer converts the IN(SELECT) to a semi-join, which creates two table references for the same table. The SQL layer then processes the statement as SQLCOM_UPDATE_MULTI. In ha_innobase::store_lock(), the read-only (subquery) table handle receives TL_READ, but the existing LOCK_NONE optimization for DML read tables was gated on isolation_level <= READ_COMMITTED and did not include SQLCOM_UPDATE_MULTI at all. In REPEATABLE READ (the default), the subquery's secondary index scan therefore acquired LOCK_S next-key locks, including gap locks spanning into adjacent key ranges.

Fix: add SQLCOM_UPDATE_MULTI to the LOCK_NONE (consistent read) condition in store_lock():

  1. For TL_READ (binlog off or ROW format): use consistent read at all isolation levels. This is safe because:
    • The MVCC snapshot is stable within a transaction in REPEATABLE READ
    • Rows being modified still acquire X locks via the write-side table handle (F_WRLCK / LOCK_X)
    • SERIALIZABLE is handled by ::external_lock() upgrading LOCK_NONE back to LOCK_S
  2. For TL_READ_NO_INSERT (statement-based binlog): extend the existing READ COMMITTED optimization to also cover SQLCOM_UPDATE_MULTI

The fix is deliberately limited to SQLCOM_UPDATE_MULTI (not SQLCOM_UPDATE) because single-table UPDATE with scalar subqueries on other tables traditionally uses S locks to block concurrent reads of the subquery table, and broadening the change there would alter observable behavior for existing applications.

Lock count before/after (semi-join UPDATE on booking_id=1):

  • Before: 5 lock structs, 5 row locks (IS + S×3 on FK index + IX + X×2 on PK)
  • After: 2 lock structs, 2 row locks (IX + X×2 on PK) — identical to explicit WHERE ticket_id IN (1,2)

@arcivanov
Copy link
Contributor Author

@dr-m

@grooverdan
Copy link
Member

@arcivanov , while waiting on a review from the the InnoDB team on this, do you have the time/interest in looking at MDEV-38817? I'm suspecting the test failure is actually the correct result, and should have been always occurring since 10.6 when MDEV-13115 was implemented.

@arcivanov
Copy link
Contributor Author

@arcivanov , while waiting on a review from the the InnoDB team on this, do you have the time/interest in looking at MDEV-38817? I'm suspecting the test failure is actually the correct result, and should have been always occurring since 10.6 when MDEV-13115 was implemented.

I was unable to reproduce it locally but the early send_eof seems to not be appropriate for non-read-only transactions due to cleanup occurring after the packet is sent resulting in a potential race. Please see #4684.

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Feb 23, 2026
Copy link
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is preliminary review. Thank you for your contribution. Please stand by for the final review.

Comment on lines 16728 to 16741
if (sql_command == SQLCOM_CHECKSUM
|| sql_command == SQLCOM_CREATE_SEQUENCE
|| (sql_command == SQLCOM_ANALYZE && lock_type == TL_READ)
|| (lock_type == TL_READ
&& sql_command == SQLCOM_UPDATE_MULTI)
|| (trx->isolation_level <= TRX_ISO_READ_COMMITTED
&& (lock_type == TL_READ
|| lock_type == TL_READ_NO_INSERT)
&& (sql_command == SQLCOM_INSERT_SELECT
|| sql_command == SQLCOM_REPLACE_SELECT
|| sql_command == SQLCOM_UPDATE
|| sql_command == SQLCOM_UPDATE_MULTI
|| sql_command == SQLCOM_CREATE_SEQUENCE
|| sql_command == SQLCOM_CREATE_TABLE))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, this business of downgrading the requested lock type is a hack that should have been replaced years ago. A better solution would be to introduce a lock_type value TL_READ_NO_LOCK, which would be set at the appropriate places in the SQL query execution, and have InnoDB never downgrade any TL_READ or TL_READ_NO_INSERT.

My concern is that it could be unsafe to downgrade some locks in some fragments of a complex SQL statement. But there is only a single thd_sql_command(thd) value returned for the entire statement.

Copy link
Contributor Author

@arcivanov arcivanov Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what would be a satisfactory outcome for you as far as implementation/testing/validation etc that would allow this fix to move forward (the overall historic state of this code notwithstanding)? While I understand the frustration with code quality your comment does not seem actionable 🥲

`UPDATE ticket SET is_valid=0 WHERE ticket_id IN
(SELECT ticket_id FROM ticket WHERE booking_id=1)` acquires S next-key
and gap locks on the FK secondary index `fk_ticket_2_booking`, causing
deadlocks between concurrent transactions operating on non-overlapping
`booking_id` values. The expected behavior — matching
`UPDATE ... WHERE ticket_id IN (1,2)` (explicit PK list) — is to only
lock the PRIMARY key records being updated.

**Root cause**: the optimizer converts the `IN(SELECT)` to a semi-join,
which creates two table references for the same table. The SQL layer
then processes the statement as `SQLCOM_UPDATE_MULTI`. In
`ha_innobase::store_lock()`, the read-only (subquery) table handle
receives `TL_READ`, but the existing `LOCK_NONE` optimization for DML
read tables was gated on `isolation_level <= READ_COMMITTED` and did
not include `SQLCOM_UPDATE_MULTI` at all. In `REPEATABLE READ` (the
default), the subquery's secondary index scan therefore acquired
`LOCK_S` next-key locks, including gap locks spanning into adjacent key
ranges.

**Fix**: add `SQLCOM_UPDATE_MULTI` to the `LOCK_NONE` (consistent read)
condition in `store_lock()`:

1. For `TL_READ` (binlog off or ROW format): use consistent read at
   **all** isolation levels. This is safe because:
   - The MVCC snapshot is stable within a transaction in
     `REPEATABLE READ`
   - Rows being modified still acquire X locks via the write-side
     table handle (`F_WRLCK` / `LOCK_X`)
   - `SERIALIZABLE` is handled by `::external_lock()` upgrading
     `LOCK_NONE` back to `LOCK_S`
2. For `TL_READ_NO_INSERT` (statement-based binlog): extend the
   existing `READ COMMITTED` optimization to also cover
   `SQLCOM_UPDATE_MULTI`

The fix is deliberately limited to `SQLCOM_UPDATE_MULTI` (not
`SQLCOM_UPDATE`) because single-table UPDATE with scalar subqueries on
other tables traditionally uses S locks to block concurrent reads of the
subquery table, and broadening the change there would alter observable
behavior for existing applications.

**Lock count before/after** (semi-join UPDATE on `booking_id=1`):
- Before: 5 lock structs, 5 row locks (IS + S×3 on FK index + IX +
  X×2 on PK)
- After: 2 lock structs, 2 row locks (IX + X×2 on PK) — identical to
  explicit `WHERE ticket_id IN (1,2)`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

4 participants