MDEV-36832 UPDATE...IN(SELECT) acquires spurious gap locks on FK index#4682
MDEV-36832 UPDATE...IN(SELECT) acquires spurious gap locks on FK index#4682arcivanov wants to merge 1 commit intoMariaDB:10.11from
Conversation
f5dceb2 to
fc840a3
Compare
|
@arcivanov , while waiting on a review from the the InnoDB team on this, do you have the time/interest in looking at MDEV-38817? I'm suspecting the test failure is actually the correct result, and should have been always occurring since 10.6 when MDEV-13115 was implemented. |
I was unable to reproduce it locally but the early send_eof seems to not be appropriate for non-read-only transactions due to cleanup occurring after the packet is sent resulting in a potential race. Please see #4684. |
gkodinov
left a comment
There was a problem hiding this comment.
This is preliminary review. Thank you for your contribution. Please stand by for the final review.
| if (sql_command == SQLCOM_CHECKSUM | ||
| || sql_command == SQLCOM_CREATE_SEQUENCE | ||
| || (sql_command == SQLCOM_ANALYZE && lock_type == TL_READ) | ||
| || (lock_type == TL_READ | ||
| && sql_command == SQLCOM_UPDATE_MULTI) | ||
| || (trx->isolation_level <= TRX_ISO_READ_COMMITTED | ||
| && (lock_type == TL_READ | ||
| || lock_type == TL_READ_NO_INSERT) | ||
| && (sql_command == SQLCOM_INSERT_SELECT | ||
| || sql_command == SQLCOM_REPLACE_SELECT | ||
| || sql_command == SQLCOM_UPDATE | ||
| || sql_command == SQLCOM_UPDATE_MULTI | ||
| || sql_command == SQLCOM_CREATE_SEQUENCE | ||
| || sql_command == SQLCOM_CREATE_TABLE))) { |
There was a problem hiding this comment.
As far as I can tell, this business of downgrading the requested lock type is a hack that should have been replaced years ago. A better solution would be to introduce a lock_type value TL_READ_NO_LOCK, which would be set at the appropriate places in the SQL query execution, and have InnoDB never downgrade any TL_READ or TL_READ_NO_INSERT.
My concern is that it could be unsafe to downgrade some locks in some fragments of a complex SQL statement. But there is only a single thd_sql_command(thd) value returned for the entire statement.
There was a problem hiding this comment.
So what would be a satisfactory outcome for you as far as implementation/testing/validation etc that would allow this fix to move forward (the overall historic state of this code notwithstanding)? While I understand the frustration with code quality your comment does not seem actionable 🥲
`UPDATE ticket SET is_valid=0 WHERE ticket_id IN
(SELECT ticket_id FROM ticket WHERE booking_id=1)` acquires S next-key
and gap locks on the FK secondary index `fk_ticket_2_booking`, causing
deadlocks between concurrent transactions operating on non-overlapping
`booking_id` values. The expected behavior — matching
`UPDATE ... WHERE ticket_id IN (1,2)` (explicit PK list) — is to only
lock the PRIMARY key records being updated.
**Root cause**: the optimizer converts the `IN(SELECT)` to a semi-join,
which creates two table references for the same table. The SQL layer
then processes the statement as `SQLCOM_UPDATE_MULTI`. In
`ha_innobase::store_lock()`, the read-only (subquery) table handle
receives `TL_READ`, but the existing `LOCK_NONE` optimization for DML
read tables was gated on `isolation_level <= READ_COMMITTED` and did
not include `SQLCOM_UPDATE_MULTI` at all. In `REPEATABLE READ` (the
default), the subquery's secondary index scan therefore acquired
`LOCK_S` next-key locks, including gap locks spanning into adjacent key
ranges.
**Fix**: add `SQLCOM_UPDATE_MULTI` to the `LOCK_NONE` (consistent read)
condition in `store_lock()`:
1. For `TL_READ` (binlog off or ROW format): use consistent read at
**all** isolation levels. This is safe because:
- The MVCC snapshot is stable within a transaction in
`REPEATABLE READ`
- Rows being modified still acquire X locks via the write-side
table handle (`F_WRLCK` / `LOCK_X`)
- `SERIALIZABLE` is handled by `::external_lock()` upgrading
`LOCK_NONE` back to `LOCK_S`
2. For `TL_READ_NO_INSERT` (statement-based binlog): extend the
existing `READ COMMITTED` optimization to also cover
`SQLCOM_UPDATE_MULTI`
The fix is deliberately limited to `SQLCOM_UPDATE_MULTI` (not
`SQLCOM_UPDATE`) because single-table UPDATE with scalar subqueries on
other tables traditionally uses S locks to block concurrent reads of the
subquery table, and broadening the change there would alter observable
behavior for existing applications.
**Lock count before/after** (semi-join UPDATE on `booking_id=1`):
- Before: 5 lock structs, 5 row locks (IS + S×3 on FK index + IX +
X×2 on PK)
- After: 2 lock structs, 2 row locks (IX + X×2 on PK) — identical to
explicit `WHERE ticket_id IN (1,2)`
fc840a3 to
2439ca0
Compare
UPDATE ticket SET is_valid=0 WHERE ticket_id IN (SELECT ticket_id FROM ticket WHERE booking_id=1)acquires S next-key and gap locks on the FK secondary indexfk_ticket_2_booking, causing deadlocks between concurrent transactions operating on non-overlappingbooking_idvalues. The expected behavior — matchingUPDATE ... WHERE ticket_id IN (1,2)(explicit PK list) — is to only lock the PRIMARY key records being updated.Root cause: the optimizer converts the
IN(SELECT)to a semi-join, which creates two table references for the same table. The SQL layer then processes the statement asSQLCOM_UPDATE_MULTI. Inha_innobase::store_lock(), the read-only (subquery) table handle receivesTL_READ, but the existingLOCK_NONEoptimization for DML read tables was gated onisolation_level <= READ_COMMITTEDand did not includeSQLCOM_UPDATE_MULTIat all. InREPEATABLE READ(the default), the subquery's secondary index scan therefore acquiredLOCK_Snext-key locks, including gap locks spanning into adjacent key ranges.Fix: add
SQLCOM_UPDATE_MULTIto theLOCK_NONE(consistent read) condition instore_lock():TL_READ(binlog off or ROW format): use consistent read at all isolation levels. This is safe because:REPEATABLE READF_WRLCK/LOCK_X)SERIALIZABLEis handled by::external_lock()upgradingLOCK_NONEback toLOCK_STL_READ_NO_INSERT(statement-based binlog): extend the existingREAD COMMITTEDoptimization to also coverSQLCOM_UPDATE_MULTIThe fix is deliberately limited to
SQLCOM_UPDATE_MULTI(notSQLCOM_UPDATE) because single-table UPDATE with scalar subqueries on other tables traditionally uses S locks to block concurrent reads of the subquery table, and broadening the change there would alter observable behavior for existing applications.Lock count before/after (semi-join UPDATE on
booking_id=1):WHERE ticket_id IN (1,2)