MDEV-37977 InnoDB deadlock report incorrectly reports rolled back transaction number#4679
MDEV-37977 InnoDB deadlock report incorrectly reports rolled back transaction number#4679arcivanov wants to merge 1 commit intoMariaDB:10.11from
Conversation
a0b993a to
0971dec
Compare
gkodinov
left a comment
There was a problem hiding this comment.
Thank you for your contribution. This is a preliminary review. Please stand by for the final review.
dr-m
left a comment
There was a problem hiding this comment.
I’d prefer descriptive test case names. Could one of the existing tests innodb.deadlock_detect or innodb.deadlock_victim_race be extended with this?
| --connection con1 | ||
| --reap | ||
| ROLLBACK; | ||
|
|
||
| --connection default | ||
| let SEARCH_FILE= $MYSQLTEST_VARDIR/log/mysqld.1.err; | ||
| let SEARCH_PATTERN= WE ROLL BACK TRANSACTION \(1\); | ||
| --source include/search_pattern_in_file.inc | ||
|
|
||
| --disconnect con1 | ||
| SET GLOBAL innodb_print_all_deadlocks= @save_print_all_deadlocks; | ||
| DROP TABLE t1; | ||
| --source include/wait_until_count_sessions.inc |
There was a problem hiding this comment.
We could move the --disconnect con1 right after the ROLLBACK, to reduce the wait time at the end.
There was a problem hiding this comment.
So would you prefer a descriptive name with a standalone test or integration into an existing one?
| if ((next_weight|next_not_pref) < (victim_weight|victim_not_pref)) | ||
| if (UNIV_UNLIKELY(!victim) || | ||
| (next_weight|next_not_pref) < (victim_weight|victim_not_pref)) | ||
| { | ||
| victim_weight= next_weight; | ||
| victim= next; | ||
| victim_pos= l; | ||
| } |
There was a problem hiding this comment.
I don’t fully understand this. We would seem to have victim=nullptr only during the first iteration. In the general case, the path to the deadlock could be like the letter P, comprising an acyclic sequence of M steps and a cycle of N steps. If I understood it correctly, this code assumes M=1. The test case is for N=2; in a more general case we would have a deadlock among several transactions.
There was a problem hiding this comment.
Originally here the victim is always initialized to the first element returned by find_cycle:
But because we shifted our starting point to zero and victim to nullptr we need to populate the victim = next (line 7002/7011) and related weight and pos to the first available value unconditionally if the victim has not yet been initialized.
0971dec to
a6627de
Compare
…nsaction number The "WE ROLL BACK TRANSACTION (N)" message in the deadlock report referred to the wrong transaction number. The victim selection loop and the display loop in `Deadlock::report()` traversed the cycle in the same order (`cycle->wait_trx, ..., cycle`) but used misaligned position numbering: - **Victim selection** initialized `victim = cycle` at position 1 *before* the loop, then started iterating from `cycle->wait_trx` at position 2. - **Display loop** started from `cycle->wait_trx` at label `(1)`, with `cycle` displayed last at label `(N)`. This caused `victim_pos` to be off by one relative to the displayed transaction labels. Fix: restructure the victim selection loop to start with `l=0` and `victim=nullptr`, letting the loop handle all transactions uniformly. The first iteration unconditionally picks `cycle->wait_trx` as the initial victim at position 1, matching the display loop. The `thd_deadlock_victim_preference()` call is guarded with a `victim != nullptr` check to skip it on the first iteration (where no prior victim exists to compare against).
a6627de to
8ea573c
Compare
The "WE ROLL BACK TRANSACTION (N)" message in the deadlock report referred to the wrong transaction number. The victim selection loop and the display loop in
Deadlock::report()traversed the cycle in the same order (cycle->wait_trx, ..., cycle) but used misaligned position numbering:victim = cycleat position 1 before the loop, then started iterating fromcycle->wait_trxat position 2.cycle->wait_trxat label(1), withcycledisplayed last at label(N).This caused
victim_posto be off by one relative to the displayed transaction labels.Fix: restructure the victim selection loop to start with
l=0andvictim=nullptr, letting the loop handle all transactions uniformly. The first iteration unconditionally pickscycle->wait_trxas the initial victim at position 1, matching the display loop. Thethd_deadlock_victim_preference()call is guarded with avictim != nullptrcheck to skip it on the first iteration (where no prior victim exists to compare against).