-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MDEV-29293 Refactor MDL BF abort logic #314
base: 10.5
Are you sure you want to change the base?
Commits on Apr 2, 2023
-
MDEV-29293 MariaDB stuck on starting commit state
The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data * applier has innodb side global lock mutex and victim trx mutex * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex * applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill The fix in this commit, removes the TOI replication of KILL command, and makes KILL execution less intrusive operation. Aborting of the victim happens now by using wsrep_abort_thd(), which is the same method as used for aborting victims of DDL execution. wsrep_thd_abort(), will start the victim aborting from inside innodb, holding the lock_sys mutex and victim trx mutex. Therefore the locking protocol is same as used in regular applier BF aborting procedure. wsrep_abort_thd will eventually call also THD::awake (as regular KILL would), and now awake is passed the user chosen kill signal, in case of KILL command execution. Applier BF aborting, otoh, will use KILL_QUERY_HARD signal. Notable changes in this commit: * wsrep client connections's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem raised with test galera.galera_bf_kill The fix is to reset wsrep client error state, before a THD is reused for next connetion * Releasing THD locks, in wsrep_abort_transaction, when locking innodb mutexes, this guarantees same locking order as with applier BF aborting * Handling BF aborting of idle victim of KILL QUERY (and lower signals) with background rollbacker. Kill signals higher than KILL_CONNECTION, otoh, will now skip background rollbacker treatment. This is because KILL_CONNECTION will wake up the victim so early, that victim execution may interfere with the rollbacker execution. * wsrep-lib is now using new branch: KILL_command, which has changed server_service::background_rollback() to return true/false depending on if the background rollbacking was started or not. * Avoiding to overwrite victim THD's error code to deadlock error, if aborting was due to manual KILL, this preserves the native error code for KILL victims
Configuration menu - View commit details
-
Copy full SHA for efb06ef - Browse repository at this point
Copy the full SHA efb06efView commit details
Commits on Apr 3, 2023
-
Configuration menu - View commit details
-
Copy full SHA for 09e3f40 - Browse repository at this point
Copy the full SHA 09e3f40View commit details
Commits on Apr 11, 2023
-
Configuration menu - View commit details
-
Copy full SHA for 0b23494 - Browse repository at this point
Copy the full SHA 0b23494View commit details
Commits on Apr 14, 2023
-
MDEV-29293 Refactor MDL BF abort logic
In this commit, the BF abort logic is refactored to eliminate the need to call back to server from InnoDB wsrep_abort_transaction(). This is to simplify the MDL BF abort codepath so that the direction is always only server -> InnoDB. The main change is in wsrep_abort_thd(), which first does the BF abort through wsrep-lib, and then continues to ha_abort_transaction() if the wsrep-lib BF abort was successful and no background rollback was started for victim. KILL command handling for wsrep threads was changed so that the thread to be killed is flagged first with wsrep_abort_by_kill to avoid calling storage engine ha_kill_query() from awake_no_mutex(). KILL for wsrep thread is executed by awake_no_mutex() combined with call to ha_abort_transaction(). The locking order for wsrep BF aborts is now: lock_sys.mutex trx.mutex LOCK_thd_kill LOCK_thd_data With this, BF aborts which originate from InnoDB lock manager don't have to release lock_sys.mutex and trx.mutex in between, which reduces chance for race conditions. In order to make the above locking order work for wsrep_abort_transaction(), the function records InnoDB transaction ID and releases victim LOCK_thd_kill and LOCK_thd_data. InnoDB lock_sys.mutex is then grabbed and the victim InnoDB trx object is looked up by InnoDB transaction ID. Locking/unlocking LOCK_thd_kill and LOCK_thd_data was reorganized so that it happens in the visible scope as much as possible. Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: - Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). - Make galera_var_retry_autocommit result more readable by echoing cases and expectations into result. Only one expected result for reap to verify that server returns expected status for query. - Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. - Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. - Test galera_bf_abort_registering to check that registering trx gets BF aborted through MDL. Co-authored-by: Jan Lindström <[email protected]>
Configuration menu - View commit details
-
Copy full SHA for 50ac97f - Browse repository at this point
Copy the full SHA 50ac97fView commit details
Commits on Apr 15, 2023
-
Configuration menu - View commit details
-
Copy full SHA for ffc7daf - Browse repository at this point
Copy the full SHA ffc7dafView commit details -
Configuration menu - View commit details
-
Copy full SHA for 28b2d12 - Browse repository at this point
Copy the full SHA 28b2d12View commit details -
Disallow KILL for wsrep transactions in committing state
When wsrep transaction has reached committing state, it must become immune for KILL. Accompanied MTR test can be used to reproduce assertion failure if the transaction is not immune. Implemented additional protection in wsrep_kill_thd() against KILL in committing states.
Configuration menu - View commit details
-
Copy full SHA for fb990d4 - Browse repository at this point
Copy the full SHA fb990d4View commit details -
Configuration menu - View commit details
-
Copy full SHA for 6b6bde3 - Browse repository at this point
Copy the full SHA 6b6bde3View commit details -
Process KILL asynchronously for committing thds
If wsrep thread is on commit codepath when KILL happens, the kill_signal is saved and restored in wsrep_after_commit() so that it will be processed after commit is over.
Configuration menu - View commit details
-
Copy full SHA for 31aa218 - Browse repository at this point
Copy the full SHA 31aa218View commit details
Commits on Apr 16, 2023
-
Release trx.mutex for wsrep_thd_bf_abort()
This changes the locking order to lock_sys.mutex LOCK_thd_kill LOCK_thd_data trx.mutex Order LOCK_thd_data -> trx.mutex is required by group commit. In order to avoid trx going out of scope, grap reference to victim before releasing the mutex.
Configuration menu - View commit details
-
Copy full SHA for f6d98e7 - Browse repository at this point
Copy the full SHA f6d98e7View commit details -
Decide postponing kill over commit based on wsrep trx state
Postpone kill only if the thread is supposed to commit the transaction. For transactions which will roll back, keep the killed state. Added THD::wsrep_killed_state for debugging.
Configuration menu - View commit details
-
Copy full SHA for 095a528 - Browse repository at this point
Copy the full SHA 095a528View commit details -
Configuration menu - View commit details
-
Copy full SHA for 6e60023 - Browse repository at this point
Copy the full SHA 6e60023View commit details -
Break the cycle with BF abort and group commit
There is a cycle trx reference -> LOCK_commit_order -> LOCK_thd_data -> trx reference which may prevent the victim transaction committing while BF aborter is trying to lock LOCK_thd_data. Break this cycle by using wsrep_thd_TRYLOCK() and checking if the victim is already committed in memory if the lock cannot be taken. Locking will be retried until the lock gets taken or the victim is found committed in memory.
Configuration menu - View commit details
-
Copy full SHA for b1b4f11 - Browse repository at this point
Copy the full SHA b1b4f11View commit details -
Configuration menu - View commit details
-
Copy full SHA for 2ad2ed1 - Browse repository at this point
Copy the full SHA 2ad2ed1View commit details -
Configuration menu - View commit details
-
Copy full SHA for f8acc3b - Browse repository at this point
Copy the full SHA f8acc3bView commit details
Commits on Apr 17, 2023
-
Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h
The manipulation of the variable can be done solely on server side. Moreover, it is now debug only variable and could be excluded from optimized builds. Also fix THD::reset_killed() to check if wsrep is enabled for THD.
Configuration menu - View commit details
-
Copy full SHA for 845ba0e - Browse repository at this point
Copy the full SHA 845ba0eView commit details -
Move restoring kill state to wsrep_after_statement()
This is to ensure that killed state will be restored before statement execution is over.
Configuration menu - View commit details
-
Copy full SHA for 1f016b8 - Browse repository at this point
Copy the full SHA 1f016b8View commit details -
Configuration menu - View commit details
-
Copy full SHA for 6947855 - Browse repository at this point
Copy the full SHA 6947855View commit details -
Remove check for binlog group commits from galera_kill_group_commit
In rare case of false negative, checking the counter would cause test failure-
Configuration menu - View commit details
-
Copy full SHA for 580ba32 - Browse repository at this point
Copy the full SHA 580ba32View commit details -
Configuration menu - View commit details
-
Copy full SHA for 7d1aa2a - Browse repository at this point
Copy the full SHA 7d1aa2aView commit details