Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

10.5 MDEV 29293 teemu #308

Open
wants to merge 13 commits into
base: 10.5
Choose a base branch
from
Open

10.5 MDEV 29293 teemu #308

wants to merge 13 commits into from

Conversation

temeo
Copy link

@temeo temeo commented Apr 12, 2023

  • MDEV-29293 MariaDB stuck on starting commit state
  • new checkout of wsrep-lib/KILL_command branch
  • Simplifying the patch in kill_one_thread()
  • Reorganize locking/unlocking to happen in the same scope

sjaakola and others added 5 commits April 2, 2023 17:56
The problem seems to be a deadlock between KILL command execution
and BF abort issued by an applier, where:
* KILL has locked victim's LOCK_thd_kill and LOCK_thd_data
* applier has innodb side global lock mutex and victim trx mutex
* KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex
* applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill

The fix in this commit, removes the TOI replication of KILL command,
and makes KILL execution less intrusive operation.
Aborting of the victim happens now by using wsrep_abort_thd(),
which is the same method as used for aborting victims of DDL execution.
wsrep_thd_abort(), will start the victim aborting from inside innodb,
holding the lock_sys mutex and victim trx mutex.
Therefore the locking protocol is same as used in regular applier BF aborting procedure.
wsrep_abort_thd will eventually call also THD::awake (as regular KILL would),
and now awake is passed the user chosen kill signal, in case of KILL command execution.
Applier BF aborting, otoh, will use KILL_QUERY_HARD signal.

Notable changes in this commit:
* wsrep client connections's error state may remain sticky after client connection is closed.
This error message will then pop up for the next client session issuing first SQL statement.
This problem raised with test galera.galera_bf_kill
The fix is to reset wsrep client error state, before a THD is reused for next connetion

* Releasing THD locks, in wsrep_abort_transaction, when locking innodb mutexes,
this guarantees same locking order as with applier BF aborting

* Handling BF aborting of idle victim of KILL QUERY (and lower signals) with
background rollbacker.
Kill signals higher than KILL_CONNECTION, otoh, will now skip background rollbacker treatment.
This is because KILL_CONNECTION will wake up the victim so early, that victim execution may
interfere with the rollbacker execution.

* wsrep-lib is now using new branch: KILL_command, which has changed server_service::background_rollback()
to return true/false depending on if the background rollbacking was started or not.

* Avoiding to overwrite victim THD's error code to deadlock error,
if aborting was due to manual KILL, this preserves the native error code for KILL victims
In order to make things more manageable, changed the code so
that the locking and unlocking happens in the same visible scope.
This is to allow mutex locking order LOCK_thd_data -> trx mutex,
which is needed to avoid a race in wsrep_abort_transaction().

The assumption is that lock_sys.mutex is enough to prevent
the victim to change its state.

/* Temporarily release victim_trx mutex to allow locking order
LOCK_thd_data -> trx_mutex_exit() needed in wsrep_abort_transaction() */
trx_mutex_exit(victim_trx);
DEBUG_SYNC(bf_thd, "wsrep_before_BF_victim_lock");
wsrep_thd_LOCK(thd);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think it is safe to use same victim_trx pointer after you have released trx mutex, you might need to get it from thd again after you have done wsrep_thd_LOCK and if it is not there bail out.


/* Unlock temporarily to grab mutexes in the right order. */
/* Unlock temporarily to grab mutexes in the right order.
Note that victim thd is protected with THD::LOCK_thd_kill here. */
wsrep_thd_UNLOCK(victim_thd);
lock_mutex_enter();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also problematic as we have released victim thread THD mutexes it can be disconnected or destroyed.

Copy link
Author

@temeo temeo Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now fixed below by find_thread_by_id() and checking that the victim transaction ID is still the same.

Some codepaths require more fine grained locking, and unlocking
LOCK_thd_kill from wsrep_thd_UNLOCK() might cause unpleasant
surprise.

Added explicit calls to wsrep_thd_kill_LOCK/UNLOCK where needed.
Locking order for BF codepaths is now

LOCK_thd_kill (can be omitted if call to awake_no_mutex is not needed)
lock_sys.mutex
LOCK_thd_data
trx mutex
lock_sys.mutex
LOCK_thd_kill
LOCK_thd_data
trx.mutex
ha_abort_transaction() return without thd mutexes held.

Add SR worker THDs to server_threads list so they can
be found via find_thread_by_id() for BF aborting.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants