-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
MDEV-31949 III. Innodb flush_log_later for XA commit,rollback #4054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
andrelkin
wants to merge
9
commits into
main
Choose a base branch
from
bb-12.1-MDEV-31949_part_III
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…replication/recovery This commit is the part I of the series of four that addresses MDEV-31949 in two main directions which are xa parallel slave performance and xa transaction crash-recovery. This one improves upon MDEV-742 design's XA binlogging to facilitate to the crash-recovery (the actual binlog-based recovery is coming in the part IV of MDEV-33168 et al). With the refactoring changes, when binlog is ON, handling of execution of a XA transaction, including binlogging, is made conceptually uniform with the normal BEGIN-COMMIT transaction. That is At XA-PREPARE the transaction first is prepared in engines and after that accumulated replication events are written to the binary log, naturally without any completion event as it's unknown yet. When later XA-"COMPLETE" that is XA-COMMIT and XA-PREPARE follows up, the binary logging of respective Query event takes place first. One can perceive such scheme as if a normal transaction logging is split in the middle into two parts (and nothing really happens in between of them but time passed by). And after the second chunk is sent to binlog the transaction gets committed (or rolled back) in engine. With binlog is enabled both phases' loggings go through binlog-group-commit, where XA-PREPARE "sub-transaction" merely groups for binary logging so skips the engine action while XA-"COMPLETE" does both, that is the logging and an ordered "complete". This behavior is also consistent between completions from the native and external connections. Being a participant of binlog-group-commit designates either XA phase is recoverable (not implemented here) from active binlogs determined by binlog-checkpoint. For the latter specifically this patch removes custom unlogging of XA-prepare group. See entry.need_unlog= 0 et al in MYSQL_BIN_LOG::write_transaction_to_binlog(). In addition to the above a corner case of engine read-only XA transaction is addressed. Previously it was streamlined with logging an empty XA-PREPARE group of binlog events concluded by XA-"COMPLETE" query-event. Now when a preparing XA transaction is found to have only read-only engine branches or none it is marked for rollback as XA_RDONLY optimization: - nothing gets logged at the prepare time an XA_RDONLY note is generated and - it's rolled back at disconnect For XA-COMPLETE to tell whether the prepare phase was logged or not the XID state object is extended with a boolean flag which is a part internal interface for recovery implementation. The flag is normally raised by XA-prepare at flushing to binlog and also at binlog recovery (will be done so it's fully implemented). Notable changes: sql/handler.cc - ha_prepare() a. is ensured to execute binlog_hton::prepare() as the last XA's branch for preparing; b. engine read-only is marked in the xid state to rollback and ER_XA_RDONLY *note* is generated. - conversely ha_rollback_trans() executes binlog hton::rollback() as first branch (the commit method was already equipped to do so) - ditto to the external completion of XA via ha_commit_or_rollback_by_xid(); the function is made a sort of recursive. It may be first be invoked on a top level to take on the binlog hton "completion" to be called from its stack once again now having is_xap_binlogged() false, so to carry out the engine commit. - xarecover_handlerton() now only simulates successful find of the user xid in binlog. sql/log.cc - binlog_commit,rollback() et al are simplified and cleaned up (like binlog_complete_by_xid() introduction). In particular binlog_{commit,rollback}() are rendered to retain just a single piece of XA footprint in either. The methods recognize naturally empty transaction caches at XA completion to proceed anyway into binlog-group-commit thickness; the binlog_commit's binlog_commit_flush_trx_cache() decides which type of transaction and which XA phase is being handled so a proper group event closure is computed. - MYSQL_BIN_LOG::trx_group_commit_with_engines() takes care to raise or drop XID::binlogged flag via xid_cache_update_xa_binlog_state(). - the new run_xa_complete_ordered() encapsulates XA specifics at execution of the engine ordered commit. It's defined with asserts due to MDEV-32455. It's not done as TC_LOG::member because of the scope of this work is limited. The new function mirrors the logic of the normal run_commit_ordered() in that it skips engine completion for those that lack hton::commit_ordered() in favor of doing that on the top level of ha_commit,rollback_trans(). sql/log_event_server.cc - use the transaction cache at XA-PREPARE handling (specifically when XA-END Query event is logged). sql/xa.cc - XID_cache_element extended with is-binlogged meaning flag and few rating functions added to use by binlogging and recovery (xid_cache_update_xa_binlog_state()); - trans_xa_commit,rollback() external action branches are converted into calls of a new largely common function. sql/xa.h - xid_cache_insert(XID *xid, bool is_binlogged) parameter list is extended for xa binlog recovery.
Affected binlog,main,rpl suites tests/results due to the empty XA's rollback optimization.
…_ordered .. .. otherwise it asserts expecting a non-zero value of ha_info::m_ht in the function's loop. However since it's a rollback operation such state is possible as a top-level resets the ha_info:s of transaction in a loop so some of them can get reset before binlog_rollback() is called. Such transaction that is "spawned" to binlog-and-rollback (because may contain unrollable side-effects) should not have visited TC_LOG::run_prepare_ordered() in the first place. And it did so 'cos of incorrect cache_mngr->using_xa value. This is fixed now.
MDEV-32830 base patch missed out supporting XA-COMPLETE via hton::"complete"_by_xid with multiple engines some of them are and some are not defined ordered_commit(). The base patch fails with an assert in a path that should have been unreachable in the first place. The assert path revealed a second commit-by-xid capable engine (Spider) was running commit_by_xid() as if it is ordered_commit()-type of while Spider does not define ordered_commit. For engines featuring the ordered-commit undefined and commit-by-xid defined commit_by_xid() is considered as a general so cannot be invoked out of binlog-group-commit (run_prepare_ordered()). It would be a full commit similarly to the normal transaction case be invoked after binlog-group-commit engine loop is done^\footnote{% also specified in p.5 low-level design points of MDEV-32830 description}. For that this patch - extends commit_or_rollback_xa_engine() with a hint for xarollback_handlerton() argument that enables to understand which phase of XA completion it's being run. - elaborates in xacommit_handlerton(). Specifically to fix this bug, not invoke a defined hton->complete_by_xid when the method is to run as ordered-"complete" by engine lacking the ordered_commit.
…lling back.. XA in XA_ROLLBACK_ONLY state. The assert was too strict having not anticipated to face a XA-PREPARE ending in rolling back. That's a correct behaviour of read-only disconnecting XA. Fixed to relax the assert.
A ha_rollback_trans() DBUG_ASSERT(thd->lex->sql_command != SQLCOM_XA_ROLLBACK || thd->transaction->xid_state.is_explicit_XA()) expected an XA sql_command match a XID state. However Spider engine can visit this function even when Spider is not the xa transaction participant (sic!) through XA ROLLBACK 'xid' from an external connection. The assert is refined, at least temporarily to satisfy MDEV-32830 testing, to let the function run an internal Spider's transaction.
CACHE INDEX t PARTITION after XA-PREPARE also this a line previously assert for another "exceptional" sql command: ... This is really a variety of MDEV-32455. The assert did not expect another use case of implicit rollback with XA transaction while the sql command should have been (I think) rejected with XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state The assert is refined and its another instance removed being redundant.
An observed assert in rpl_group_info::unmark_start_commit DBUG_ASSERT(!gco->next_gco || gco->next_gco->wait_count > e->count_committing_event_groups || allow_unmark_after_complete); was overreaction. The reason that execution got to that point was MDEV-33921 underfixed that issue for the slave parallel mode. That's done now. The former test is extended to hit the assert by the (buggy) BASE of this commit.
This commit carries changes to Innodb for the user XA's binlog-coordinated ordered commit. It's the part III of the series aimed at improving on parallel slave performance and providing crash-recovery. As the part I has unified binlog coordinated completion (commit and rollback) of the XA transactions with the normal ones a remained task is to optimize flush to disk call away from the XA completion execution in engine like it's done to the normal transaction. Namely innobase_{rollback,commit}_by_xid()^\footnote{Unlike the normal transaction case, XA-rollback also needs a similar addressing} now raise a flag blocking the flush before innobase_commit_low(). The flag is restored upon that, which of course what the normal transaction does in its execution path past the low-commit. Tests demonstrate the changes are effectual, e.g binlog_xa_prepared_disconnect.test " failed ", to require it to adapted to possible loss of the commit/rollback upon crash. Binlog recovery in such scenarios is going to be recovered in the part IV of MDEV-33168. The feasibility of successful recovery of XA-"COMPLETE" is based on a plain observation, that in order to decide whether to commit an XA or not it's sufficient to follow the normal transaction recovery rule: when xid is found in binlog the transaction gets completed (committed in the normal transaction case).
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This commit carries changes to Innodb for the user XA's binlog-coordinated
ordered commit. It's the part III of the series aimed at
improving on parallel slave performance and providing crash-recovery.
As the part I has unified binlog coordinated completion (commit and
rollback) of the XA transactions with the normal ones a remained
task is to optimize flush to disk call away from the XA completion execution
in engine like it's done to the normal transaction.
Namely innobase_{rollback,commit}_by_xid()^\footnote{Unlike the normal
transaction case, XA-rollback also needs a similar addressing} now
raise a flag blocking the flush before innobase_commit_low().
The flag is restored upon that, which of course what the normal
transaction does in its execution path past the low-commit.
Tests demonstrate the changes are effectual, e.g
binlog_xa_prepared_disconnect.test " failed ", to require it to
adapted to possible loss of the commit/rollback upon crash.
The actual binlog recovery in such scenarios is only going to be recovered in the part
IV of MDEV-33168.
Feasibility of successful recovery of
XA-"COMPLETE" is based on a plain observation, that in order to decide
whether to commit an XA or not it's sufficient to follow
the normal transaction recovery rule: when
xid
is found in binlog (that's MDEV-33168)the transaction gets completed (committed in the normal transaction case).