
When testing Master failover and reattaching failed master getting ReadTopologyInstance error #284

Open
leeparayno opened this issue Nov 3, 2016 · 5 comments


leeparayno commented Nov 3, 2016

I have 3 Percona MySQL 5.6.29-76.2-log instances in separate VirtualBox VMs running CentOS 7.0.

The prior replication configuration was:

mysql56b-2
+ mysql56b-1
+ mysql56b-3

Upon blocking port 3306 on mysql56b-2, failover initiated and mysql56b-1 was made a slave of mysql56b-3. mysql56b-2 no longer appeared in the "mysql56b-2 cluster" topology shown in the Orchestrator UI.

I was attempting to let the old master (mysql56b-2) rejoin the cluster.

I set gtid_purged to the values that mysql56b-2 was showing in gtid_executed.
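For reference, that step usually looks like the following on MySQL 5.6 (a sketch, not taken verbatim from my session; on 5.6, gtid_purged can only be set while gtid_executed is empty, hence the RESET MASTER first):

```sql
-- Sketch: reset binary logs so gtid_executed is empty, then seed
-- gtid_purged with the set the old master had already executed.
RESET MASTER;
SET GLOBAL gtid_purged = '3d83956c-e8a3-11e5-ba83-080027da8259:1-5,
743902dd-97cf-11e6-b0c9-080027a97f61:1-9,
d1da7519-fdb9-11e5-8407-08002720ea52:1-11';
```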

After repointing to mysql56b-3 with CHANGE MASTER, replication appeared, for some reason, to be re-running transactions that had already been executed.
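The repointing step was roughly the following (a sketch; the exact options I used are not reproduced here, but with GTID replication MASTER_AUTO_POSITION lets the slave request whatever it is missing):

```sql
-- Sketch: repoint the old master at the new master using GTID
-- auto-positioning rather than explicit binlog file/position.
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST = 'mysql56b-3',
  MASTER_PORT = 3306,
  MASTER_AUTO_POSITION = 1;
START SLAVE;
```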

For all the UUID/GTID combinations in gtid_executed, I created empty transactions, setting gtid_next up to the maximum transaction value that each UUID showed as already executed on that slave. This should essentially leave it ready to connect to the new master, retrieve any new transactions as necessary, and catch up to the other replicas and the new master.
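Per GTID, the empty-transaction injection looks like this (a sketch; the specific GTID here is just one value from the set above). Note that forgetting the final SET back to AUTOMATIC leaves the session in exactly the state the Error 1837 below complains about:

```sql
-- Sketch: mark one GTID as executed by committing an empty transaction
-- under it, then return the session to automatic GTID assignment.
SET gtid_next = 'd1da7519-fdb9-11e5-8407-08002720ea52:11';
BEGIN;
COMMIT;
SET gtid_next = 'AUTOMATIC';
```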

However, Orchestrator was stuck with this error:

ERROR ReadTopologyInstance(mysql56b-2:3306) show global status like 'Uptime': Error 1837: When @@SESSION.GTID_NEXT is set to a GTID, you must explicitly set it to a different value after a COMMIT or ROLLBACK. Please check GTID_NEXT variable manual page for detailed explanation. Current @@SESSION.GTID_NEXT is 'd1da7519-fdb9-11e5-8407-08002720ea52:111'.

mysql56b-2 was showing gtid_next as 'AUTOMATIC' and gtid_purged as the current set of transactions:

3d83956c-e8a3-11e5-ba83-080027da8259:1-5,
743902dd-97cf-11e6-b0c9-080027a97f61:1-9,
d1da7519-fdb9-11e5-8407-08002720ea52:1-11

Note: I tried a few failovers to different nodes and ran transactions, which is why there are received/executed transactions from each of the 3 nodes.

mysql56b-2 was showing now issues with replication in "show slave status" and all appeared to be in sync after reattaching to mysql56b-3.

I couldn't get Orchestrator to refresh the current state until I "forgot" mysql56b-2 and restarted Orchestrator to let it be rediscovered.
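The forget/rediscover step was done through the UI; from the command line it would look roughly like this (assuming the standard orchestrator CLI; the exact invocation below is my sketch, not what I ran):

```shell
# Sketch: drop the stale instance record, then ask orchestrator
# to re-read the instance from scratch.
orchestrator -c forget -i mysql56b-2:3306
orchestrator -c discover -i mysql56b-2:3306
```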

@shlomi-noach
Contributor

I'm not sure I understand if this is an orchestrator problem or a GTID problem. You say orchestrator said:

ERROR ReadTopologyInstance(mysql56b-2:3306) show global status like 'Uptime': Error 1837: When @@SESSION.GTID_NEXT is set to a GTID, you must explicitly set it to a different value after a COMMIT or ROLLBACK. Please check GTID_NEXT variable manual page for detailed explanation. Current @@SESSION.GTID_NEXT is 'd1da7519-fdb9-11e5-8407-08002720ea52:111'.

and then that message only went away when you forgot and rediscovered the host? Or were there further steps in between?

s/mysql56b-2 was showing now issues/mysql56b-2 was showing no issues/g -- correct?

@leeparayno
Author

After I reassigned the old master back as a slave of the new master, I originally got this error in SHOW SLAVE STATUS, but fixed the replication issue by creating empty transactions for all the transactions that had already been executed.

So at the time I was still seeing the ReadTopologyInstance errors, the show slave status on mysql56b-2 was no longer showing any issues.


Correct, there were no more issues with replication.

This makes it look like Orchestrator was caching a previous error and maintaining that state.


@shlomi-noach
Contributor

Thank you. I've never witnessed this kind of behavior before. I will do some digging.

@shlomi-noach
Contributor

Looking slightly more into this, a couple more questions:

  • I assume you saw this error on the orchestrator log, correct? And likely this also showed at the GUI's instance dialog?
  • Other than this error showing up, did orchestrator fail to read the instance? To show the topology?

@leeparayno
Author

Yes it was in the orchestrator log.

In the GUI, it was reporting the old replication error on the instance. It looked like orchestrator was failing to read the instance's current state: the topology was updated to show the new position as a slave of the new master, but the instance was not shown as replicating correctly.

