Use threadpool for async scs #5216

Open
wants to merge 2 commits into
base: main
from

Conversation

morgando
Contributor

No description provided.

@roborivers roborivers left a comment

Coding style check: Error. ⚠.
Smoke testing: Success ✓.
Cbuild submission: Success ✓.
Regression testing: 6/610 tests failed ⚠.

The first 10 failing tests are:
insert_lots_large_tran_generated
phys_rep_tiered_nosource_generated
sc_downgrade
unionpar_maxqueue
guid

@morgando morgando force-pushed the use_tdpool_for_async_sc branch from 42811d2 to c7fdb9a Compare June 11, 2025 17:51
arg = NULL;
rc = thdpool_enqueue(gbl_sc_thdpool, (thdpool_work_fn) do_schema_change_locked_thdpool_wrapper, s, 0, NULL, THDPOOL_FORCE_QUEUE);
} else {
rc = thdpool_enqueue(gbl_sc_thdpool, (thdpool_work_fn) do_schema_change_tran_thd_thdpool_wrapper, arg, 0, NULL, THDPOOL_FORCE_QUEUE);
Contributor Author

@morgando morgando Jun 12, 2025

The current code waits here for s->started after spawning a thread, to ensure the SC has actually begun before proceeding. Now that work may sit in the pool's queue, waiting here for the SC to start could block for a long time. That isn't acceptable because it would block recovery.

The SC still has to have started before this code runs, so I added the waiting block there. That wait is acceptable because it happens in another thread, which doesn't block the master from coming up.

I'm not sure this is the only reason the original waiting code was here, so I may be missing something important.
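
For readers unfamiliar with the pattern, the "wait until the SC has started" handshake described above can be sketched with a mutex, a condition variable, and a flag. This is a minimal illustration, not the PR's code: `struct sc_state`, `sc_state_init`, `sc_mark_started`, `sc_wait_started`, and `sc_worker` are hypothetical names, and plain pthread calls stand in for the project's capitalized wrappers. Only the field names `mtxStart`, `condStart`, and `started` are borrowed from the discussion.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the schema-change state (not the real struct). */
struct sc_state {
    pthread_mutex_t mtxStart;
    pthread_cond_t condStart;
    bool started;
};

void sc_state_init(struct sc_state *sc)
{
    pthread_mutex_init(&sc->mtxStart, NULL);
    pthread_cond_init(&sc->condStart, NULL);
    sc->started = false;
}

/* Worker side: called when the queued work item actually begins running. */
void sc_mark_started(struct sc_state *sc)
{
    pthread_mutex_lock(&sc->mtxStart);
    sc->started = true;
    pthread_cond_broadcast(&sc->condStart);
    pthread_mutex_unlock(&sc->mtxStart);
}

/* Waiter side: blocks until the worker has set the flag. The loop guards
 * against spurious wakeups and against the flag being set before we wait. */
void sc_wait_started(struct sc_state *sc)
{
    pthread_mutex_lock(&sc->mtxStart);
    while (!sc->started)
        pthread_cond_wait(&sc->condStart, &sc->mtxStart);
    pthread_mutex_unlock(&sc->mtxStart);
}

/* Thread entry point standing in for a pooled worker picking up the item. */
void *sc_worker(void *arg)
{
    sc_mark_started((struct sc_state *)arg);
    return NULL;
}
```

The point of the sketch is that the cost of `sc_wait_started` depends entirely on which thread calls it: if the work item can sit in a queue for a long time, the caller sits in the wait loop for just as long, which is why the placement of the wait matters in this review.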

@morgando morgando force-pushed the use_tdpool_for_async_sc branch from c7fdb9a to 90caa1a Compare June 12, 2025 21:04
@morgando morgando force-pushed the use_tdpool_for_async_sc branch from 90caa1a to eb301fe Compare June 12, 2025 21:06
Pthread_cond_wait(&sc->condStart, &sc->mtxStart);
}
Pthread_mutex_unlock(&sc->mtxStart);
}
Contributor Author

This wait is acceptable: it doesn't block the master from coming up since it happens in another thread.

Contributor

We are the master here; this code runs in toblock.c.

@morgando morgando marked this pull request as ready for review June 17, 2025 20:13
@roborivers roborivers left a comment

Coding style check: Error. ⚠.
Smoke testing: Success ✓.
Cbuild submission: Success ✓.
Regression testing: 9/613 tests failed ⚠.

The first 10 failing tests are:
remotecreate
blkseq_snapiso_generated
pmux_sqlite_file_generated
unionpar_maxqueue
phys_rep_tiered
sc_lotsoftables_logicalsc_generated
sc_lotsoftables
selectv_rcode_serialretry_generated

Signed-off-by: mdouglas47 <[email protected]>
@roborivers roborivers left a comment

Coding style check: Error. ⚠.
Smoke testing: Success ✓.
Cbuild submission: Success ✓.
Regression testing: 204/613 tests failed ⚠.

The first 10 failing tests are:
verify_upgrade [setup failure]
tmptbl_leak_zeropool_generated [setup failure]
strict_dbl_quotes [setup failure]
sc_parallel [setup failure]
init_sc_race [setup failure]
writes_remsql_rte_connect_generated [setup failure]
insert_lots_large_tran_generated
sc_transactional_rowlocks_generated
reco-ddlk-sql
analyze

@roborivers roborivers left a comment

Coding style check: Error. ⚠.
Smoke testing: Success ✓.
Cbuild submission: Success ✓.
Regression testing: 1/613 tests failed ⚠.

The first 10 failing tests are:
remotecreate

@roborivers roborivers left a comment

Coding style check: Error. ⚠.
Smoke testing: Success ✓.
Cbuild submission: Success ✓.
Regression testing: 5/613 tests failed ⚠.

The first 10 failing tests are:
sc_inserts_deletes_logicalsc_generated
timepart_retention1
sc_repeated_updates
fdb_compat_rte_connect_generated
sc_lotsoftables_logicalsc_generated

@roborivers roborivers left a comment

Coding style check: Error. ⚠.
Smoke testing: Success ✓.
Cbuild submission: Success ✓.
Regression testing: 7/613 tests failed ⚠.

The first 10 failing tests are:
unionpar_maxqueue
sc_transactional_rowlocks_generated
sc_lotsoftables_logicalsc_generated
sc_lotsoftables
fdb_compat_rte_connect_generated
rebuild_table_options
guid

3 participants