Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hudi-9041]send act commit event when reuse current instant #12849

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

gejinzh
Copy link

@gejinzh gejinzh commented Feb 18, 2025

…stant that recovered from checkpoint

Change Logs

Send commit ack event when reuse instant that recovered from checkpoint

Impact

Send commit ack event when reuse instant that recovered from checkpoint to unblock write tasks

Risk level (write none, low medium or high below)

low

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

…ent instant that recovered from checkpoint
@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Feb 18, 2025
@danny0405
Copy link
Contributor

cc @cshuo to take care of it.

@cshuo
Copy link
Contributor

cshuo commented Feb 19, 2025

@gejinzh thks for contributing. Please format the commit message, generally, the title should be "[HUDI-XXX] xxx". Refer to https://hudi.apache.org/contribute/developer-setup/#contributing-code for detail.

@@ -449,6 +449,10 @@ private void handleBootstrapEvent(WriteMetadataEvent event) {
LOG.warn("Reuse current pending Instant {} with {} operationType, "
+ "ignoring empty bootstrap event.", this.instant, WriteOperationType.INSERT.value());
reset();
if (event.isRestored()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, event is definitely restored from state in this branch, as this.instant is a valid instant. So it's ok to just send ack directly without checking? Correctly me if I miss something.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The write task will only be unblocked when creating an instant or sending an ack event;
This code can only be executed when reusing instant.
When reusing instant, it indicates that there is an inflight instant equal to this.instant and no new instant will be created;
Therefore, when the event is restored from the state, we need an ack event to unblock the write tasks;
I don't know if I have answered your question,I'm not getting the question very well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gejinzh My point is that it's unnecessary to check the condition event.isRestored(), and we can simple sendCommitAckEvents(-1L); here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no data flush in append mode, instant will also be reused here; And the coordinator will send an act event at notifyCheckpointComplete; So without this condition, two act events will be sent;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gejinzh Could you please also add another test to cover the case you described?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So without this condition, two act events will be sent;

It should be okay because they are 2 different commits.

@cshuo
Copy link
Contributor

cshuo commented Feb 19, 2025

@gejinzh BTW, it's better to add a test for the case. Here is test patch, you can apply it by git am 0001-add-test.patch locally, and update the PR.
0001-add-test.patch

@github-actions github-actions bot added size:M PR with lines of changes in (100, 300] and removed size:S PR with lines of changes in (10, 100] labels Feb 19, 2025
@gejinzh gejinzh changed the title solve the problem #12820: send act commit event when reuse current in… [hudi-9041]solve the problem #12820;send act commit event when reuse current in… Feb 20, 2025
@cshuo
Copy link
Contributor

cshuo commented Feb 20, 2025

+1. cc @danny0405

@danny0405
Copy link
Contributor

@gejinzh Can you fix the compile error:

Error:  /home/runner/work/hudi/hudi/hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/MockSubtaskGateway.java:[44,30] cannot find symbol

@gejinzh gejinzh changed the title [hudi-9041]solve the problem #12820;send act commit event when reuse current in… [hudi-9041]send act commit event when reuse current instant Feb 23, 2025
@hudi-bot
Copy link

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:M PR with lines of changes in (100, 300]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants