Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a OnManualFlushScheduled callback in event listener #12631

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jowlyzhang
Copy link
Contributor

@jowlyzhang jowlyzhang commented May 9, 2024

As titled. Also added the newest user-defined timestamp into the MemTableInfo. This can be a useful info in the callback.

Added some unit tests as examples for how users can use two separate approaches to allow manual flush / manual compactions to go through when the user-defined timestamps in memtable only feature is enabled. One approach relies on selectively increase cutoff timestamp in OnMemtableSeal callback when it's initiated by a manual flush. Another approach is to increase cutoff timestamp in OnManualFlushScheduled callback. The caveats of the approaches are also documented in the unit test.

@jowlyzhang jowlyzhang marked this pull request as draft May 9, 2024 16:53
@jowlyzhang jowlyzhang changed the title Add newest user-defined timestamp to MemTableInfo Add a OnManualFlushScheduled callback in event listener May 9, 2024
@jowlyzhang jowlyzhang marked this pull request as ready for review May 9, 2024 20:23
@jowlyzhang jowlyzhang requested a review from ajkr May 9, 2024 20:23
@@ -2426,6 +2443,8 @@ Status DBImpl::FlushMemTable(ColumnFamilyData* cfd,
}
}
}

NotifyOnManualFlushScheduled({cfd}, flush_reason);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FlushRequest generated by FlushMemTable() specifies std::numeric_limits<uint64_t>::max() as the max memtable ID to persist. Is it possible the background flush includes memtables newer than the one in memtable_ids_to_wait? That would mean increasing the cutoff TS here would not guarantee the flush can happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. Yes, if another memtable is sealed in between the IncreaseFullHistoryTsLow here and the time background flush checks for whether timestamps expired. It could mean that the cutoff TS here is not sufficient to guarantee the flush can happen.

If that sealed memtable is caused by another manual flush type of event, this call back should have also be invoked to increase the cutoff TS to a higher point. If that sealed memtable is caused by regular writes filling up a memtable, this would be an issue, like when the write rate is very high.

I think updating the memtable id in FlushRequest to be GetLatestMemTableID instead of std::numeric_limits<uint64_t>::max() can help make sure the flush can still proceed in this case. Do you have any concerns for making this change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants