feat: purge ephemeral storage #1729
base: master
I will start by saying that maybe I don't have the high-level picture in mind, so maybe my way of viewing things is completely wrong. Would be happy to have a call and discuss this. Will provide feedback anyway so we can have a more productive talk.

I'm not sure we want a background task that just goes and deletes ephemeral content, because we probably want to migrate bodies and receipts into permanent storage. I also don't think that this should be driven by the system clock. Instead, we should do it when new information is available (e.g. new historical summaries or a new finalized block). We should probably use

With that in mind, I see this store as a simple structure that allows us to easily access/modify content in the db, but it shouldn't drive this logic. For example, we can expose the following function and call it from the outside:

    /// Deletes and returns canonical content items from a slot range.
    ///
    /// This function will also delete all content items that are associated with slots before
    /// `start_slot` or that are within the range but are not canonical.
    fn delete_content(
        &self,
        start_slot: u64,
        end_slot: u64,
        non_canonical_slots: &[u64],
    ) -> Result<Vec<(TContentKey, RawContentValue)>, ContentStoreError>;

This is just an example. Depending on what information we have available, this can/should be modified to fit the use case (maybe even split into multiple functions). But if we don't have a clear picture of how it can be used, we can just go with this and modify it as we go, or we can skip this part until we have a better understanding of what is needed.
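To make the proposed semantics concrete, here is a minimal in-memory sketch of `delete_content`. It is not the `EphemeralV1Store` implementation: the `InMemoryEphemeralStore` type, its `insert` and `len` helpers, the `String`/`Vec<u8>` stand-ins for `TContentKey`/`RawContentValue`, the use of `&mut self` without a `Result`, and the choice to treat `end_slot` as inclusive are all assumptions made for illustration.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical in-memory stand-in for the ephemeral store: slot -> content items.
#[derive(Default)]
pub struct InMemoryEphemeralStore {
    by_slot: BTreeMap<u64, Vec<(String, Vec<u8>)>>,
}

impl InMemoryEphemeralStore {
    /// Stores one content item under the slot it belongs to.
    pub fn insert(&mut self, slot: u64, key: &str, value: &[u8]) {
        self.by_slot
            .entry(slot)
            .or_default()
            .push((key.to_string(), value.to_vec()));
    }

    /// Total number of content items still held by the store.
    pub fn len(&self) -> usize {
        self.by_slot.values().map(Vec::len).sum()
    }

    /// Deletes everything at or before `end_slot` (assumed inclusive) and returns
    /// only the items in `start_slot..=end_slot` whose slot is canonical.
    pub fn delete_content(
        &mut self,
        start_slot: u64,
        end_slot: u64,
        non_canonical_slots: &[u64],
    ) -> Vec<(String, Vec<u8>)> {
        let non_canonical: HashSet<u64> = non_canonical_slots.iter().copied().collect();

        // `split_off(&k)` returns the entries with key >= k, i.e. the ones we keep.
        let kept = match end_slot.checked_add(1) {
            Some(next_slot) => self.by_slot.split_off(&next_slot),
            None => BTreeMap::new(),
        };
        // Whatever was left in `by_slot` is removed from the store.
        let removed = std::mem::replace(&mut self.by_slot, kept);

        removed
            .into_iter()
            .filter(|(slot, _)| *slot >= start_slot && !non_canonical.contains(slot))
            .flat_map(|(_, items)| items)
            .collect()
    }
}
```

The caller gets back only what is worth migrating, while older and non-canonical items are dropped in the same pass, so the store itself never needs to know about permanent storage.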
My idea is to migrate bodies and receipts in the same task (or have a different task), but I'm adding all features incrementally, and this is just the start (only implementing purging in this PR). I will think about a system clock vs using an event, and we will discuss it (it seems like a good idea).
Then this store would have to somehow know about the permanent store as well, and I don't think it should. In that case, the task should run at a higher level (most likely the subnet storage object), extract/delete data from the ephemeral store and write it into (or drop it instead of writing it to) the permanent store.

Also, I think that in the long run, we want headers to expire at a different point than bodies and receipts. More precisely, headers should be ephemeral for 8192 slots, while bodies and receipts should be ephemeral until they are finalized (64-96 slots). It's probably fine to handle both of these at the 8192-slot boundary for now, but it's something we should keep in mind for the long run.
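As a rough illustration of the two expiry rules described above, the sketch below decides whether an item should leave the ephemeral store. The `ContentKind` enum, the `is_expired` function, and the `head_slot`/`finalized_slot` parameters are hypothetical names; only the 8192-slot window for headers and the "until finalized" rule for bodies and receipts come from the discussion.

```rust
/// Hypothetical classification of ephemeral content.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContentKind {
    Header,
    Body,
    Receipts,
}

/// Headers stay ephemeral for this many slots (figure taken from the discussion).
pub const HEADER_EPHEMERAL_SLOTS: u64 = 8192;

/// Returns true when an item should leave the ephemeral store
/// (to be migrated to permanent storage or dropped).
pub fn is_expired(
    kind: ContentKind,
    content_slot: u64,
    head_slot: u64,
    finalized_slot: u64,
) -> bool {
    match kind {
        // Headers expire once they are at least 8192 slots behind the head.
        ContentKind::Header => {
            head_slot.saturating_sub(content_slot) >= HEADER_EPHEMERAL_SLOTS
        }
        // Bodies and receipts expire as soon as their slot is finalized.
        ContentKind::Body | ContentKind::Receipts => content_slot <= finalized_slot,
    }
}
```

Handling both rules at the 8192-slot boundary for now would amount to using the header branch for every kind.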
We can have something like
It could work. But we would have to be careful about how we do it. For example, if there is a mutex, other subnetworks shouldn't be blocked unless needed, or blocked for a potentially long time.

I'm still not sure if a background task using the system clock is the best way to do it. How would it know which content is finalized/valid and which is not? There might be a way for this to work, but an event-driven approach makes more sense to me.
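For comparison with a clock-driven loop, here is a minimal sketch of the event-driven shape using only the standard library. The `BeaconEvent` enum, the `run_purger` function, and the `BTreeMap<u64, Vec<u8>>` standing in for the store are assumptions for illustration; the real task would more likely live in the subnet storage object and react to historical summaries updates as well.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::Receiver;

/// Hypothetical events that carry new chain information to the purger.
pub enum BeaconEvent {
    /// A new finalized slot is known.
    Finalized { slot: u64 },
}

/// Consumes events until every sender is dropped, purging on each one.
/// The task owns the store here, so no lock is held while it waits.
pub fn run_purger(
    events: Receiver<BeaconEvent>,
    mut store: BTreeMap<u64, Vec<u8>>,
) -> BTreeMap<u64, Vec<u8>> {
    // Iterating a `Receiver` blocks until an event arrives: no timer, no polling.
    for event in events {
        match event {
            BeaconEvent::Finalized { slot } => {
                // Keep only content that is newer than the finalized slot.
                store.retain(|content_slot, _| *content_slot > slot);
            }
        }
    }
    store
}
```

Because the purge only runs when an event arrives, the task always knows which slot boundary it is acting on, which addresses the question of how a timer would know what is finalized.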
Added an issue #1731 to discuss with the broader team the storage architecture.
What was wrong?
This PR adds intelligent content purging to the EphemeralV1Store based on historical summaries updates in the beacon chain. It ensures that ephemeral content (headers, bodies, receipts) older than the last historical summary event is automatically purged.
How was it fixed?
Added tests for:
To-Do