WIP fscache: blob gc feature for fscache with inuse and cull #518

yawqi · 2022-06-21T01:39:33Z

This patch is still a work in progress and hasn't been tested at all, so it very likely has bugs or even errors.
Does anyone have suggestions on how I can test it?

Add blob gc for fscache by using inuse and cull.
This implementation references the implementation of cachefilesd,
but since we use ondemand mode, it's slightly different.

The gc watermark for now is 95% used blocks or 95% used files;
in the future, we need to make it configurable.
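
To make the watermark rule concrete, here is a minimal sketch of the check, assuming the usage numbers come from something like statvfs(3). `FsUsage`, `GC_WATERMARK_PERCENT`, and `needs_gc` are hypothetical names used for illustration and are not taken from the patch.

```rust
/// Filesystem usage figures, e.g. as reported by statvfs(3).
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct FsUsage {
    pub total_blocks: u64,
    pub free_blocks: u64,
    pub total_files: u64,
    pub free_files: u64,
}

/// Assumed default, matching the 95% figure in the description.
pub const GC_WATERMARK_PERCENT: u64 = 95;

/// Integer percentage of `total` that is in use (rounded down).
fn used_percent(total: u64, free: u64) -> u64 {
    if total == 0 {
        return 0;
    }
    let used = total.saturating_sub(free);
    // Widen to u128 so the multiplication cannot overflow.
    ((used as u128 * 100) / total as u128) as u64
}

/// GC is needed when either blocks or files reach the watermark.
pub fn needs_gc(usage: &FsUsage, watermark_percent: u64) -> bool {
    used_percent(usage.total_blocks, usage.free_blocks) >= watermark_percent
        || used_percent(usage.total_files, usage.free_files) >= watermark_percent
}
```

Passing the watermark as a parameter keeps the door open for making it configurable later.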

#454

Signed-off-by: Qi Wang [email protected]

jiangliu · 2022-06-21T02:37:00Z

When reclaiming the data blob files, we should also reclaim the associated chunk map state file.
Actually, we should reclaim the chunk map state file before reclaiming the blob file.
Otherwise it may cause an inconsistent state when the blob is used again.
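
A rough sketch of that ordering, assuming the chunk map lives at `${work_dir}/${blob_id}.chunk_map` and that the blob itself is culled through some caller-supplied step. `chunk_map_path` and `reclaim_blob` are hypothetical helpers, and the sketch does not address races with concurrent users of the chunk map.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Assumed naming scheme: `${work_dir}/${blob_id}.chunk_map`.
pub fn chunk_map_path(work_dir: &Path, blob_id: &str) -> PathBuf {
    work_dir.join(format!("{}.chunk_map", blob_id))
}

/// Remove the chunk map state first, then cull the blob.
/// `cull_blob` stands in for whatever actually reclaims the blob file.
pub fn reclaim_blob<F>(work_dir: &Path, blob_id: &str, cull_blob: F) -> io::Result<()>
where
    F: FnOnce(&str) -> io::Result<()>,
{
    match fs::remove_file(chunk_map_path(work_dir, blob_id)) {
        Ok(()) => {}
        // A missing chunk map is fine: there is no stale state to worry about.
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    // Only now is it safe to drop the blob data itself.
    cull_blob(blob_id)
}
```

If the chunk map removal fails, the blob is left untouched, so the two never end up in the "map says ready, data is gone" state.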

yawqi · 2022-06-21T02:52:43Z

If we only reclaim unused blobs, is it necessary to reclaim the chunk map state? Does the chunk map state contain unused blobs?
The cull of cachefiles will only cull unused blobs.

/*
 * Cull an object if it's not in use
 * - called only by cache manager daemon
*/
int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, char *filename)
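
From user space, the daemon reaches this path by writing text commands to the cachefiles device. Below is a minimal sketch of building those commands, assuming the `inuse <name>` and `cull <name>` command forms described in the kernel's cachefiles documentation, and assuming names containing `/` are rejected. `CacheCmd` and `build_command` are hypothetical names, not part of this patch.

```rust
/// The two per-object commands used for GC (assumed command names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheCmd {
    /// Ask whether the object is in use.
    Inuse,
    /// Ask the kernel to cull the object if it is not in use.
    Cull,
}

/// Build the command string to write to the cachefiles device fd.
/// Returns None for names that cannot be a single directory entry.
pub fn build_command(cmd: CacheCmd, filename: &str) -> Option<String> {
    if filename.is_empty() || filename.contains('/') || filename.contains('\n') {
        return None;
    }
    let verb = match cmd {
        CacheCmd::Inuse => "inuse",
        CacheCmd::Cull => "cull",
    };
    Some(format!("{} {}", verb, filename))
}
```

As I understand it, the name is resolved relative to the daemon's current working directory, so the caller would need to change into the object's directory before writing the command.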

yqleng1987 · 2022-06-21T04:09:43Z

@yawqi, your pull request has been updated. A new test job has been submitted. Please wait patiently.

yqleng1987 · 2022-06-21T04:09:47Z

@yawqi, the test job has been submitted. Please wait patiently.

yqleng1987 · 2022-06-21T04:21:49Z

@yawqi , The CI test is completed, please check result:

Test Case		Test Result
merge-target-branch		✅SUCCESS
build-docker-image		✅SUCCESS
compile-nydus		✅SUCCESS
compile-ctr-remote		✅SUCCESS
compile-nydus-snapshotter		✅SUCCESS
start-nydus-snapshotter-config-containerd		✅SUCCESS
run-container-with-nydus-image		✅SUCCESS

Congratulations, your test job passed!

jiangliu · 2022-06-21T05:32:10Z

The chunk map is used to track chunk readiness for blob files. If a stale chunk map is left over when the blob is used again, the fscache driver will send READ requests as expected, but nydusd won't fetch data from the remote and write it to the new blob file, because the chunk map file says all chunks are ready.

changweige · 2022-06-21T06:27:17Z

Filesystem atime may never be updated, depending on the mount options. Will that affect your design philosophy?

yawqi · 2022-06-21T07:16:56Z

IMHO, it won't. I use atime as a rough estimate and don't depend on its accuracy. For example, an outdated image's unused blob must have the oldest atime, so we can delete it first. For other files, I think the atime can partly reflect whether a file is in use or when it was last used. And in most scenarios, mount won't disable atime (I guess...) : ).
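
As a sketch of the "oldest atime first" idea, assuming cull candidates are regular files in one directory: `scan_atimes` and `oldest_first` are hypothetical helpers, and the ordering is only a hint, since atime may be stale or frozen depending on mount options.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Collect (path, atime) for every regular file directly under `dir`.
pub fn scan_atimes(dir: &Path) -> io::Result<Vec<(PathBuf, SystemTime)>> {
    let mut out = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_file() {
            out.push((entry.path(), meta.accessed()?));
        }
    }
    Ok(out)
}

/// Order cull candidates so the least recently accessed file comes first.
pub fn oldest_first(mut candidates: Vec<(PathBuf, SystemTime)>) -> Vec<PathBuf> {
    candidates.sort_by_key(|(_, atime)| *atime);
    candidates.into_iter().map(|(path, _)| path).collect()
}
```

Because the kernel still refuses to cull an object that is in use, a wrong guess from atime costs a wasted attempt rather than data loss.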

yawqi · 2022-06-21T08:54:06Z

OK, I will fix it first.

yawqi · 2022-06-22T07:33:11Z

Sorry to bother you, but I wonder whether it is ok if I just delete the corresponding ${work_dir}/${blob_id}.chunk_map file? Thanks a lot!

yawqi · 2022-06-22T08:19:59Z

Actually, should I just delete ${blob_id}, ${blob_id}.chunk_map and ${blob_id}.blob.meta all together?

yawqi · 2022-06-22T09:12:29Z

I think I'm gonna add a gc_file(blob_id) method in trait BlobCacheMgr or struct FsCacheMgr to actually delete the chunk_map file.

jiangliu · 2022-06-22T15:13:22Z

blob.meta is a constant file; it should be small, and it affects container startup time. So it would be better to keep it.
For chunk_map, we need a safe way to delete it to avoid race conditions.
We need to add an interface to safely delete the chunk map file.

yawqi · 2022-06-23T07:12:14Z

I am thinking of adding another flag, delete, to the chunk map's header to indicate the status of the current chunk map file. When we need to cull the blob file, we first set the delete flag in the header. If the blob can be culled, we may delete the chunk map file afterward.

pub(crate) struct Header {
    /// PersistMap magic number
    pub magic: u32,
    pub version: u32,
    pub magic2: u32,
    pub all_ready: u32,
    pub delete: u32,
    pub reserved: [u8; HEADER_RESERVED_SIZE],
}

Or should I go with the nydus-snapshotter way?

jiangliu · 2022-06-26T17:18:01Z

I like this way :)
We could set the flag, then delete the chunkmap file from the fs.
All clients holding an active fd to the chunkmap file won't break and will eventually see the flag.
And the chunkmap can't be used by any new clients because it has been removed from the fs.
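
A minimal sketch of that sequence on a Unix filesystem, assuming the proposed `delete` field sits right after the four existing `u32` header fields (byte offset 16 in a C-style layout) and is stored little-endian. `DELETE_FLAG_OFFSET`, `mark_deleted_and_unlink`, and `is_deleted` are hypothetical names; the real chunk map code may access the header differently (for example through a memory mapping).

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::os::unix::fs::FileExt;
use std::path::Path;

/// Assumed offset of `delete`: after magic, version, magic2, all_ready (4 x u32).
pub const DELETE_FLAG_OFFSET: u64 = 16;

/// Set the delete flag in the header, then unlink the file.
/// Existing open fds keep working; new opens by path will fail.
pub fn mark_deleted_and_unlink(path: &Path) -> io::Result<()> {
    let file = OpenOptions::new().write(true).open(path)?;
    file.write_all_at(&1u32.to_le_bytes(), DELETE_FLAG_OFFSET)?;
    file.sync_all()?;
    fs::remove_file(path)
}

/// Check the flag through an already-open handle.
pub fn is_deleted(file: &File) -> io::Result<bool> {
    let mut buf = [0u8; 4];
    file.read_exact_at(&mut buf, DELETE_FLAG_OFFSET)?;
    Ok(u32::from_le_bytes(buf) != 0)
}
```

A client that opened the chunk map before the unlink can call `is_deleted` on its own handle and stop trusting the readiness bits once it returns true.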

yqleng1987 · 2022-06-29T10:26:01Z

@yawqi, your pull request has been updated. A new test job will be submitted. Please wait patiently.

yqleng1987 · 2022-06-29T10:26:03Z

@yawqi, your test job has passed, and there is no need to test again.

jiangliu · 2022-07-01T06:18:55Z

It seems something went wrong with CI. Could you please help retrigger it?

yawqi · 2022-07-01T07:18:44Z

I am currently occupied with school affairs. Is it OK if I close this PR temporarily? After these two months, when I have passed the mid-term of my graduation project, I will have more time to work on this. Many thanks!

jiangliu · 2022-07-02T02:07:16Z

Let's just keep it open :)

adamqqqplay · 2023-03-23T03:16:54Z

@yawqi Hi, how are you getting on with this?

yawqi · 2023-03-23T04:13:13Z

@adamqqqplay Sorry about that, I kind of forgot about this PR. Is this feature still needed? It seems there was a discussion about this feature last year, so I haven't worked on it since then. I apologize for the delay. Should I keep working on this feature or close it?

adamqqqplay · 2023-03-23T06:21:04Z

@yawqi I found a similar PR #894, but I'm not sure if it has the same feature. Could you please confirm it if you have time?
cc @changweige @kevinXYin

yawqi · 2023-03-23T08:08:22Z

It seems that this feature has been implemented in the nydus-snapshotter, maybe this PR should be closed?
containerd/nydus-snapshotter#262

hsiangkao · 2023-03-24T02:30:33Z

Why do we need to bother with whether to close it or not?
IOW, I have to say, a watermark from the kernel is needed anyway, so this PR will be needed eventually.

yawqi force-pushed the gc-backup branch from 5a411c4 to 42d51da on June 21, 2022 04:09

yqleng1987 added the anolis_testing label Jun 21, 2022

yqleng1987 added anolis_test_pass and removed anolis_testing labels Jun 21, 2022

yawqi force-pushed the gc-backup branch from 42d51da to 4e1c901 on June 29, 2022 08:17

Add deleted flag in the header of chunk map file

5f32d3b

Signed-off-by: Qi Wang <[email protected]>

yawqi force-pushed the gc-backup branch from 4e1c901 to c785f17 on June 29, 2022 10:25

yqleng1987 removed the anolis_test_pass label Jun 29, 2022

Add TODO for fscache blob gc

a6eed6e

Signed-off-by: Qi Wang <[email protected]>

yawqi force-pushed the gc-backup branch from c785f17 to a6eed6e on July 1, 2022 06:29

imeoer mentioned this pull request Oct 19, 2022

fscache support cache gc #803

Open
