OrphanedObjects: detect object vs. multipart vs. unexpected file #2

Open: tserong wants to merge 2 commits into main from wip-enhance-orphaned-objects
@tserong tserong commented Aug 9, 2023

This expands OrphanedObjectsCheck::check() to distinguish different types of file, based on what their filenames look like. The rules are:

  • If the filename is an integer number (i.e. matches "^[0-9]+$") then it's assumed to be a versioned object, and the concatenation of its parent directories provides the object UUID.
  • If the filename looks like most of a UUID, but with a dash and an integer stuck on the end, it's part of a multipart upload.
  • If the filename doesn't match one of the above, it shouldn't be there at all.
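
A minimal sketch of these three rules, for illustration only. `FileKind` and `classify_filename` are hypothetical names, not taken from the PR, and the multipart expression is an assumption, since the description only says "most of a UUID, but with a dash and an integer stuck on the end". Only the `^[0-9]+$` pattern comes from the description itself.

```cpp
#include <regex>
#include <string>

// Hypothetical names; the real OrphanedObjectsCheck may be structured differently.
enum class FileKind { VersionedObject, MultipartPart, Unexpected };

// Classify a file by its name alone, following the three rules above.
inline FileKind classify_filename(const std::string& name) {
  // Rule 1: all digits, as in the description's "^[0-9]+$".
  static const std::regex versioned("^[0-9]+$");
  // Rule 2 (assumed pattern): hex digits and dashes, then "-<integer>".
  // The exact expression used by the PR is not shown in the description.
  static const std::regex multipart("^[0-9a-fA-F-]+-[0-9]+$");

  if (std::regex_match(name, versioned)) return FileKind::VersionedObject;
  if (std::regex_match(name, multipart)) return FileKind::MultipartPart;
  // Rule 3: anything else shouldn't be there at all.
  return FileKind::Unexpected;
}
```

The versioned-object test runs first, so a purely numeric name is never treated as a multipart part.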

The current implementation doesn't explicitly check that the parent directories are sane for the given file type. However, if (for example) an integer-looking filename (assumed to be a versioned object) appears at the wrong level in the directory hierarchy, it will still be reported as an orphaned object, because the UUID obtained from path concatenation won't match anything in the database.
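
To illustrate why a misplaced file still gets flagged, here is a rough sketch of the path-concatenation idea. `uuid_from_parents` and `is_orphan` are hypothetical helpers, a `std::set` stands in for the database lookup, and the sketch assumes the UUID is the plain concatenation of the directory names between the store root and the file. The actual directory layout and database query are not shown in this description.

```cpp
#include <filesystem>
#include <set>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: join the directory names between `root` and `file`.
// Assumes `file` lies under `root` and that no separator is added.
inline std::string uuid_from_parents(const fs::path& root, const fs::path& file) {
  std::string uuid;
  const fs::path rel = file.lexically_relative(root).parent_path();
  for (const auto& part : rel) {
    uuid += part.string();
  }
  return uuid;
}

// Hypothetical check: `known_uuids` stands in for the real database lookup.
inline bool is_orphan(const std::set<std::string>& known_uuids,
                      const fs::path& root, const fs::path& file) {
  return known_uuids.count(uuid_from_parents(root, file)) == 0;
}
```

Under these assumptions, a file one level too high, such as `store/ab/7`, yields the truncated string `ab`, which matches no known UUID, so it is reported as orphaned without any explicit check of the directory depth.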

Signed-off-by: Tim Serong <[email protected]>
This commit also adds an UnexpectedFileFix class.

Signed-off-by: Tim Serong <[email protected]>
@tserong tserong force-pushed the wip-enhance-orphaned-objects branch from 55159fd to b6715e4 Compare August 10, 2023 09:29