Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: revert PR #41897 to allow multiple delete markers to be created #54957

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jzhu116-bloomberg
Copy link
Contributor

@jzhu116-bloomberg jzhu116-bloomberg commented Dec 19, 2023

Please see the following links for more discussion and info:
https://tracker.ceph.com/issues/63799
https://docs.aws.amazon.com/AmazonS3/latest/userguide/ManagingDelMarkers.html

The PR has been (partially) reverted: #41897

Tests have been done on the following operations/processes that involve delete markers, and all work well.

  • lc
  • multisite replication
  • delete object with delete marker(s)
  • delete delete markers

Fixes: https://tracker.ceph.com/issues/63799
Signed-off-by: Jane Zhu [email protected]

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@jzhu116-bloomberg jzhu116-bloomberg requested a review from a team as a code owner December 19, 2023 03:48
@github-actions github-actions bot added the rgw label Dec 19, 2023
@smanjara smanjara self-requested a review January 2, 2024 15:29
@smanjara
Copy link
Contributor

smanjara commented Jan 2, 2024

thanks for testing the fix @jzhu116-bloomberg

what is your observation while deleting delete markers via lc?
do we create another delete marker or remove the expired object delete marker?
from aws documentation, lc expiration should behave in the following two ways:

-  Amazon S3 doesn't take any action if there are one or more object versions and the delete marker is the current version.

-  If the current object version is the only object version and it is also a delete marker (also referred as an expired object delete marker, where all object versions are deleted and you only have a delete marker remaining), Amazon S3 removes the expired object delete marker. You can also use the expiration action to direct Amazon S3 to remove any expired object delete markers.

see https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-expire-general-considerations.html

so, while we allow DeleteObject request to create more than one delete markers (this PR takes care of that), we should also ensure lifecycle expiration to behave as described above w.r.t to deleting delete markers.

we could discuss this further in our weekly meeting.

@jzhu116-bloomberg
Copy link
Contributor Author

thanks for testing the fix @jzhu116-bloomberg

what is your observation while deleting delete markers via lc? do we create another delete marker or remove the expired object delete marker? from aws documentation, lc expiration should behave in the following two ways:

-  Amazon S3 doesn't take any action if there are one or more object versions and the delete marker is the current version.

-  If the current object version is the only object version and it is also a delete marker (also referred as an expired object delete marker, where all object versions are deleted and you only have a delete marker remaining), Amazon S3 removes the expired object delete marker. You can also use the expiration action to direct Amazon S3 to remove any expired object delete markers.

According to my tests, when the lc is deleting delete-markers, it never creates another delete-marker. However it doesn't exactly follow what the aws doc says either. Here is my observation:

-  Amazon S3 doesn't take any action if there are one or more object versions and the delete marker is the current version.

The above works as it's described in the doc.

-  If the current object version is the only object version and it is also a delete marker (also referred as an expired object delete marker, where all object versions are deleted and you only have a delete marker remaining), Amazon S3 removes the expired object delete marker. You can also use the expiration action to direct Amazon S3 to remove any expired object delete markers.

For the above, we have 2 issues:

  1. In the Expiration action, according to the aws doc,
    When you specify the Days tag, Amazon S3 automatically performs ExpiredObjectDeleteMarker cleanup when the delete markers are old enough to satisfy the age criteria.

This doesn't happen in my testing. The expired delete-marker is NOT being cleaned-up. ExpiredObjectDeleteMarker works fine though.

Please note. this has nothing to do with this PR since it's reproducible in a main branch without this PR.

My question: should we create an issue tracker and fix this?

  1. When there are more then one expired delete-markers (no current/non-current object versions exist), ExpiredObjectDeleteMarker only delete one of them in one lc process. Running multiple rounds of lc processes can eventually delete all of them.

My question: I guess we can treat this as a minor issue? Should we fix this?

@jzhu116-bloomberg
Copy link
Contributor Author

For the above, we have 2 issues:

  1. In the Expiration action, according to the aws doc,
    When you specify the Days tag, Amazon S3 automatically performs ExpiredObjectDeleteMarker cleanup when the delete markers are old enough to satisfy the age criteria.

This doesn't happen in my testing. The expired delete-marker is NOT being cleaned-up. ExpiredObjectDeleteMarker works fine though.

An issue tracker has been opened for this issue: https://tracker.ceph.com/issues/63995

  1. When there are more then one expired delete-markers (no current/non-current object versions exist), ExpiredObjectDeleteMarker only delete one of them in one lc process. Running multiple rounds of lc processes can eventually delete all of them.

An issue tracker has been opened for this: https://tracker.ceph.com/issues/63997

Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Mar 11, 2024
@jzhu116-bloomberg
Copy link
Contributor Author

Hi @smanjara , I wonder how we want to go ahead with this? As far as I understood, we are waiting for some test cases to be created right?

Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Sep 17, 2024
@cbodley cbodley removed the stale label Sep 17, 2024
@adamemerson
Copy link
Contributor

@smanjara Hi, there. Sorry to bug you, I'm wondering if we can move this forward. Are we needing a specific s3test to be written as part of the usual gamut for this to merge in?

@smanjara
Copy link
Contributor

@smanjara Hi, there. Sorry to bug you, I'm wondering if we can move this forward. Are we needing a specific s3test to be written as part of the usual gamut for this to merge in?

hi @adamemerson currently there is no test coverage in multisite suite for delete markers. I think that is what is missing here.

@jzhu116-bloomberg
Copy link
Contributor Author

@smanjara Hi, there. Sorry to bug you, I'm wondering if we can move this forward. Are we needing a specific s3test to be written as part of the usual gamut for this to merge in?

hi @adamemerson currently there is no test coverage in multisite suite for delete markers. I think that is what is missing here.

@smanjara I remember you mentioned that you would add something for this when you would be fixing some multisite tests issues? I wonder how is that going along? Do you want me to take on adding multisite test cases for delete markers?

@smanjara
Copy link
Contributor

@smanjara Hi, there. Sorry to bug you, I'm wondering if we can move this forward. Are we needing a specific s3test to be written as part of the usual gamut for this to merge in?

hi @adamemerson currently there is no test coverage in multisite suite for delete markers. I think that is what is missing here.

@smanjara I remember you mentioned that you would add something for this when you would be fixing some multisite tests issues? I wonder how is that going along? Do you want me to take on adding multisite test cases for delete markers?

hi @jzhu116-bloomberg I haven't had a chance to do it. I know that you have mentioned that it all works well for you with the changes in this PR. it would be really helpful if you could add the test cases. thanks!

@jzhu116-bloomberg
Copy link
Contributor Author

@smanjara I remember you mentioned that you would add something for this when you would be fixing some multisite tests issues? I wonder how is that going along? Do you want me to take on adding multisite test cases for delete markers?

hi @jzhu116-bloomberg I haven't had a chance to do it. I know that you have mentioned that it all works well for you with the changes in this PR. it would be really helpful if you could add the test cases. thanks!

Sounds good! then I'll add some test cases for this.

@cbodley
Copy link
Contributor

cbodley commented Nov 20, 2024

do we have s3tests coverage for the lifecycle behavior? its generation of stacking delete markers is what motivated the original change. if we revert, we really need to prove that part was fixed

@jzhu116-bloomberg
Copy link
Contributor Author

do we have s3tests coverage for the lifecycle behavior? its generation of stacking delete markers is what motivated the original change. if we revert, we really need to prove that part was fixed

This is the s3tests coverage for the lifecycle issue: ceph/s3-tests#286.
I'll make sure this test pass with the code reverted.

@cbodley
Copy link
Contributor

cbodley commented Jan 17, 2025

do we have s3tests coverage for the lifecycle behavior? its generation of stacking delete markers is what motivated the original change. if we revert, we really need to prove that part was fixed

This is the s3tests coverage for the lifecycle issue: ceph/s3-tests#286. I'll make sure this test pass with the code reverted.

thanks for following up! for the lifecycle testing, we'll need to make sure that lifecycle processing has a chance to run multiple times in order to verify that it doesn't stack the delete markers. we'll also want a non-lifecycle test case to verify that multiple delete_object() calls do stack the delete markers

@jzhu116-bloomberg
Copy link
Contributor Author

Test cases:

  • to make sure that lifecycle processing runs multiple times in order to verify that it doesn't stack the delete markers
    s3 test case: test_lifecycle_expiration_versioning_enabled
    This line indicates that the lifecycle processing runs multiple times.

  • add a non-lifecycle s3 test case for stacked delete markers
    PR: add test case for stacked delete-markers s3-tests#614

  • a new multisite delete marker test case has been added to this PR, and 2 existing multisite delete marker related test cases have been updated to test stacked delete markers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants