Implement compaction support in robustness test #17833

Draft
wants to merge 1 commit into
base: main

serathius
Member

No description provided.

@k8s-ci-robot

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@serathius serathius force-pushed the robustness-compact branch 6 times, most recently from f87ed59 to 7ae772b Compare April 26, 2024 09:55
@serathius serathius force-pushed the robustness-compact branch 2 times, most recently from 713985e to 46e255e Compare May 8, 2024 10:03
@serathius
Member Author

Need to fix one more case: gofailpoints that are triggered by compaction should also be allowed to be triggered by traffic.

@serathius
Member Author

/retest

@serathius
Member Author

Yay, the revision returned by compaction is not linearizable.

Screenshot from 2024-05-21 21-00-38

https://github.com/etcd-io/etcd/actions/runs/9179653925?pr=17833

cc @MadhavJivrajani @fuweid @siyuanfoundation

@siyuanfoundation
Contributor

siyuanfoundation commented May 21, 2024

https://github.com/etcd-io/etcd/actions/runs/9179653925?pr=17833

I don't understand why the line is drawn at the end of the compact op in the TestRobustnessExploratory_Etcd_HighTraffic_ClusterOfSize3 report.
There is a valid path from Compact(314), rev: 318 -> Compact(297), rev: 319 -> put("key0", "308"), rev: 319 -> ...
It seems Compact(297), rev: 319 should have returned ErrCompacted but didn't?

Screenshot 2024-05-21 at 2 45 06 PM

@serathius
Member Author

Assuming that compaction returns a linearizable revision (which is not necessarily a property we want to provide), the operations Compact(314), rev: 318 and Compact(297), rev: 319 should not coexist. The execution order of operations should follow the returned revisions, so it should not be possible for Compact(314) and then Compact(297) to both succeed in that order, because it doesn't make sense to compact revision 297 after 314.
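The argument above can be sketched as a small check. This is only an illustration under the stated assumption that the response revision orders the operations; `CompactOp` and `CheckCompactions` are hypothetical names, not part of etcd or of the robustness test framework.

```go
package main

import (
	"fmt"
	"sort"
)

// CompactOp is a hypothetical record of one successful Compact call.
// It is not an etcd type; it only carries the two numbers discussed above.
type CompactOp struct {
	Target      int64 // revision passed to Compact
	ResponseRev int64 // revision returned in the response
}

// CheckCompactions orders successful compactions by their response revision
// (assumption: that revision is linearizable) and reports the first one whose
// target is lower than a target that was already compacted.
func CheckCompactions(ops []CompactOp) error {
	sorted := make([]CompactOp, len(ops))
	copy(sorted, ops)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].ResponseRev < sorted[j].ResponseRev
	})

	var compacted int64 // highest target compacted so far
	for _, op := range sorted {
		if op.Target < compacted {
			return fmt.Errorf("compact(%d), rev: %d succeeded after revision %d was already compacted",
				op.Target, op.ResponseRev, compacted)
		}
		compacted = op.Target
	}
	return nil
}

func main() {
	// The two operations from the report.
	history := []CompactOp{
		{Target: 314, ResponseRev: 318},
		{Target: 297, ResponseRev: 319},
	}
	if err := CheckCompactions(history); err != nil {
		fmt.Println("violation:", err)
	}
}
```

If the revision returned by compaction is local rather than linearizable, a check like this would flag histories that are in fact valid, which is the situation discussed in this thread.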

As mentioned above, it's not necessarily a bug, but it is at least an undocumented behavior. It stems from the fact that etcd puts the revision in the same response field regardless of its meaning. Depending on the request, the revision can be interpreted in different ways. Some examples:

  • For PUT/TXN operations, the response revision is the revision of the operation. Linearizable.
  • For a Read request with revision=0, the response revision is the revision of the read. Linearizable.
  • For a Read request with revision!=0, the response revision is the latest revision in the cluster. Linearizable.
  • For a Watch request, a response with events carries the revision of the local member, and a bookmark carries the revision of the watch stream. Non-linearizable.
  • For other requests I expect two cases:
    • The request goes through raft, so the revision is the cluster revision. Linearizable. I expected Compaction to be here, but it might not be.
    • The request doesn't require raft, so the revision is local. Non-linearizable.
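The list above could be written down as a lookup, for example to decide which response revisions a checker is allowed to treat as an ordering constraint. This is a rough sketch of the classification as described in this comment; `RequestKind`, `RevisionMeaning`, and `MeaningOf` are hypothetical names, and the entries restate expectations, not documented etcd guarantees.

```go
package main

import "fmt"

// RequestKind is a hypothetical enumeration of the request types listed above.
type RequestKind int

const (
	PutOrTxn       RequestKind = iota
	ReadLatest                 // read with revision=0
	ReadAtRevision             // read with revision!=0
	WatchEvents                // watch response that carries events
	WatchBookmark              // watch bookmark
	OtherViaRaft               // other request that goes through raft
	OtherLocal                 // other request that does not require raft
)

// RevisionMeaning describes how the response revision is interpreted.
type RevisionMeaning struct {
	Source       string
	Linearizable bool
}

// MeaningOf restates the list from the comment above.
// Unknown kinds are conservatively treated as non-linearizable.
func MeaningOf(kind RequestKind) RevisionMeaning {
	switch kind {
	case PutOrTxn:
		return RevisionMeaning{"revision of the operation", true}
	case ReadLatest:
		return RevisionMeaning{"revision of the read", true}
	case ReadAtRevision:
		return RevisionMeaning{"latest revision in the cluster", true}
	case WatchEvents:
		return RevisionMeaning{"revision of the local member", false}
	case WatchBookmark:
		return RevisionMeaning{"revision of the watch stream", false}
	case OtherViaRaft:
		return RevisionMeaning{"cluster revision", true}
	case OtherLocal:
		return RevisionMeaning{"local revision", false}
	default:
		return RevisionMeaning{"unknown", false}
	}
}

func main() {
	for _, kind := range []RequestKind{PutOrTxn, WatchEvents, OtherViaRaft, OtherLocal} {
		m := MeaningOf(kind)
		fmt.Printf("kind=%d source=%q linearizable=%t\n", kind, m.Source, m.Linearizable)
	}
}
```

Which of the last two cases Compaction falls into is exactly the open question here, so it is deliberately not given its own entry.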

@serathius
Member Author

Here is an even clearer example of this:
image
