clone dv, and tmp-pvc lost #3470
Comments
I upgraded CDI to version 1.60.3 and encountered the same problem. Can anyone help? Thank you.
@mhenriks Can you take a look?
Getting similar vibes as this: #3259. Please confirm that the Ceph configuration is correct (StorageClass, cephblockpools, etc.).
@mhenriks Thank you.
I would make sure that you don't have a bunch of retained PVs that are unneeded and taking up space; they could be causing issues on the backend. Using the Delete reclaimPolicy may keep that from happening.
@mhenriks Thank you. Before creating DVs, I delete all the DVs under the wyw-test-dv namespace along with their associated PVs. According to the records above, analyzing DV 1c7ld, which has been stuck in the CSICloneInProgress state, shows that the PV requested by tmp-pvc-<targetClaim.UID> (generated during DV creation) has been bound by 1c7ld, yet the status of 1c7ld is Pending and tmp-pvc-<targetClaim.UID> is Lost. Is this reasonable? Another point: the DV phases in the source code are CSIClonePhase, PrepClaimPhase, and RebindPhase, but describing DV 1c7ld shows the Events order as CloneScheduled -> Pending -> PrepClaimInProgress -> RebindInProgress -> CSICloneInProgress. For DVs that are created successfully, the Events order is CloneScheduled -> Pending -> CSICloneInProgress -> PrepClaimInProgress -> RebindInProgress -> Bound -> CloneSucceeded.
@wywself it appears that the PV was updated to refer to the target pvc ( |
@mhenriks Thank you.
In the RebindPhase, after the ClaimRef of pv-xx (bound to tmp-pvc) is updated to the target PVC, the status of tmp-pvc becomes Lost. kube-controller-manager then automatically updates the volumeName of the target PVC to pv-xx, and CDI waits for the target to be bound before deleting the "Lost" PVC. Is this description correct? I reproduced the problem again and found that after the rebind there were error logs such as
That is correct
Interesting find. I'm not sure how it would be related. But clearly CDI is updating the status very frequently. Is DV |
@mhenriks Yes. The above is the log of cdi-deployment. Which component's rateLimiter causes this? Is it correct that the default QPS is currently 5 and the default Burst is 10? Is there a way to adjust these two parameters? Thank you.
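To make the QPS 5 / Burst 10 question concrete, here is a minimal sketch of how a token-bucket limiter with those two values behaves. It is a deterministic toy model written for this thread, not client-go's or CDI's actual limiter: the names `tokenBucket`, `allow`, and `countAllowed` are hypothetical, time is passed in by hand, and a real client-side limiter typically delays throttled requests rather than rejecting them. In client-go the two values correspond to the `QPS` and `Burst` fields on `rest.Config`; whether CDI exposes a setting for them is not established in this thread.

```go
package main

import "fmt"

// tokenBucket is a simplified model of a token-bucket rate limiter.
// All names here are illustrative and not taken from CDI or client-go.
type tokenBucket struct {
	qps    float64 // tokens added per second
	burst  float64 // maximum number of stored tokens
	tokens float64 // tokens currently available
	last   float64 // time of the last update, in seconds
}

// newTokenBucket starts with a full bucket, so a burst is available at once.
func newTokenBucket(qps float64, burst int) *tokenBucket {
	return &tokenBucket{qps: qps, burst: float64(burst), tokens: float64(burst)}
}

// allow reports whether a request arriving at time now (seconds) may
// proceed immediately, consuming one token if so.
func (b *tokenBucket) allow(now float64) bool {
	b.tokens += (now - b.last) * b.qps
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// countAllowed issues n requests at the same instant and returns how many
// pass without being throttled.
func countAllowed(b *tokenBucket, n int, now float64) int {
	allowed := 0
	for i := 0; i < n; i++ {
		if b.allow(now) {
			allowed++
		}
	}
	return allowed
}

func main() {
	b := newTokenBucket(5, 10) // QPS 5, Burst 10 as quoted in the thread
	fmt.Println(countAllowed(b, 50, 0)) // requests that pass at t=0s
	fmt.Println(countAllowed(b, 50, 1)) // requests that pass at t=1s
}
```

Under this model, a controller that issues status updates in a tight loop exhausts the burst almost immediately and is then held to the steady refill rate, which would be consistent with throttling messages appearing in the logs.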
This really reminds me of a hotloop we've seen before: You could watch the resourceVersion on the PVC to see if that's the case
# k logs -n kube-system kube-controller-manager-node01 | grep pv_controller
...
I1114 09:34:29.977446 1 pv_controller_base.go:197] "Enqueued for sync" objName="local-pv-3fb65ef2"
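One rough way to check for the hotloop described above is to count how often each object shows up in "Enqueued for sync" lines of the kube-controller-manager log. The sketch below assumes the log line format matches the sample quoted in this thread; `countEnqueues` is a hypothetical helper, not part of any Kubernetes tooling, and what count is "too high" depends on the time window the log covers.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// enqueueRe matches the pv_controller message quoted in the thread and
// captures the object name. The format is assumed from that single sample.
var enqueueRe = regexp.MustCompile(`"Enqueued for sync" objName="([^"]+)"`)

// countEnqueues returns, per object name, how many "Enqueued for sync"
// lines appear in the given log text. Non-matching lines are ignored.
func countEnqueues(logs string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		if m := enqueueRe.FindStringSubmatch(sc.Text()); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// The only log line available from the thread; real input would be the
	// full output of the kube-controller-manager log command.
	sample := `I1114 09:34:29.977446 1 pv_controller_base.go:197] "Enqueued for sync" objName="local-pv-3fb65ef2"`
	for name, n := range countEnqueues(sample) {
		fmt.Printf("%s enqueued %d time(s)\n", name, n)
	}
}
```

A single PV or PVC being enqueued far more often than the others over a short window would suggest the object is being updated repeatedly, which is what a steadily increasing resourceVersion would also show.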
What happened:
A clear and concise description of what the bug is.
What you expected to happen:
All DVs are created successfully.
Environment:
CDI version (use kubectl get deployments cdi-deployment -o yaml): 1.58.1
Kubernetes version (use kubectl version): 1.27.6