[Bug]: Over-provisioning stops working when one of the PVCs is resized #1036

Open
handrea2009 opened this issue Dec 15, 2023 · 6 comments · May be fixed by #1041
Labels
on-user pending on user

Comments

@handrea2009

Describe the bug
I am using Kadalu 0.9.1 in external native mode, with Gluster 10.5 and K3s.
I have created a Kadalu storage that uses a 29GB external gluster volume:

kubectl exec -it deploy/operator -n kadalu -- bash -c 'kubectl-kadalu storage-list --status'

Name             Type        Utilization            Pvs Count      Min PV Size      Avg PV Size      Max PV Size
kadalu-read-cache  External    0/29 Gi (0%)                   0                0                0                0
kubectl get kadalustorage kadalu-read-cache -o jsonpath='{.spec.details.gluster_volname}'
read-cache
gluster volume list
read-cache

Even though the gluster volume is only 29GB, I can create three 20GB PVCs, so over-provisioning works so far:

kubectl get pvc
NAME                                           STATUS   VOLUME                                     CAPACITY    ACCESS MODES   STORAGECLASS               AGE
test1                                          Bound    pvc-febb8a9c-785b-4911-9c0d-a3d1d7b3bca9   20Gi        RWX            kadalu.kadalu-read-cache   62s
test2                                          Bound    pvc-542924b4-a5f2-4a1e-8da7-6da887f3b564   20Gi        RWX            kadalu.kadalu-read-cache   43s
test3                                          Bound    pvc-36d84616-8a1a-4b03-85b4-203f18919daa   20Gi        RWX            kadalu.kadalu-read-cache   32s
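
For reference, a small, hypothetical helper (not part of Kadalu) that tallies the capacity requested by PVCs of a given storage class, to quantify the over-commit against the 29GB backing volume; it assumes the official kubernetes Python client and PVC sizes expressed with a Gi suffix:

# Hypothetical helper: sum requested PVC capacity for one storage class.
# Assumes the kubernetes Python client and "Gi" sizes; not part of Kadalu.
from kubernetes import client, config

def total_requested_gi(storage_class: str) -> int:
    config.load_kube_config()
    v1 = client.CoreV1Api()
    total = 0
    for pvc in v1.list_persistent_volume_claim_for_all_namespaces().items:
        if pvc.spec.storage_class_name != storage_class:
            continue
        size = (pvc.spec.resources.requests or {}).get("storage", "0Gi")
        if size.endswith("Gi"):
            total += int(size[:-2])
    return total

if __name__ == "__main__":
    # 3 x 20Gi PVCs against a 29GB gluster volume => over-committed
    print(total_requested_gi("kadalu.kadalu-read-cache"), "Gi requested vs 29 Gi backing volume")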

However, it's pretty odd that kubectl-kadalu storage-list --status shows no space used and no PVCs:

kubectl exec -it deploy/operator -n kadalu -- bash -c 'kubectl-kadalu storage-list --status'

Name             Type        Utilization            Pvs Count      Min PV Size      Avg PV Size      Max PV Size
kadalu-read-cache  External    0/29 Gi (0%)                   0                0                0                0

I resized one of the PVCs and the resize worked (from 20GB to 23GB):

kubectl get pvc
NAME                                           STATUS   VOLUME                                     CAPACITY    ACCESS MODES   STORAGECLASS               AGE
test1                                          Bound    pvc-febb8a9c-785b-4911-9c0d-a3d1d7b3bca9   23Gi        RWX            kadalu.kadalu-read-cache   90s
test2                                          Bound    pvc-542924b4-a5f2-4a1e-8da7-6da887f3b564   20Gi        RWX            kadalu.kadalu-read-cache   71s
test3                                          Bound    pvc-36d84616-8a1a-4b03-85b4-203f18919daa   20Gi        RWX            kadalu.kadalu-read-cache   60s

Now kubectl-kadalu storage-list --status takes into account only the PVC that has been resized:

kubectl exec -it deploy/operator -n kadalu -- bash -c 'kubectl-kadalu storage-list --status'

Name             Type        Utilization            Pvs Count      Min PV Size      Avg PV Size      Max PV Size
kadalu-read-cache  External    23 Gi/29 Gi (78%)              1            23 Gi            23 Gi            23 Gi

If I try to create another 20GB PVC, it stays pending forever:

kubectl get pvc
test1                                          Bound     pvc-febb8a9c-785b-4911-9c0d-a3d1d7b3bca9   23Gi        RWX            kadalu.kadalu-read-cache   36m
test2                                          Bound     pvc-542924b4-a5f2-4a1e-8da7-6da887f3b564   20Gi        RWX            kadalu.kadalu-read-cache   35m
test3                                          Bound     pvc-36d84616-8a1a-4b03-85b4-203f18919daa   20Gi        RWX            kadalu.kadalu-read-cache   35m
test4                                          Pending                                                                         kadalu.kadalu-read-cache   34m
kubectl describe pvc test4
Name:          test4
Namespace:     default
StorageClass:  kadalu.kadalu-read-cache
Status:        Pending
Volume:
Labels:        <none>
Annotations:   volume.beta.kubernetes.io/storage-provisioner: kadalu
               volume.kubernetes.io/storage-provisioner: kadalu
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       <none>
Events:
  Type     Reason                Age                 From                                                                  Message
  ----     ------                ----                ----                                                                  -------
  Normal   Provisioning          94s (x15 over 35m)  kadalu_kadalu-csi-provisioner-0_6e9906fe-7887-4836-bce7-173516e98dad  External provisioner is provisioning volume for claim "default/test4"
  Warning  ProvisioningFailed    94s (x15 over 35m)  kadalu_kadalu-csi-provisioner-0_6e9906fe-7887-4836-bce7-173516e98dad  failed to provision volume with StorageClass "kadalu.kadalu-read-cache": rpc error: code = ResourceExhausted desc = External resource is exhausted
  Normal   ExternalProvisioning  0s (x142 over 35m)  persistentvolume-controller                                           waiting for a volume to be created, either by external provisioner "kadalu" or manually created by system administrator
@handrea2009
Author

Debug logs:

[2023-12-15 17:30:38,414] DEBUG [controllerserver - 100:CreateVolume] - Create Volume request    request=name: "pvc-f3a76282-c6e0-42ab-8009-07985e51ed82"
capacity_range {
  required_bytes: 21474836480
}
volume_capabilities {
  mount {
  }
  access_mode {
    mode: MULTI_NODE_MULTI_WRITER
  }
}
parameters {
  key: "gluster_hosts"
  value: "cluster-node1"
}
parameters {
  key: "gluster_volname"
  value: "read-cache"
}
parameters {
  key: "hostvol_type"
  value: "External"
}
parameters {
  key: "single_pv_per_pool"
  value: "False"
}

[2023-12-15 17:30:38,420] DEBUG [volumeutils - 1175:mount_glusterfs] - Already mounted   mount=/mnt/kadalu-read-cache
[2023-12-15 17:30:38,435] DEBUG [volumeutils - 1175:mount_glusterfs] - Already mounted   mount=/mnt/kadalu-write-cache
[2023-12-15 17:30:38,441] DEBUG [controllerserver - 161:CreateVolume] - Found PV type    pvtype=subvol capabilities=[mount {
}
access_mode {
  mode: MULTI_NODE_MULTI_WRITER
}
]
[2023-12-15 17:30:38,441] DEBUG [controllerserver - 174:CreateVolume] - Filters applied to choose storage        hostvol_type=External gluster_hosts=cluster-node1 single_pv_per_pool=False gluster_volname=read-cache
[2023-12-15 17:30:38,442] DEBUG [controllerserver - 185:CreateVolume] - Got list of hosting Volumes      volumes=kadalu-read-cache,kadalu-write-cache
[2023-12-15 17:30:38,447] DEBUG [volumeutils - 1175:mount_glusterfs] - Already mounted   mount=/mnt/kadalu-read-cache
[2023-12-15 17:30:38,448] DEBUG [volumeutils - 1406:check_external_volume] - Mount successful    hvol={'name': 'kadalu-read-cache', 'type': 'External', 'g_volname': 'read-cache', 'g_host': 'cluster-node1', 'g_options': '', 'single_pv_per_pool': False}
[2023-12-15 17:30:38,530] DEBUG [volumeutils - 443:is_hosting_volume_free] - pv stats    hostvol=kadalu-read-cache total_size_bytes=31509606400 used_size_bytes=24696061952 free_size_bytes=6813544448 number_of_pvs=1 required_size=21474836480 reserved_size=681354444.8
[2023-12-15 17:30:38,530] ERROR [controllerserver - 262:CreateVolume] - Hosting volume is full. Add more storage         volume=kadalu-read-cache
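
For clarity, a rough reconstruction of the rejection above from the logged values, assuming the reserved size is 10% of the free size (which matches reserved_size in the log); this is an illustration of the arithmetic, not the actual is_hosting_volume_free() code:

# Values taken from the is_hosting_volume_free debug line above (bytes).
total_size = 31509606400
used_size = 24696061952             # only the resized 23 Gi PV is accounted for
free_size = total_size - used_size  # 6813544448
required = 21474836480              # 20 Gi requested by test4
reserved = free_size * 0.10         # 681354444.8, matches the logged reserved_size

# Assumed shape of the check that produces "Hosting volume is full":
if required > free_size - reserved:
    print("ResourceExhausted: External resource is exhausted")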

@handrea2009
Author

The same issue is present in Kadalu 1.2.0.
The issue is not present in Kadalu 0.8.14, though in that release the kubectl-kadalu storage-list --status command doesn't work:

# kubectl exec -it deploy/operator -n kadalu -- bash -c 'kubectl-kadalu storage-list --status'
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/bin/kubectl-kadalu/__main__.py", line 117, in <module>
  File "/usr/bin/kubectl-kadalu/__main__.py", line 108, in main
  File "/usr/bin/kubectl-kadalu/storage_list.py", line 237, in run
  File "/usr/bin/kubectl-kadalu/storage_list.py", line 197, in fetch_status
IndexError: list index out of range

@handrea2009
Author

handrea2009 commented Dec 20, 2023

If the logic in "expansion" should be the same as in "create", then update_free_size() shouldn't be called for PV_TYPE_SUBVOL even during "expansion". Currently, for PV_TYPE_SUBVOL it is not called during "create" but it is called during "expansion".
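
For illustration, a minimal sketch of that asymmetry; the function names mirror the thread, but the bodies are stubs, not the actual Kadalu controller code:

# Illustrative stubs only; names mirror the thread, not the real Kadalu code.
PV_TYPE_SUBVOL = "subvol"

def update_free_size(hostvol: str, size_bytes: int) -> None:
    """Stub: record size_bytes as used on the hosting volume."""
    print(f"recording {size_bytes} bytes used on {hostvol}")

def create_volume(pvtype: str, hostvol: str, size_bytes: int) -> None:
    if pvtype == PV_TYPE_SUBVOL:
        # Not called on create: the new PV never reduces the recorded
        # free size, which is why storage-list shows 0 PVs above.
        pass

def expand_volume(pvtype: str, hostvol: str, size_bytes: int) -> None:
    if pvtype == PV_TYPE_SUBVOL:
        # Called on expansion: only the resized PV is accounted for,
        # which is why the 23 Gi PV suddenly appears in storage-list.
        update_free_size(hostvol, size_bytes)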

@amarts
Member

amarts commented Jan 16, 2024

Is it possible to send a PR if the fix in update_free_size() works?

@handrea2009
Author

handrea2009 commented Jan 18, 2024

Before doing a PR, I guess we have to establish whether Kadalu supports over-provisioning for external native mode or not.
That's not clear to me because the code doesn't call update_free_size() during PVC creation (so you can create as many PVCs as you want, even beyond the space available in the external gluster volume). However, when a PVC is expanded, update_free_size() is called to update the space available in the external gluster volume.
If we support over-provisioning, we should never verify the space available in the gluster volume before creating or expanding a PVC.
If we don't support over-provisioning, then we should call update_free_size() during both creation and expansion.
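
For example, a minimal sketch of the second option (no over-provisioning), with the capacity check and accounting applied symmetrically in create and expand; the helper names here are placeholders, not the actual Kadalu API:

# Placeholder helpers: illustrative only, not the actual Kadalu functions.
class ResourceExhaustedError(Exception):
    pass

def is_hosting_volume_free(hostvol: str, required_bytes: int) -> bool:
    """Stub: return True if the hosting volume can fit required_bytes."""
    return True

def update_free_size(hostvol: str, size_bytes: int) -> None:
    """Stub: record size_bytes as consumed on the hosting volume."""

def create_volume(hostvol: str, size_bytes: int) -> None:
    # Check and account on creation, so new PVs count against free space.
    if not is_hosting_volume_free(hostvol, size_bytes):
        raise ResourceExhaustedError("External resource is exhausted")
    update_free_size(hostvol, size_bytes)

def expand_volume(hostvol: str, old_bytes: int, new_bytes: int) -> None:
    # Check and account only the expansion delta, mirroring create_volume.
    delta = new_bytes - old_bytes
    if not is_hosting_volume_free(hostvol, delta):
        raise ResourceExhaustedError("External resource is exhausted")
    update_free_size(hostvol, delta)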

@leelavg
Collaborator

leelavg commented Apr 12, 2024

If we don't support over-provisioning, then we should call update_free_size() during both creation and expansion.

  • As commented in the PR, I believe this should be the fix, i.e., don't support over-provisioning.

@leelavg leelavg added the on-user pending on user label Apr 12, 2024