The Trident operator fails to install via Helm on Rancher #839

Open
lindhe opened this issue Jul 5, 2023 · 3 comments · May be fixed by #840

lindhe commented Jul 5, 2023

Describe the bug

When installing the Trident operator from the Helm chart in a Kubernetes cluster managed by Rancher, the operator fails because it is unable to add the PSA label pod-security.kubernetes.io/enforce: privileged to its installation namespace. This is because Rancher has a special admission webhook in place that guards PSA labels: the ServiceAccount must be granted an extra permission for it (updatepsa on the projects resource), on top of all the other RBAC rules it needs.

Environment

  • Trident version: 23.04.0
  • Trident installation flags used: helm install trident netapp-trident/trident-operator --version 23.04.0 --create-namespace --namespace trident
  • Container runtime: Containerd v1.6.19-k3s1
  • Kubernetes version: v1.25.9
  • Kubernetes orchestrator: Rancher v2.7.5
  • Kubernetes enabled feature gates: None.
  • OS: Ubuntu 22.04.2 LTS
  • NetApp backend types: n/a
  • Other: n/a

To Reproduce

  1. Have a Rancher-managed RKE2 cluster (though I'm guessing this reproduces on any Rancher-managed cluster).

  2. helm repo add netapp-trident https://netapp.github.io/trident-helm-chart

  3. helm install trident netapp-trident/trident-operator --version 23.04.0 --create-namespace --namespace trident

  4. Check the status of the installed CRDs, the trident TridentOrchestrator object, and the pods deployed:

    $ kubectl get crd | grep trident
    tridentorchestrators.trident.netapp.io                            2023-06-28T14:56:46Z
    
    $ kubectl -n trident get pods
    NAME                                 READY    STATUS    RESTARTS    AGE
    trident-operator-5789cf4777-nc4vn    1/1      Running   0           7m32s
    
    $ kubectl -n trident get tridentorchestrators trident -o yaml
     […]
     status:
       message: 'Failed to install Trident; err: failed to patch Trident installation namespace
         trident; admission webhook "rancher.cattle.io.namespaces" denied the request:
         Unauthorized'
       namespace: trident
       status: Failed
       version: ""

Expected behavior

I expect it to deploy as it should and not crash. Here's an example of what it looks like when deploying successfully:

$ kubectl -n trident get pods
NAME                                  READY   STATUS    RESTARTS   AGE
trident-controller-6d7c9c5d8c-wg8zj   6/6     Running   0          4h28m
trident-node-linux-4tk6q              2/2     Running   0          4h28m
trident-node-linux-97rgx              2/2     Running   0          4h28m
trident-node-linux-9jfbh              2/2     Running   0          4h28m
trident-node-linux-btjx6              2/2     Running   0          4h28m
trident-node-linux-n5k75              2/2     Running   0          4h28m
trident-node-linux-vpcgd              2/2     Running   0          4h28m
trident-operator-5789cf4777-66mth     1/1     Running   0          4h29m

$ kubectl get crd | grep trident
tridentbackendconfigs.trident.netapp.io                           2023-07-05T08:09:56Z
tridentbackends.trident.netapp.io                                 2023-07-05T08:09:55Z
tridentmirrorrelationships.trident.netapp.io                      2023-07-05T08:10:00Z
tridentnodes.trident.netapp.io                                    2023-07-05T08:09:58Z
tridentorchestrators.trident.netapp.io                            2023-06-28T14:56:46Z
tridentsnapshotinfos.trident.netapp.io                            2023-07-05T08:09:56Z
tridentsnapshots.trident.netapp.io                                2023-07-05T08:09:59Z
tridentstorageclasses.trident.netapp.io                           2023-07-05T08:09:56Z
tridenttransactions.trident.netapp.io                             2023-07-05T08:09:59Z
tridentversions.trident.netapp.io                                 2023-07-05T08:09:55Z
tridentvolumepublications.trident.netapp.io                       2023-07-05T08:09:57Z
tridentvolumereferences.trident.netapp.io                         2023-07-05T08:10:00Z
tridentvolumes.trident.netapp.io                                  2023-07-05T08:09:57Z

Additional context

This was already reported to Rancher's GitHub page as issue #41191. People (understandably) thought that this was a bug in Rancher, while it's more of a documentation issue on their part (in my opinion).

There's also some information available in the operator's pod logs. I don't have them easily available right now, but it basically amounts to the same message as the one displayed by the TridentOrchestrator object anyway; it fails to patch the trident namespace because the Rancher admission webhook rancher.cattle.io.namespaces denied the request (Unauthorized).
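For anyone who wants to collect the same information from their own cluster, something like the following should do it. This is only a sketch: the Deployment name trident-operator is an assumption inferred from the pod name above, and collect_diagnostics is just a hypothetical helper name.

```shell
NAMESPACE="trident"
DEPLOYMENT="trident-operator"   # assumption: inferred from the pod name trident-operator-<hash>

# Hypothetical helper: gather the operator logs and the namespace labels.
collect_diagnostics() {
  # Operator logs; these should contain the same "denied the request" message
  kubectl -n "$NAMESPACE" logs "deployment/$DEPLOYMENT"
  # Namespace labels; the pod-security.kubernetes.io/enforce label is
  # expected to be missing while the install is failing
  kubectl get namespace "$NAMESPACE" --show-labels
}
```

Running collect_diagnostics prints the operator logs first and the namespace labels after them.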

Work-around

Inspired by this comment from the issue reported to Rancher's GitHub page, applying the following manifest and then restarting the operator fixes the issue:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: trident-operator-psa
rules:
- apiGroups:
  - management.cattle.io
  resources:
  - projects
  verbs:
  - updatepsa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: trident-operator-psa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: trident-operator-psa
subjects:
- kind: ServiceAccount
  name: trident-operator
  namespace: trident
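
The "apply and restart" steps could look roughly like this. It is a sketch under a few assumptions: the manifest above is saved as trident-operator-psa.yaml (a filename I made up), the operator Deployment is named trident-operator (inferred from the pod name), and apply_workaround is a hypothetical helper name. The kubectl auth can-i check is optional and only asks the Kubernetes RBAC layer whether the ServiceAccount now holds the updatepsa verb.

```shell
NAMESPACE="trident"
SERVICE_ACCOUNT="trident-operator"
MANIFEST="trident-operator-psa.yaml"   # hypothetical filename for the manifest above
DEPLOYMENT="trident-operator"          # assumption: inferred from the pod name

# Hypothetical helper: apply the RBAC manifest, check the grant, restart the operator.
apply_workaround() {
  # Create the ClusterRole and ClusterRoleBinding
  kubectl apply -f "$MANIFEST" &&
  # Check the new permission; exits non-zero (and skips the restart) if it is missing
  kubectl auth can-i updatepsa projects.management.cattle.io \
    --as="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}" &&
  # Restart the operator so it retries the installation
  kubectl -n "$NAMESPACE" rollout restart deployment "$DEPLOYMENT"
}
```

After running apply_workaround, the TridentOrchestrator status from the reproduction steps can be checked again to see whether the install went through.
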
@lindhe lindhe added the bug label Jul 5, 2023
lindhe added a commit to lindhe/trident that referenced this issue Jul 6, 2023
In Rancher, it is not enough to have `patch` permissions for a namespace
in order to set PSA labels.
It is also required to have the `updatepsa` permission on the `projects`
resource, as outlined
[here](rancher/rancher#41191).

This rule allows the Trident operator to set the PSA label
`pod-security.kubernetes.io/enforce: privileged` on its installation
namespace in Rancher.

Closes NetApp#839
@lindhe lindhe linked a pull request Jul 6, 2023 that will close this issue

nheinemans commented Jul 12, 2023

We're running into the same issue after upgrading from Rancher 2.6.11 to 2.7.5. I can confirm that your workaround fixes the issue.

@gnarl gnarl added the tracked label Jul 12, 2023

Philbow commented Aug 7, 2023

@lindhe: Thanks for bringing this up and creating the corresponding pull request. I can also confirm that this solves the issue in my cluster.

Does NetApp have a plan to merge this at some point? Applying these workarounds in automation is a bit cumbersome and unclean.

@nheinemans-asml

We're still seeing the same issue in Rancher 2.7.9 and Trident 23.10.0. Can we perhaps get an update from NetApp on this issue and the pending PR?
