Many non-root deployments fail with permission denied when trying to write to QNAP volumes #16

Open
pkerwien opened this issue Sep 2, 2024 · 5 comments

Contributor

pkerwien commented Sep 2, 2024

Third-party applications such as mariadb-operator, cloudnative-pg, and the bitnami/mariadb Helm chart all fail to deploy when their PVC is backed by the QNAP NAS.

This happens because the volume's ownership and permissions are not changed at mount time when the manifest sets fsGroup, which is the setting that should let the non-root container user write to the filesystem.

Using this demo deployment with both Longhorn and QNAP-CSI:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mariadb
spec:
  selector:
    matchLabels:
      app: mariadb
  template:
    metadata:
      labels:
        app: mariadb
    spec:
      securityContext:
        fsGroup: 999
        fsGroupChangePolicy: Always
      containers:
        - name: mariadb
          image: mariadb:11.4.3
          ports:
            - containerPort: 3306
          env:
            - name: MARIADB_ROOT_PASSWORD
              value: t0ps3cr3t
          command: ["sleep", "9999999"]
          volumeMounts:
            - mountPath: /var/lib/mysql
              name: db-data
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            seccompProfile:
              type: RuntimeDefault
            runAsUser: 999
            runAsGroup: 999
      volumes:
        - name: db-data
          persistentVolumeClaim:
            claimName: mariadb

Results in the following when using Longhorn CSI:

$ kubectl exec -it mariadb-6d88bd45c4-kwrmz -- ls -ldn /var/lib/mysql
drwxrwsr-x 3 0 999 4096 Sep  1 10:36 /var/lib/mysql

With QNAP-CSI:

PVC:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mariadb
  annotations:
    trident.qnap.io/ThinAllocate: "true"
    trident.qnap.io/threshold: "80"
    # QuTS-hero features
    trident.qnap.io/Deduplication: "false"
    trident.qnap.io/Compression: "true"
    trident.qnap.io/FastClone: "true"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  storageClassName: standard

$ kubectl exec -it mariadb-5576bb9b88-29tbh -- ls -ldn /var/lib/mysql
drwxr-xr-x 3 0 0 4096 Sep  2 09:20 /var/lib/mysql

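Since the container runs as UID 999 and GID 999, it can write to the volume when using Longhorn, where the directory is group-owned by 999 and group-writable, but not when using QNAP-CSI, where the directory is owned by 0:0 and writable only by its owner.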
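
The difference between the two listings can be checked with a simplified model of the classic Unix owner/group/other permission test. This is only a sketch: `can_write` is a hypothetical helper, it ignores ACLs, capabilities, and the root special case, and the owners and modes are transcribed by hand from the two `ls -ldn` outputs above.

```python
import stat


def can_write(uid, gids, owner_uid, owner_gid, mode):
    """Simplified Unix write check: owner bits, else group bits, else other bits."""
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Container process from the demo deployment: runAsUser 999, runAsGroup 999.
UID, GIDS = 999, {999}

# Longhorn: drwxrwsr-x, owner 0, group 999 (setgid bit set).
longhorn_ok = can_write(UID, GIDS, owner_uid=0, owner_gid=999, mode=0o2775)

# QNAP-CSI: drwxr-xr-x, owner 0, group 0.
qnap_ok = can_write(UID, GIDS, owner_uid=0, owner_gid=0, mode=0o755)

print(f"Longhorn writable: {longhorn_ok}")
print(f"QNAP-CSI writable: {qnap_ok}")
```

Under these assumptions the process falls into the "group" class on the Longhorn volume and into the "other" class on the QNAP volume, which matches the observed behavior.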

Another example is deploying MariaDB databases with the mariadb-operator. The database pods fail with:

2024-09-02 09:32:17+00:00 [Note] [Entrypoint]: Entrypoint script for MariaDB Server 1:11.4.3+maria~ubu2404 started.
2024-09-02 09:32:17+00:00 [Note] [Entrypoint]: Initializing database files
2024-09-02  9:32:17 0 [Warning] Can't create test file '/var/lib/mysql/mariadb.lower-test' (Errcode: 13 "Permission denied")
2024-09-02  9:32:17 0 [ERROR] mariadbd: Can't create/write to file './ddl_recovery.log' (Errcode: 13 "Permission denied")
2024-09-02  9:32:17 0 [ERROR] DDL_LOG: Failed to create ddl log file: ./ddl_recovery.log
2024-09-02  9:32:17 0 [ERROR] Aborting

Installation of system tables failed!  Examine the logs in
/var/lib/mysql/ for more information.

The problem could be conflicting information in an external
my.cnf files. You can ignore these by doing:

    shell> /usr/bin/mariadb-install-db --defaults-file=~/.my.cnf

You can also try to start the mariadbd daemon with:

    shell> /usr/sbin/mariadbd --skip-grant-tables --general-log &

and use the command line tool /usr/bin/mariadb
to connect to the mysql database and look at the grant tables:

    shell> /usr/bin/mariadb -u root mysql
    MariaDB> show tables;

Try '/usr/sbin/mariadbd --help' if you have problems with paths.  Using
--general-log gives you a log in /var/lib/mysql/ that may be helpful.

The latest information about mariadb-install-db is available at
https://mariadb.com/kb/en/installing-system-tables-mysql_install_db
You can find the latest source at https://downloads.mariadb.org and
the maria-discuss email list at https://launchpad.net/~maria-discuss

Please check all of the above before submitting a bug report
at https://mariadb.org/jira

Please add the necessary fsGroup support to the CSI driver so that these non-root applications can be deployed on QNAP volumes.

My setup is:

  • QuTS hero h5.2.0.2860
  • RKE2 v1.30.4-rke2r1 on Ubuntu Server 24.04.1
  • QNAP-CSI-Plugin 1.3.0
  • Longhorn 1.7.0

My storage class for QNAP PVCs is:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: csi.trident.qnap.io
parameters:
  selector: "performance=standard"
allowVolumeExpansion: true

pkerwien changed the title from "Many non-root deployments fail with permission denied when trying to access QNAP volumes" to "Many non-root deployments fail with permission denied when trying to write to QNAP volumes" on Sep 2, 2024
@davidcheng0716
Collaborator

davidcheng0716 commented Sep 3, 2024

@pkerwien
Thank you for the report. The following steps may help:

(1) kubectl edit csidriver csi.trident.qnap.io
(2) Change fsGroupPolicy from "ReadWriteOnceWithFSType" to "File"
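
For anyone who wants to script the two steps above instead of editing interactively, here is a minimal sketch that builds an equivalent `kubectl patch` invocation. The helper name `build_patch_command` is hypothetical, the sketch assumes the field lives at `spec.fsGroupPolicy` of the CSIDriver object and that the cluster allows changing it in place, and it only prints the command rather than running it.

```python
import json
import shlex

DRIVER_NAME = "csi.trident.qnap.io"  # CSIDriver name from the steps above


def build_patch_command(policy="File"):
    """Return the kubectl argv that sets spec.fsGroupPolicy on the CSIDriver."""
    patch = json.dumps({"spec": {"fsGroupPolicy": policy}}, separators=(",", ":"))
    return ["kubectl", "patch", "csidriver", DRIVER_NAME,
            "--type", "merge", "-p", patch]


if __name__ == "__main__":
    # Print the command for review before running it by hand.
    print(shlex.join(build_patch_command()))
```

To apply the change from Python rather than copying the printed command, the same list can be passed to `subprocess.run(..., check=True)` on a machine where `kubectl` is configured for the cluster.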

Screenshot from 2024-09-03 17-15-54

Screenshot from 2024-09-03 17-16-49

@pkerwien
Contributor Author

pkerwien commented Sep 3, 2024

Thanks! This looks promising. The mariadb-operator has now managed to deploy a DB cluster. I will do more testing later.

@pkerwien
Contributor Author

pkerwien commented Sep 3, 2024

@davidcheng0716 All previously failing deployments work now! Can this change be made during installation of QNAP CSI (to avoid patching afterwards), or will it become the default in a future release?

@pkerwien
Contributor Author

pkerwien commented Sep 3, 2024

@davidcheng0716 I just discovered that the Longhorn CSI driver uses fsGroupPolicy: ReadWriteOnceWithFSType (the same as QNAP CSI before patching), and that the Longhorn storage class sets fsType: ext4. Perhaps that is why fsGroup works as expected when using Longhorn. From https://kubernetes-csi.github.io/docs/support-fsgroup.html:

"ReadWriteOnceWithFSType: Indicates that volumes will be examined to determine if volume ownership and permissions should be modified to match the pod's security policy. Changes will only occur if the fsType is defined and the persistent volume's accessModes contains ReadWriteOnce."

In my QNAP storage class, there is no such fsType parameter. I am not sure whether I can add one, or whether that would make it work without having to patch the CSIDriver.
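
To make the quoted rule concrete, here is a simplified model of how the three documented `fsGroupPolicy` values decide whether ownership is changed. It is a sketch of the documented behavior, not the kubelet's actual code; `ownership_change_expected` is a hypothetical name, and the inputs for the QNAP and Longhorn cases are assumptions based on this thread (no fsType reaching the QNAP volume, and a ReadWriteOnce claim on Longhorn).

```python
def ownership_change_expected(policy, fs_type, access_modes):
    """Model of the documented fsGroupPolicy rules (simplified)."""
    if policy == "None":
        # Volumes are mounted without any ownership or permission change.
        return False
    if policy == "File":
        # fsGroup is applied regardless of fsType or access mode.
        return True
    if policy == "ReadWriteOnceWithFSType":
        # Applied only if fsType is defined and the PV is ReadWriteOnce.
        return bool(fs_type) and "ReadWriteOnce" in access_modes
    raise ValueError(f"unknown fsGroupPolicy: {policy!r}")


RWO = ["ReadWriteOnce"]

# Assumed inputs, taken from the observations in this thread.
cases = {
    "QNAP CSI before patching": ("ReadWriteOnceWithFSType", None, RWO),
    "Longhorn": ("ReadWriteOnceWithFSType", "ext4", RWO),
    "QNAP CSI after patching": ("File", None, RWO),
}

for name, args in cases.items():
    print(f"{name}: {ownership_change_expected(*args)}")
```

If these assumptions hold, the model reproduces what was observed: only the unpatched QNAP case skips the ownership change, which is consistent with the missing fsType being the deciding factor.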

pkerwien added a commit to pkerwien/QNAP-CSI-PlugIn that referenced this issue Sep 4, 2024
This will fix issue qnap-dev#16.

Signed-off-by: Peter Kerwien <[email protected]>
@LeonaChen2727
Contributor

We appreciate your suggestion and will consider setting it as the default in a future version.
