Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update compute charts #1

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

aaperis
Copy link

@aaperis aaperis commented Nov 1, 2024

This PR updates the CC Helm charts with changes that were brought about during deployment work in the GDI cluster of the Swedish node.

Main changes summarized below:

  • small fixes in order to make the charts working
  • sane defaults for dependency resources (e.g. rbac etc) were added and can be enabled from the values file
  • changes for the latest gdi-funnel image to deploy

Chart versions and image versions:

  • Chart versions up to 0.6.2 use the old image krkoo/funnel:2024-06-12
  • version 0.7.0 uses the latest: ghcr.io/mrtamm/funnel-gdi:latest
  • versions >=0.8.0 use image ghcr.io/mrtamm/funnel-gdi:2024-11-12 (which has the sda backend implemented)

We have a chart releaser installed in our fork repo at NBIS and all releases can be pulled with:

helm repo add gdi-compute https://nbisweden.github.io/Containerized-Compute-Helm-Charts

Any comments welcome!

P.S. Latest batch of changes reflect the setup that deployed in SE where the pipeline for MS8 was run.

@aaperis aaperis marked this pull request as draft November 1, 2024 08:01
@xhejtman
Copy link
Contributor

xhejtman commented Nov 1, 2024

Any reason for rbac to alow deployments?

Also, what is the reason for removing the subPath? I believe it is more secure to mount just the subpath for the job.

@aaperis aaperis marked this pull request as ready for review November 1, 2024 10:47
@aaperis
Copy link
Author

aaperis commented Nov 1, 2024

Any reason for rbac to alow deployments?

Check under "Define a Role and RoleBinding:" in https://ohsu-comp-bio.github.io/funnel/docs/compute/kubernetes/, this part is taken from there. Note that the rbac creation can be disabled in the values.yaml so that admins can use their custom ones.

Also, what is the reason for removing the subPath? I believe it is more secure to mount just the subpath for the job.

Having the subpath there messes with folder mount access permissions. If subpath is kept, containers need to run with privileges, therefore less securely.

But this can be made optional in case someone wants to deploy the charts in a less secure setting.

@xhejtman
Copy link
Contributor

xhejtman commented Nov 1, 2024

Any reason for rbac to alow deployments?

Check under "Define a Role and RoleBinding:" in https://ohsu-comp-bio.github.io/funnel/docs/compute/kubernetes/, this part is taken from there. Note that the rbac creation can be disabled in the values.yaml so that admins can use their custom ones.

I meant, that Pods and Jobs should be enough, no need to add also Deployments. Right?

Also, what is the reason for removing the subPath? I believe it is more secure to mount just the subpath for the job.

Having the subpath there messes with folder mount access permissions. If subpath is kept, containers need to run with privileges, therefore less securely.

But this can be made optional in case someone wants to deploy the charts in a less secure setting.

yes, there is subPath permissions problem in some setups. But not in this one, because the funnel master should create the subPath directory before starting worker and if the subPath exists in advance, there should be no problem with permissions.

@aaperis
Copy link
Author

aaperis commented Nov 1, 2024

yes, there is subPath permissions problem in some setups. But not in this one, because the funnel master should create the subPath directory before starting worker and if the subPath exists in advance, there should be no problem with permissions.

The problem is that the parent folder of the subpath has the wrong permissions (owned by root). Here's an example of a failing job when running version 0.6.0 of the charts with only the changes of up to commit [c33a642](https://github.com/CERIT-SC/starter-kit-containers/pull/1/commits/c33a64214945e1bcae69b5a47612af5e01e99f7c) , i.e. keeping the original security policy settings and the subpath:

{"attempt":0,"error":"failed to copy file: failed to create dest file for copying: open /opt/funnel/funnel-work-dir/md5sum_output: permission denied","index":0,"level":"error","msg":"upload failed","ns":"worker","src":"systemlogfilter.go:github.com/ohsu-comp-bio/funnel/events.(*SystemLogFilter).WriteEvent:22","taskID":"csifipau0c5s73fl8id0","time":"2024-11-01T15:53:21Z","timestamp":"2024-11-01T15:53:21.85065552Z","url":"file:///opt/funnel/funnel-work-dir/md5sum_output"}

@xhejtman
Copy link
Contributor

xhejtman commented Nov 1, 2024

It works in our cluster. Isn't to problem with the parent folder that is mounted without subpath on the funnel master? It is independent of the subpath for worker. Also note, the subpath is only for a worker, not for the master.

@aaperis
Copy link
Author

aaperis commented Nov 5, 2024

It works in our cluster. Isn't to problem with the parent folder that is mounted without subpath on the funnel master? It is independent of the subpath for worker. Also note, the subpath is only for a worker, not for the master.

I am not 100% sure I follow how all funnel pieces are connected, but if subpath is not used in the master template then the worker will still create a folder named from the task id and will execute inside it. But this time the folder will have the correct permissions (here 65534) so that the master will be able to remove it after the task is completed and the output is uploaded elsewhere.

P.S Take a look here for a subpath behavior similar to what I observed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants