From 26b21ab98783ffb32fc01ec3f0627b07aa783662 Mon Sep 17 00:00:00 2001
From: Erik Sundell
Date: Wed, 6 Mar 2024 11:25:00 +0100
Subject: [PATCH] docs: update cluster design mentions of config connector

---
 docs/topic/infrastructure/cluster-design.md | 66 +++++++++++++--------
 terraform/gcp/cluster.tf                    |  1 +
 2 files changed, 41 insertions(+), 26 deletions(-)

diff --git a/docs/topic/infrastructure/cluster-design.md b/docs/topic/infrastructure/cluster-design.md
index 089cc63c5f..ce41013e38 100644
--- a/docs/topic/infrastructure/cluster-design.md
+++ b/docs/topic/infrastructure/cluster-design.md
@@ -134,29 +134,43 @@ to isolate them from each other.
 
 ## Cloud access credentials for hub users
 
-For hub users to access cloud resources (like storage buckets), they will need
-to be authorized via a [GCP ServiceAccount](https://cloud.google.com/iam/docs/service-accounts).
-This is different from a [Kubernetes ServiceAccount](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/),
-which is used to authenticate and authorize access to kubernetes resources (like spawning pods).
-
-For dask hubs, we want to provide users with write access to at least one storage
-bucket they can use for temporary data storage. User pods need to be given access to
-a GCP ServiceAccount that has write permissions to this bucket. There are two ways
-to do this:
-
-1. Provide appropriate permissions to the GCP ServiceAccount used by the node the user
-   pods are running on. When used with [Metadata Concealment](https://cloud.google.com/kubernetes-engine/docs/how-to/protecting-cluster-metadata#overview),
-   user pods can read / write from storage buckets. However, this grants the same permissions
-   to *all* pods on the cluster, and hence is unsuitable for clusters with multiple
-   hubs running for different organizations.
-
-2. Use the [GKE Cloud Config Connector](https://cloud.google.com/config-connector/docs/overview) to
-   create a GCP ServiceAccount + Storage Bucket for each hub via helm. This requires using
-   [Workload Identity](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity) and
-   is incompatible with (1). This is required for multi-tenant clusters, since users on a hub
-   have much tighter scoped permissions.
-
-Long-term, (2) is the appropriate way to do this for everyone. However, it affects the size
-of the core node pool, since it runs some components in the cluster. For now, we use (1) for
-single-tenant clusters, and (2) for multi-tenant clusters. If nobody wants a scratch GCS bucket,
-neither option is required.
+For hub users to access cloud resources like storage buckets from their user
+servers, they will need to have credentials from a cloud specific service
+account - like a [GCP ServiceAccount].
+
+Currently, for practical reasons, we only provision one cloud specific service
+account per hub, which makes all users' interactions with cloud resources
+appear to come from a single user. Note that providing, for example, two cloud
+service accounts - one for hub admin users and one for non-admin users - would
+be a far easier improvement than providing one for each hub user.
+
+```{note} Technical notes
+When we create a hub with access to a bucket, we create a cloud provider
+specific service account for the hub via `terraform`. We then also create a
+[Kubernetes ServiceAccount] via the basehub chart's templates that references
+the cloud specific service account via an annotation. When this Kubernetes
+ServiceAccount is mounted to the hub's user server pods, a cloud specific
+controller ensures the Pod gets credentials that can be exchanged for temporary
+credentials to the cloud specific service account.
+
+[gcp serviceaccount]: https://cloud.google.com/iam/docs/service-accounts
+[kubernetes serviceaccount]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/
+```
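+
+For a concrete picture of the GKE case, a rendered Kubernetes ServiceAccount
+could look something like the sketch below, where the hub, account, and project
+names are illustrative rather than taken from a real deployment:
+
+```yaml
+# Sketch of a Kubernetes ServiceAccount as the basehub chart could render it.
+# The iam.gke.io/gcp-service-account annotation references the terraform
+# managed GCP ServiceAccount, letting GKE Workload Identity exchange the
+# pod's Kubernetes token for temporary GCP credentials.
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: user-sa
+  namespace: example-hub
+  annotations:
+    iam.gke.io/gcp-service-account: example-hub@example-project.iam.gserviceaccount.com
+```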
diff --git a/terraform/gcp/cluster.tf b/terraform/gcp/cluster.tf
index d1a16f99ad..bc5c147c4e 100644
--- a/terraform/gcp/cluster.tf
+++ b/terraform/gcp/cluster.tf
@@ -300,6 +300,7 @@ resource "google_container_node_pool" "notebook" {
 
     workload_metadata_config {
       # Config Connector requires workload identity to be enabled (via GKE_METADATA_SERVER).
+      # Config Connector hasn't been used since March 2024, see https://github.com/2i2c-org/infrastructure/pull/3778.
       # If config connector is not necessary, we use simple metadata concealment
       # (https://cloud.google.com/kubernetes-engine/docs/how-to/protecting-cluster-metadata)
       # to expose the node CA to users safely.