
Manually create a cluster (no terraform) #5355

Closed · 3 tasks done · Tracked by #5351
GeorgianaElena opened this issue Jan 8, 2025 · 31 comments

GeorgianaElena (Member) commented Jan 8, 2025

Resources to get us started:

Definition of done:

GeorgianaElena changed the title from "3. [2d] Manually create a cluster (no terraform) - Est. 5" to "[2 days] Manually create a cluster (no terraform)" on Jan 8, 2025
GeorgianaElena (Member Author) commented:

@jmunroe, I believe you mentioned that there were some docs about this from the Project Pythia side 🤔? Could you please share those here so we can have them as a reference when we start working on this on Monday?

sgibson91 (Member) commented:

I interpreted James' comment as "Project Pythia followed JetStream's docs and they got through it fine", just to give a different interpretation. May be wrong though!

GeorgianaElena (Member Author) commented:

@sgibson91 makes sense. My understanding was that they experimented with it and documented the whole process 🤷‍♀
I'm putting the jetstream getting started docs in the top comment so we have them as a reference.

jmunroe (Contributor) commented Jan 10, 2025

My assumption is that these are the correct starting points:

There have been previous uses of kubernetes on JetStream2 such as kubespray (docs) but I had understood those to be fairly manual, non-scalable ways of deploying a cluster.

My ask of the JetStream2 team has been a 'managed kubernetes service' that we can build on top of. I think OpenStack Magnum and ClusterAPI are some of the enabling technologies used by the JetStream2 team, but I am not entirely up to speed on the details.

My primary contact at JetStream has been Julian Pistorius ([email protected]). Julian is already on our 2i2c slack.

GeorgianaElena (Member Author) commented:

Current state

@sgibson91 and I started deploying a cluster today to the new allocation that @jmunroe created for us.

The process currently fails with CREATE_FAILED:

status_reason  | Failed to create trustee or trust for Cluster

While investigating this, we realized that we lack the credentials to run commands such as:

  • openstack user show trustee_domain_admin
    ForbiddenException: 403: Client Error for url: https://js2.jetstream-cloud.org:5000/v3/users?name=trustee_domain_admin

  • openstack service list
    ForbiddenException: 403: Client Error for url: https://js2.jetstream-cloud.org:5000/v3/services, You are not authorized to perform the requested action: identity:list_services.

What we've tried

We tried creating new, more permissive application credentials with the 'unrestricted (dangerous)' option enabled, hoping that would give us the right to run the create cluster command, but it didn't work.
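For reference, a minimal sketch of what we tried, assuming the standard openstackclient CLI (the credential name is illustrative):

# Create application credentials with the "unrestricted (dangerous)" option
$ openstack application credential create 2i2c-magnum-cred --unrestricted
# Source the generated openrc / clouds.yaml entry for the new credential, then retry:
$ openstack user show trustee_domain_admin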

The only available roles for us are the ones in the screenshot below:

[screenshot: the roles available to our credentials]

From the blog post we are following, it looks like the loadbalancer role should fix it, but that role isn't available to us. Also, given that we're not able to list resources like services either, we might need more than that, such as access to the identity API endpoint.

jmunroe (Contributor) commented Jan 15, 2025

I've emailed Julian to seek additional guidance (See https://2i2c.freshdesk.com/a/tickets/2690).

jmunroe (Contributor) commented Jan 15, 2025

Potentially relevant Jetstream2 issues:

GeorgianaElena (Member Author) commented Jan 16, 2025

Thank you @jmunroe! I also found some interesting docs about:
- the openstack identity service https://docs.openstack.org/mitaka/install-guide-obs/common/get_started_identity.html that might help us
- magnum specific identity service (keystone): https://docs.openstack.org/magnum/latest/user/#keystone-authn-and-authz

Nvm, we still need permissions to the identity endpoint!

GeorgianaElena (Member Author) commented:

Update:

@sgibson91 and I opened a ticket on JetStream2 at https://jetstream-cloud.org/contact/index.html asking for guidance about this permission error. Confirmation email https://2i2c.freshdesk.com/a/tickets/2691

jmunroe (Contributor) commented Jan 16, 2025

I've had success creating a Kubernetes cluster using Magnum following Andrea Zonca's blog post. When I created the application credentials I did select the 'unrestricted (dangerous)' option. I still can't run all openstack commands (like openstack service list), but those do not seem to be actual blockers to deploying a scalable kubernetes cluster.

I did need to be patient though. It took 117 minutes for the cluster to be created, while in Zonca's post he timed it at 9 minutes. Perhaps now that the images have been copied over, it will run faster?
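For anyone reproducing this, a sketch of how creation progress can be watched (standard Magnum CLI; the cluster name is from my run):

# Poll the cluster status (CREATE_IN_PROGRESS / CREATE_COMPLETE / CREATE_FAILED)
$ openstack coe cluster list
# Show details, including status_reason, for one cluster
$ openstack coe cluster show k8s_jmunroe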

I think we should increase the quota on the number of Volumes that openstack allows. I'll submit that support ticket to the Jetstream2 team.

@GeorgianaElena Please let me know if you'd like to meet tomorrow so we can make sure you are able to do what appears to work for me. I'll grab an early morning slot on your calendar.

jmunroe (Contributor) commented Jan 16, 2025

I requested an increase from 10 Volumes to 30 Volumes through the JS2 help desk.

jmunroe (Contributor) commented Jan 16, 2025

Quota for Volumes now set to 30.

I deleted my test k8s_jmunroe cluster and I am trying again to create a cluster called k8s.

My hope is that it will be faster than 117 minutes this time, but it is currently at 25 minutes and still pending. I'll report back how long it actually takes.

jmunroe (Contributor) commented Jan 16, 2025

I'm happy to see that the JS2 support desk was able to respond and take action on my request in <60 minutes!

jmunroe (Contributor) commented Jan 17, 2025

Unfortunately, attempt 2 was not successful. The cluster creation was stuck in a 'CREATE_IN_PROGRESS' state. It appears that a control plane node gets created, persists for about 60 minutes, is killed, then is recreated.

Behind the scenes, my understanding of openstack coe create cluster is that Magnum uses the orchestration service Heat to build the cluster.

Using openstack stack list is supposed to allow us to see the Heat orchestration doing its thing, but Magnum first creates its own application credentials and I am guessing the stack is actually owned by those credentials and not mine.
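For the record, the Heat-side commands we would want here (they assume the python-heatclient plugin and that the stack is visible to our credentials, which it apparently is not):

# List Heat stacks owned by the current credentials
$ openstack stack list
# Inspect the resources and events of a specific stack
$ openstack stack resource list <stack-name-or-id>
$ openstack stack event list <stack-name-or-id>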

I made a few attempts at adding a private ssh keypair and creating security groups so I could log in to the control plane, poke around, and try to find some logs, but I don't think I was setting up the openstack networking correctly. I had a floating IP assigned and the SSH port was open, but no luck. (A sketch of the attempted sequence is below.)
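Roughly the sequence I was attempting, as a sketch (key, server, and security group names are illustrative; 'public' is Jetstream2's external network):

# Register a public key and open SSH in the relevant security group
$ openstack keypair create --public-key ~/.ssh/id_ed25519.pub jmunroe-key
$ openstack security group rule create --protocol tcp --dst-port 22 <security-group>
# Attach a floating IP to the control plane node and try to log in
$ openstack floating ip create public
$ openstack server add floating ip <control-plane-server> <floating-ip>
$ ssh ubuntu@<floating-ip>    # this is what failed for us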

jmunroe (Contributor) commented Jan 17, 2025

A key blog post for understanding what has actually been set up in Jetstream2 is https://stackhpc.com/magnum-clusterapi.html . Importantly, references to 'Heat templates' describe the older way Magnum was used to deploy Kubernetes.

Magnum Cluster API Helm Driver docs

The newer way (which will be easier to maintain) is to use a Magnum driver for ClusterAPI. One approach is to deploy a 'management cluster' with ClusterAPI (somewhere; it doesn't actually need to be on Jetstream2 itself) and then use that management cluster to launch new workload kubernetes clusters.

My big assumption is that this management cluster is something we as users of Jetstream2 don't actually have to deploy ourselves; somehow openstack coe cluster create uses that management cluster to do the work of launching and maintaining the kubernetes workload cluster.

jmunroe (Contributor) commented Jan 20, 2025

See notes for a summary of lessons learned from this spike around deploying Kubernetes clusters on Jetstream2 using Magnum with the ClusterAPI driver.

Before moving on to deploying JupyterHub, I think we can identify the following 'next steps' that we need to resolve:

  • Deploying a cluster with openstack coe cluster create sometimes works, or only completes after a delay of several hours. We need a better understanding of what is blocking these deployments and better tooling for tracking the deployment of a cluster. These additional tools may require access to the CAPI management cluster on JetStream2. We need guidance from the Jetstream2 team.
  • We have been able to successfully create a Kubernetes cluster, but we need to be able to share access to the cluster between team members at 2i2c. Currently, only the credentials that created the cluster appear to be able to download the KUBECONFIG needed to administer the cluster.

jmunroe (Contributor) commented Jan 20, 2025

I've been pleased with the progress we've made on this iteration. I especially appreciate the mental model of Jetstream2/Openstack/Magnum/ClusterAPI as components we are building up. I think we should close this current issue, identify the specific blockers, and add those to the next iteration to resolve. Possible next tasks:

Consistently deploy kubernetes cluster with openstack coe cluster create

Deploying a cluster with openstack coe cluster create sometimes works, or only completes after a delay of several hours. We need a better understanding of what is blocking these deployments and better tooling for tracking the deployment of a cluster. These additional tools may require access to the CAPI management cluster on JetStream2. Seek guidance from the Jetstream2 team.

Share kubeconfig credentials between admins of openstack clusters

We have been able to successfully create a Kubernetes cluster, but we need to be able to share access to the cluster between team members at 2i2c. Currently, only the credentials that created the cluster appear to be able to download the KUBECONFIG needed to administer the cluster. Example: if k8s is a cluster that you didn't create, then:

$ openstack coe cluster config k8s
Policy doesn't allow certificate:get to be performed (HTTP 403) (Request-ID: req-8c837745-d2d8-440a-8495-059c98bb646f)

Is there a policy configuration that needs to be done by the Jetstream2 team?

Configure Magnum UI plugin for Horizon interface

Magnum provides a Horizon plugin so that users can access the Container Infrastructure Management service through OpenStack's browser-based graphical UI. The plugin is available from magnum-ui. It is not installed by default in the standard Horizon service, but you can follow the instructions for installing a Horizon plugin. See
https://docs.openstack.org/magnum/latest/user/#horizon-interface . Request that Jetstream2 deploy this plugin.

SSH key pair access to Kubernetes control plane nodes

We expect that adding an SSH keypair when creating a cluster should allow us to SSH directly into the control plane nodes. We have explored adding nodes to Security Groups and opening floating IPs. We have also tried openstack server ssh. What is required to SSH into a specific node? SSH might be a useful tool for debugging a kubernetes cluster. (A sketch of the expected flow follows.)
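A hedged sketch of what we expect should work, based on the standard Magnum CLI flags (the keypair name is illustrative; the template name is the one we have been using):

# Register the keypair first, then pass it at cluster creation time
$ openstack keypair create --public-key ~/.ssh/id_ed25519.pub 2i2c-key
$ openstack coe cluster create k8s --cluster-template kubernetes-1-31-jammy --keypair 2i2c-key --node-count 1 --master-count 1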

@sgibson91 and @GeorgianaElena are there other specific blockers we could define as issues for the next iteration?

GeorgianaElena (Member Author) commented:

@jmunroe, I have played around today with the cluster that magically appeared as ready over the weekend. I've put my findings at https://hackmd.io/DzgY3PW4TUOMAmpqvgOdZw?view#Experimenting-with-a-running-healthy-cluster. But to summarize:

SSH key pair access to Kubernetes control plane nodes

This worked (kind of). If I had created the cluster with the --keypair option, I would have had access. I didn't find a way around this (updating the metadata, or logging into the instance with generic usernames like ubuntu or root to then add the key, both failed).

So the fact that this doesn't work on control plane nodes belonging to clusters stuck in CREATE_IN_PROGRESS might mean that these nodes, even though they appear as ACTIVE, might actually be missing a few things.

kubectl access

I still couldn't get access to the cluster with kubectl. For that matter, pinging the public IP of the cluster fails. There are firewall rules permitting access, but it still doesn't work.

I saw something weird while investigating the network setup: the loadbalancers that get created with the cluster appear as OFFLINE, and there's no way to manually bring them up. I believe they are control plane loadbalancers, which would make sense not to be set up when there's only one control plane node. (See the inspection sketch below.)
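For reference, a sketch of how the loadbalancer state can be inspected (assumes the python-octaviaclient plugin is installed):

# List loadbalancers and check operating_status / provisioning_status
$ openstack loadbalancer list
$ openstack loadbalancer show <loadbalancer-id>
# Listeners should include the kubernetes API port (6443)
$ openstack loadbalancer listener list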

jmunroe (Contributor) commented Jan 20, 2025

I think we can infer that the first control plane node is being created but is not coming up fully (which is why it is killed and restarted after 60 minutes).

I think this is what the cloud init procedure is running.

I wonder if just spinning up an instance of the image ubuntu-jammy-kube-v1.30.4-240828-1653 by itself would give us some insight. Trying that now.

jmunroe (Contributor) commented Jan 20, 2025

No great insights. It appears starting a single instance of ubuntu-jammy-kube-v1.30.4-240828-1653 works fine and I am able to ssh into it.

julianpistorius commented:

Hello @jmunroe, @GeorgianaElena, and @sgibson91.

I'm catching up with this issue and the related ACCESS support ticket. It looks like you have a good grasp of the process for creating clusters, and have had some success.

My big assumption is that this management cluster is something we as users of Jetstream2 don't actually have to deploy ourselves but somehow openstack coe cluster create uses that management cluster to do the work of launching and maintaining the kubernetes workload cluster.

This assumption is indeed correct. There is a single Cluster API management cluster for all workload clusters created with Magnum/openstack coe cluster create.

julianpistorius commented:

Hello @GeorgianaElena & @jmunroe,

As I wrote in the support ticket:

Our engineers have identified and remediated a networking problem which was affecting the creation of new clusters in projects like yours. Could you please try to create a few test clusters and let us know how it goes?

I’ll talk to the team and get back to you about debugging strategies.

jmunroe (Contributor) commented Jan 21, 2025

Yep! That definitely fixed something. 10 minutes to get to

$ kubectl get nodes -A
NAME                                                  STATUS   ROLES           AGE     VERSION
k8s-jmunroe-4iangcz2e3wb-control-plane-k4mjp          Ready    control-plane   5m30s   v1.30.4
k8s-jmunroe-4iangcz2e3wb-control-plane-rchd2          Ready    control-plane   8m7s    v1.30.4
k8s-jmunroe-4iangcz2e3wb-control-plane-w25mk          Ready    control-plane   3m59s   v1.30.4
k8s-jmunroe-4iangcz2e3wb-default-worker-xqrt2-2btgf   Ready    <none>          5m26s   v1.30.4

@GeorgianaElena could you try creating some kubernetes clusters to confirm it works robustly?

yuvipanda changed the title from "[2 days] Manually create a cluster (no terraform)" to "Manually create a cluster (no terraform)" on Jan 22, 2025
sgibson91 (Member) commented Jan 22, 2025

My two clusters that had been creating since Monday have now completed.

I managed to get the kubeconfig file and run kubectl against the cluster:

❯ openstack coe cluster config $K8S_CLUSTER_NAME
export KUBECONFIG=/Users/sgibson/source/github/jupyterhub-deploy-kubernetes-jetstream/kubernetes_magnum/config

❯ k get nodes --kubeconfig ./config
NAME                                                       STATUS   ROLES           AGE   VERSION
sgibson-test-ssh-gnwcldiv73i5-control-plane-9zr6h          Ready    control-plane   16h   v1.30.4
sgibson-test-ssh-gnwcldiv73i5-default-worker-r4kfm-2lr8r   Ready    <none>          16h   v1.30.4

I also verified that exporting the kubeconfig file and using it with the deployer also permits kubectl access: #5357 (comment)

jmunroe (Contributor) commented Jan 22, 2025

@GeorgianaElena and I have verified that if we use the same Application Credentials then we can manage each other's kubernetes clusters.

I've shared the credentials for the CIS250031 allocation with the 2i2c team through Bitwarden. I think that is "secure", but I'll take guidance if there is a better way of sharing those credentials between members of the engineering team. I think we already do something like this to allow us to use the deployer command, but I don't know the details of how that is actually set up.

GeorgianaElena (Member Author) commented Jan 22, 2025

I think we already do something like this to allow us to use the deployer command but I don't know the details of how that actually is set up.

I believe we can store the credentials encrypted, just like we do with the actual kubeconfig, and then tweak the deployer to use those credentials to authenticate before using the kubeconfig. (Sketch below.)
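A minimal sketch of what that could look like, assuming our existing sops-based encryption workflow (the file path is hypothetical):

# Encrypt the openstack application credentials in the repo (path is hypothetical)
$ sops --encrypt --in-place config/clusters/<cluster>/enc-openstack-creds.secret.yaml
# The deployer would decrypt and export them before using the kubeconfig
$ sops --decrypt config/clusters/<cluster>/enc-openstack-creds.secret.yaml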

@GeorgianaElena could you try creating some kubernetes clusters to confirm it works robustly?

Yes, I created two clusters today which took about 10min, so all good now 🎉 . Thank you!

GeorgianaElena assigned jmunroe and unassigned sgibson91 on Jan 22, 2025
jmunroe (Contributor) commented Jan 23, 2025

It appears that autoscaling and manual scaling of the kubernetes cluster do not work well together. If I deploy a cluster and omit the lines relating to autoscaling:

https://github.com/zonca/jupyterhub-deploy-kubernetes-jetstream/blob/master/kubernetes_magnum/create_cluster.sh#L21-L23

then I can successfully upscale and downscale the kubernetes cluster manually using

openstack coe cluster resize --nodegroup default-worker $K8S_CLUSTER_NAME {num-nodes}

(Reference: zonca/zonca.dev#9 (comment))

yuvipanda (Member) commented:

If it's one or the other, autoscaling is what needs to work. Basically I want us to be able to trust that nodes will be able to come and go.

I think a simple way to test autoscaling is to create a basic deployment object and try to increase the number of replicas it has. This should trigger new nodes. Then if you reduce it, it should clean up nodes. Repeat until you see nodes come and go at least 3 times. (A sketch of this test follows.)
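A minimal sketch of that test, assuming kubectl >= 1.19 (deployment name and replica counts are illustrative):

# Create a basic deployment, then over-subscribe it to force a scale-up
$ kubectl create deployment scale-test --image=nginx --replicas=1
$ kubectl scale deployment scale-test --replicas=100
# Watch nodes join; then scale back down and watch the autoscaler remove them
$ kubectl get nodes --watch
$ kubectl scale deployment scale-test --replicas=1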

jmunroe (Contributor) commented Jan 24, 2025

I've verified that autoscaling up and then back down has worked (following https://satishdotpatel.github.io/kubernetes-cluster-autoscaler-with-magnum-capi/):

  1. Created an autoscaling k8s cluster:

     openstack coe cluster create k8s --cluster-template kubernetes-1-31-jammy --node-count 1 --master-count 1 --labels auto_scaling_enabled=true,min_node_count=1,max_node_count=3

  2. Deployed a sample application (with 2 replicas):

     kubectl create -f https://k8s.io/examples/application/deployment.yaml

  3. Scaled up the number of replicas:

     kubectl scale deployment --replicas=120 nginx-deployment

  4. Observed that only 96 pods started right away; the remainder were pending.

  5. The autoscaler triggered adding another node to the workload cluster.

  6. Observed that all 120 replicas were running.

  7. Scaled down the number of replicas:

     kubectl scale deployment --replicas=20 nginx-deployment

  8. Observed that the remaining pods were allocated onto one worker node.

  9. After about 10 minutes, the autoscaler triggered removal of the no-longer-needed node.

Comment about openstack's nodegroups

I was watching the output of

openstack coe nodegroup list

during this experiment. I was expecting to see the node_count for the default-worker group go from 1 to 2 and then back down to 1. That was not observed.

$ openstack coe nodegroup show k8s default-worker
+--------------------+--------------------------------------------------------------------------------+
| Field              | Value                                                                          |
+--------------------+--------------------------------------------------------------------------------+
| uuid               | cc886e66-f930-4549-8e86-f3ed18a16994                                           |
| name               | default-worker                                                                 |
| cluster_id         | 4ed10a85-4e7d-4752-8a14-c404b5054d51                                           |
| project_id         | 390542082bd74fa6abcde82f8c7ded89                                               |
| docker_volume_size | None                                                                           |
| labels             | {'auto_scaling_enabled': 'true', 'min_node_count': '1', 'max_node_count': '3'} |
| labels_overridden  | {}                                                                             |
| labels_skipped     | {}                                                                             |
| labels_added       | {}                                                                             |
| flavor_id          | m3.small                                                                       |
| image_id           | ubuntu-jammy-kube-v1.31.0-240828-1652                                          |
| node_addresses     | []                                                                             |
| node_count         | 1                                                                              |
| role               | worker                                                                         |
| max_node_count     | None                                                                           |
| min_node_count     | 1                                                                              |
| is_default         | True                                                                           |
| stack_id           | k8s-qlwp2ttlg4e7                                                               |
| status             | CREATE_COMPLETE                                                                |
| status_reason      | None                                                                           |
+--------------------+--------------------------------------------------------------------------------+

My working assumption is that the fields node_count, max_node_count, and min_node_count are NOT used by the new ClusterAPI-backed Magnum driver. The autoscaling is using the labels: {'auto_scaling_enabled': 'true', 'min_node_count': '1', 'max_node_count': '3'}.

julianpistorius commented:

Yes, manual scaling & auto-scaling are mutually exclusive.
