Infrastructure for MacOS 13.x #3240

Closed
UlisesGascon opened this issue Mar 18, 2023 · 25 comments
@UlisesGascon
Member

It would be beneficial to begin brainstorming ideas on how we can incorporate MacOS 12.x and 13.x into our current infrastructure. I anticipate that we will need to make the machines accessible for both Intel and ARM devices.

To my recollection, we can utilize some of the existing resources in Orka for the Intel machines with our current configuration. However, to accommodate ARM machines, we may require additional infrastructure resources.

What is your opinion, @nodejs/build? Should we discuss this matter in our next meeting?

Issues related:

@UlisesGascon
Member Author

I checked the availability of Orka resources: we have some empty spots on node macpro-4, and we can potentially also use port 8825 on all the nodes. On average, each VM is configured to use 4 vCPU/CPU and 9.40G RAM.

Current Status in Orka

VMs distribution

You can check this in Orka by running orka vm list

| SSH port | Node: macpro-4 | Node: macpro-5 | Node: macpro-6 |
|---|---|---|---|
| 8822 | release-macos11-x64-1 | test-macos1014-x64-1 | test-macos11-x64-1 |
| 8823 | not used | test-macos1014-x64-2 | test-macos11-x64-2 |
| 8824 | not used | test-macos1015-x64-2 | test-macos1015-x64-1 |
| 8825 | not used | release-macos1015-x64-1 | macos1014-x64-3 |

Note that release-macos11-x64-1 is not in the inventory yet, as PR #3185 is still open.

The NAT

The NAT is quite tricky to process, but let me summarize it from #3112 (comment)

| Node | Internal IP | External IP |
|---|---|---|
| macpro-4 | 10.221.188.14 | 199.7.167.100 |
| macpro-5 | 10.221.188.15 | 199.7.167.101 |
| macpro-6 | 10.221.188.16 | 199.7.167.102 |
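To make the relationship between the two tables concrete, here is a minimal Python sketch that combines the NAT table with the VM distribution table. It assumes that each VM is reached through its node's external IP on the listed SSH port; that is my reading of the tables and of #3112, not something confirmed in this thread. The names `EXTERNAL_IP`, `VM_SLOT` and `ssh_endpoint` are hypothetical helpers, not part of any existing tooling.

```python
# External IP per Orka node, copied from the NAT table.
EXTERNAL_IP = {
    "macpro-4": "199.7.167.100",
    "macpro-5": "199.7.167.101",
    "macpro-6": "199.7.167.102",
}

# (node, SSH port) per VM, copied from the "VMs distribution" table.
VM_SLOT = {
    "release-macos11-x64-1": ("macpro-4", 8822),
    "test-macos1014-x64-1": ("macpro-5", 8822),
    "test-macos1014-x64-2": ("macpro-5", 8823),
    "test-macos1015-x64-2": ("macpro-5", 8824),
    "release-macos1015-x64-1": ("macpro-5", 8825),
    "test-macos11-x64-1": ("macpro-6", 8822),
    "test-macos11-x64-2": ("macpro-6", 8823),
    "test-macos1015-x64-1": ("macpro-6", 8824),
    "macos1014-x64-3": ("macpro-6", 8825),
}


def ssh_endpoint(vm_name):
    """Return 'external_ip:port' for a VM.

    Assumption: the NAT forwards the VM's SSH port on the node's external IP.
    """
    node, port = VM_SLOT[vm_name]
    return f"{EXTERNAL_IP[node]}:{port}"


if __name__ == "__main__":
    for name in sorted(VM_SLOT):
        print(name, ssh_endpoint(name))
```

Keeping the node-to-IP and VM-to-slot mappings separate mirrors the two tables, so a node migration would only need to touch one of them.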

Nodes Physical resources distribution

You can check this in Orka by running orka node list

| Node | CPU (available / total) | Memory (available / total) |
|---|---|---|
| macpro-4 | 20 / 24 | 52.59G / 62.78G |
| macpro-5 | 8 / 24 | 24.10G / 62.78G |
| macpro-6 | 8 / 24 | 24.10G / 62.78G |

Future distribution (proposal)

We might want to add at least 2 test VMs per MacOS version, plus 1 for release:

  • test-macos12-x64-1, test-macos12-x64-2 and release-macos12-x64-1
  • test-macos13-x64-1, test-macos13-x64-2 and release-macos13-x64-1

Resources Distribution

| SSH port | Node: macpro-4 | Node: macpro-5 | Node: macpro-6 |
|---|---|---|---|
| 8822 | release-macos11-x64-1 | test-macos1014-x64-1 | test-macos11-x64-1 |
| 8823 | test-macos12-x64-1 | test-macos1014-x64-2 | test-macos11-x64-2 |
| 8824 | test-macos13-x64-1 | test-macos1015-x64-2 | test-macos1015-x64-1 |
| 8825 | test-macos13-x64-2 | release-macos1015-x64-1 | macos1014-x64-3 |
| 8826 | release-macos13-x64-1 | test-macos12-x64-2 | release-macos12-x64-1 |

Nodes Physical resources availability

| Node | CPU (available / total) | Memory (available / total) |
|---|---|---|
| macpro-4 | 4 / 24 | 15.78G / 62.78G |
| macpro-5 | 4 / 24 | 15.78G / 62.78G |
| macpro-6 | 4 / 24 | 15.78G / 62.78G |

We will still have room for another 3 VMs (one more per node) if we want to squeeze the nodes.
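The availability figures in this proposal can be reproduced with a few lines of Python. This is only a sketch: it assumes every VM uses the average size quoted earlier (4 vCPU and 9.40G RAM) and that each macpro node offers 24 vCPU and 62.78G in total. The function names are hypothetical.

```python
# Node size from the tables above: 24 vCPU and 62.78G RAM per macpro node.
NODE_CPU = 24
NODE_MEM_G = 62.78
# Average VM size quoted earlier in the thread (assumed for every VM).
VM_CPU = 4
VM_MEM_G = 9.40


def remaining(vm_count):
    """Return (free vCPU, free RAM in G) on one node hosting vm_count VMs."""
    free_cpu = NODE_CPU - vm_count * VM_CPU
    free_mem = round(NODE_MEM_G - vm_count * VM_MEM_G, 2)
    return free_cpu, free_mem


def extra_vms(vm_count):
    """How many more average-sized VMs fit on a node hosting vm_count VMs."""
    free_cpu, free_mem = remaining(vm_count)
    return min(free_cpu // VM_CPU, int(free_mem // VM_MEM_G))


if __name__ == "__main__":
    # Proposed layout: 5 VMs on each of the 3 macpro nodes.
    print(remaining(5))      # (4, 15.78)
    print(3 * extra_vms(5))  # 3
```

With 5 VMs per node, CPU is the binding constraint: one more 4-vCPU VM fits on each node, which gives the 3 extra VMs across the cluster.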

@mhdawson
Member

Getting some new machines into the mix for 20.x makes a lot of sense to me. I think we usually only release using one version, so for releases we should assess whether we can move up to 12 or 13, but I don't think we'll have resources for both (particularly on the ARM side).

@UlisesGascon
Member Author

Possible Solution for ARM machines in Macstadium

It seems that since Orka 2.0 we can add ARM-based nodes to the existing Intel Orka cluster.

Starting with Orka 2.0, it is now possible to deploy an Orka cluster of Apple silicon-based nodes (also: Apple ARM-based nodes) and run your CI/CD workflows there. The Apple silicon-based nodes can also be added to an existing cluster with Intel-based nodes. Then, it will have both Intel- and Apple silicon-based nodes. A prerequisite for that is to upgrade to Orka 2.0 before adding the new Apple silicon-based nodes. See

Currently, we use a Bare Metal solution in Macstadium for ARM:

  • test-macos11.0-arm64-4: Mac mini G5A (AS/M1/8C/8G/256G/SSD/1G)
  • test-macos11.0-arm64-3: Mac mini G5A (AS/M1/8C/8G/256G/SSD/1G)
  • release-macos11.0-arm64-1: Mac mini G5A (AS/M1/8C/8G/256G/SSD/1G)

I believe these machines are bigger than we need as bare metal, but we could convert these resources into Orka nodes. For Intel we use the Orka C1 Build (24 vCPU + 60 GB) node, so I assume the resources for ARM nodes are similar. We could then extend the current Orka cluster by converting the existing Bare Metal resources and adding ARM-specific 12.x and 13.x VMs:

Resources Distribution

| SSH port | Node: macpro-4 | Node: macpro-5 | Node: macpro-6 | Node: ARM-1 | Node: ARM-2 |
|---|---|---|---|---|---|
| 8822 | release-macos11-x64-1 | test-macos1014-x64-1 | test-macos11-x64-1 | test-macos11.0-arm64-3 | test-macos11.0-arm64-4 |
| 8823 | test-macos12-x64-1 | test-macos1014-x64-2 | test-macos11-x64-2 | release-macos11.0-arm64-1 | release-macos12.0-arm64-1 |
| 8824 | test-macos13-x64-1 | test-macos1015-x64-2 | test-macos1015-x64-1 | release-macos13.0-arm64-1 | test-macos12.0-arm64-1 |
| 8825 | test-macos13-x64-2 | release-macos1015-x64-1 | macos1014-x64-3 | test-macos12.0-arm64-2 | test-macos13.0-arm64-1 |
| 8826 | release-macos13-x64-1 | test-macos12-x64-2 | release-macos12-x64-1 | test-macos13.0-arm64-2 | empty |

Obviously we need to check with Macstadium whether this is possible. I think that we are on Orka v2.0 or higher, based on #3112 (comment).

We can also check with Nearform whether we can extend the current nodes in their infra to support 12.x and 13.x.

@AshCripps
Member

Our Orka cluster is 2.3.1 according to the support ticket when they upgraded.

@mhdawson
Member

Using Orka makes sense to me. Maybe the next step is to open a ticket with Mac Stadium asking if we can get an ARM-based machine added to our free account?

@UlisesGascon
Member Author

I will start working on test-macos12-x64-1 and test-macos13-x64-1 (REF: #3240 (comment)) so I can update the Ansible scripts and create the base VMs in Orka. I will need some support to add the new machines to Jenkins, as I am not an admin.

Our Orka cluster is 2.3.1 according to the support ticket when they upgraded.

Thanks @AshCripps!

Using Orka makes sense to me. Maybe the next step is to open a ticket with Mac Stadium asking if we can get an ARM-based machine added to our free account?

@mhdawson do you want me to contact support?

UlisesGascon added a commit to UlisesGascon/build that referenced this issue Mar 25, 2023
@UlisesGascon
Member Author

I will start working on test-macos12-x64-1 and test-macos13-x64-1 (REF: #3240 (comment)) so I can update the Ansible scripts and create the base VMs in Orka. I will need some support to add the new machines to Jenkins, as I am not an admin.

Current progress

I created the VM test-macos12-x64-1 from scratch using the official images provided by Orka. This VM uses the image macos12-x64, which is a clean installation with only a password change to match the setup instructions in the private repo. The base image macos12-x64 can be used to create release or test VMs, as there are no SSH keys associated with it. #3258 will add test-macos12-x64-1 to the inventory. The next step will be to run Ansible on the machine and create the tags and settings in Jenkins.

Regarding test-macos13-x64-1, I had some issues starting up the operating system, both when using the official Orka Ventura image and when creating a new VM based on macos12-x64 with a manual upgrade to Ventura. I created a ticket asking MacStadium for support.

@mhdawson
Member

@mhdawson do you want me to contact support?

sure

@UlisesGascon
Member Author

Regarding test-macos13-x64-1, I had some issues starting up the operating system, both when using the official Orka Ventura image and when creating a new VM based on macos12-x64 with a manual upgrade to Ventura. I created a ticket asking MacStadium for support.

They confirmed that the current Intel nodes (Mac Pro) cannot run Ventura in this environment due to Apple restrictions, so I am checking whether we can use a different node type to run Ventura on Intel with the current Orka settings. I am also validating with them whether the ARM nodes can run Ventura and Monterey.

@UlisesGascon
Member Author

They confirmed that the current Intel nodes (Mac Pro) cannot run Ventura in this environment due to Apple restrictions, so I am checking whether we can use a different node type to run Ventura on Intel with the current Orka settings. I am also validating with them whether the ARM nodes can run Ventura and Monterey.

If we migrate the existing nodes to Mac minis, then we can add Ventura VMs on Intel. I am checking the specs with Support so I can update the Resources Distribution table. Ventura is fully supported on Mac mini ARM too.

⚠️ Important

I assume that we will have some downtime while migrating between nodes, as we will need to redeploy the existing VMs in the correct spots inside the new nodes to keep the same IPs and ports. I assume this will be very similar to #3112. We can plan it ahead and try to minimize the impact.

In the meantime we can continue working on adding Macos 12.x vms to the existing resources, so we are not blocked before the nodes migration.

As a potential date for this migration, I propose April 15-16, as it is a weekend and I will be more available, so I can focus on redeploying the VMs and solving any additional issues that might appear.

I assume that the builds will continue working with the available resources in Nearform, so the pipelines might slow down a bit during the migration, but the builds will not be blocked 🤞

@UlisesGascon
Member Author

TL;DR:

No changes have been made to any machine in the inventory yet. I have made some changes to the distribution of resources and node ideas based on feedback from customer support. The discussion with customer support is now complete, and we can move on to the next steps. Please refer to the "Next Steps" heading at the bottom.

Current Status

Response from Macstadium support:

We do not recommend any specific VM configuration as each client, build, or job may require different resources to varying degrees. That being said Apple’s Hypervisor Framework (HVF) does not allow more than 2 VMs to be deployed on a single M1 or M2 device at a time. With that in mind, you could run 2 VMs with 4 CPU cores and 8 GB of RAM apiece, and you would most likely see better build times than on your Mac Pros. The one caveat to this is that if your build is very I/O intensive then you may experience longer build times than on your Mac Pros (this is related to how HVF utilizes virtual storage, we are investigating possible ways to alleviate this bottleneck).

The only hardware that we have that can’t support Ventura is the Mac Pro, so any other device will support it. Unfortunately, I don’t have the specs of each of the machines that we offer, but our Customer Success team should be able to supply that information for you.

I closed the support ticket SERVICE-157786. The support team's recommendation is to discuss this with the Customer Success team (I am not sure whether we already have a direct contact; can we check?). We can do that as a separate action before we start making changes to the nodes.

Based on the latest discussions with support, we will most likely end up with the following scenario, with some minor changes to the Orka nodes or to the VM distribution inside the nodes:

Macstadium resources:

  • test-macos11.0-arm64-4: Mac mini G5A (AS/M1/8C/8G/256G/SSD/1G). Delete resource (VM will replace it in Orka)
  • test-macos11.0-arm64-3: Mac mini G5A (AS/M1/8C/8G/256G/SSD/1G). Delete resource (VM will replace it in Orka)
  • release-macos11.0-arm64-1: Mac mini G5A (AS/M1/8C/8G/256G/SSD/1G). Delete resource (VM will replace it in Orka)
  • ✅ Orka Node: macpro-4 (24 vCPU/60G). Remains
  • ✅ Orka Node: macpro-5 (24 vCPU/60G). Remains
  • ✅ Orka Node: macpro-6 (24 vCPU/60G). Remains
  • 🆕 Orka Node: macmini-intel-1 (TBC: 8vCPU/16GB/~900G NVMe SSD). New addition for Macos13 Intel VMs
  • 🆕 Orka Node: macmini-intel-2 (TBC: 8vCPU/16GB/~900G NVMe SSD). New addition for Macos13 Intel VMs
  • 🆕 Orka Node: macmini-arm-1 (TBC: 8vCPU/16GB/~900G NVMe SSD). New addition for ARM support (11.x,12.x,13.x)
  • 🆕 Orka Node: macmini-arm-2 (TBC: 8vCPU/16GB/~900G NVMe SSD). New addition for ARM support (11.x,12.x,13.x)
  • 🆕 Orka Node: macmini-arm-3 (TBC: 8vCPU/16GB/~900G NVMe SSD). New addition for ARM support (11.x,12.x,13.x)
  • 🆕 Orka Node: macmini-arm-4 (TBC: 8vCPU/16GB/~900G NVMe SSD). New addition for ARM support (11.x,12.x,13.x)
  • 🆕 Orka Node: macmini-arm-5 (TBC: 8vCPU/16GB/~900G NVMe SSD). New addition for ARM support (11.x,12.x,13.x)

The NAT

| Node | Internal IP | External IP |
|---|---|---|
| macpro-4 | 10.221.188.14 | 199.7.167.100 |
| macpro-5 | 10.221.188.15 | 199.7.167.101 |
| macpro-6 | 10.221.188.16 | 199.7.167.102 |
| macmini-intel-1 | 10.221.188.17 | 199.7.167.103 |
| macmini-intel-2 | 10.221.188.18 | 199.7.167.104 |
| macmini-arm-1 | 10.221.188.19 | 199.7.167.105 |
| macmini-arm-2 | 10.221.188.20 | 199.7.167.106 |
| macmini-arm-3 | 10.221.188.21 | 199.7.167.107 |
| macmini-arm-4 | 10.221.188.22 | 199.7.167.108 |
| macmini-arm-5 | 10.221.188.23 | 199.7.167.109 |

Resources Distribution

| SSH port | Node: macpro-4 | Node: macpro-5 | Node: macpro-6 | Node: macmini-intel-1 | Node: macmini-intel-2 | Node: macmini-arm-1 | Node: macmini-arm-2 | Node: macmini-arm-3 | Node: macmini-arm-4 | Node: macmini-arm-5 |
|---|---|---|---|---|---|---|---|---|---|---|
| 8822 | release-macos11-x64-1 | test-macos1014-x64-1 | test-macos11-x64-1 | test-macos13-x64-1 | test-macos13-x64-2 | test-macos11-arm64-3 | test-macos12-arm64-1 | test-macos13-arm64-1 | test-macos13-arm64-2 | test-macos11-arm64-4 |
| 8823 | test-macos12-x64-1 | test-macos1014-x64-2 | test-macos11-x64-2 | release-macos13-x64-1 | empty | release-macos11-arm64-1 | test-macos12-arm64-2 | release-macos12-arm64-1 | release-macos13-arm64-1 | empty |
| 8824 | test-macos12-x64-2 | test-macos1015-x64-2 | test-macos1015-x64-1 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| 8825 | release-macos12-x64-1 | release-macos1015-x64-1 | macos1014-x64-3 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| 8826 | empty | empty | empty | N/A | N/A | N/A | N/A | N/A | N/A | N/A |

Opportunity
The port 8826 won't be used for macpro-*, but it can be used if needed.

The impact

With this setup the impact on machine availability is lower than in the previous proposals, but there is still an impact.

Future notes

When we remove support for macos10.14, we will be able to move the VMs from macpro-4 to macpro-5 and macpro-6, so we can decommission the macpro-4 node.

VM Resources

  • VMs under macpro nodes will have 4 vCPU/CPU, 9.40G RAM and 90GB Disk space
  • ⚠️ VMs under macmini-* nodes will have 4 vCPU/CPU, 8G RAM and 400G Disk space
  • We have some empty spots that we can use for other purposes
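As a sanity check on the macmini-* sizing, the following Python sketch verifies that two VMs of the planned size fit on one node and that a third would not. It uses the TBC node size from the list above and the 2-VM limit quoted by Macstadium support for M1/M2 hosts. The `Size` class and the function names are hypothetical, the ~900G disk is treated as exactly 900G, and host overhead is ignored, so treat the result as an upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    vcpu: int
    ram_g: int
    disk_g: int


# TBC node size from the proposed macmini-* nodes (disk is "~900G").
MACMINI_NODE = Size(vcpu=8, ram_g=16, disk_g=900)
# Planned VM size on macmini-* nodes.
MACMINI_VM = Size(vcpu=4, ram_g=8, disk_g=400)
# Limit quoted by Macstadium support for a single M1 or M2 device.
MAX_VMS_PER_APPLE_SILICON_NODE = 2


def fits(node, vm, count):
    """True if `count` VMs of size `vm` fit within `node` (no host overhead)."""
    return (
        count * vm.vcpu <= node.vcpu
        and count * vm.ram_g <= node.ram_g
        and count * vm.disk_g <= node.disk_g
    )


def max_vms(node, vm, hard_limit):
    """Largest VM count allowed by both the resources and the hard limit."""
    by_resources = min(
        node.vcpu // vm.vcpu,
        node.ram_g // vm.ram_g,
        node.disk_g // vm.disk_g,
    )
    return min(by_resources, hard_limit)


if __name__ == "__main__":
    print(max_vms(MACMINI_NODE, MACMINI_VM, MAX_VMS_PER_APPLE_SILICON_NODE))
```

Under these assumptions, CPU, RAM, disk and the HVF limit all cap each macmini node at 2 VMs, which matches the two SSH ports (8822 and 8823) assigned per macmini node in the distribution table.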

Next steps

Important: Do we have any critical deadline? Could this impact the Node 20 release?

  • Reach agreement on this plan within the @nodejs/build team
  • Contact the Customer Success team to fine-tune this plan and get approval from Macstadium for the potential resource changes.
  • Continue preparing the Macos12 images inside Orka following the distribution schema for the macpro-* nodes (@UlisesGascon will lead this part)

@mhdawson
Member

mhdawson commented Apr 4, 2023

I think to change hw we'll need to get the ok that it will be covered under the free sponsorship that we get. @UlisesGascon I assume we've not discussed that yet. We'll need to figure out who the right contact might be, maybe from the ticket where they helped remove the charges from creating new machines by accident?

@UlisesGascon
Member Author

We made good progress in #3299. 🎉

In the following days I will post a new estimate of how we can manage the current Orka / Bare Metal resources to add support for 12.x (ignoring 13.x for now), using the empty slots freed up after dropping support for 10.14 (#3087 (comment)).

@UlisesGascon
Member Author

After the progress made in #3087, the current VM distribution in Orka is:

Resources Distribution

| SSH port | Node: macpro-4 | Node: macpro-5 | Node: macpro-6 |
|---|---|---|---|
| 8822 | release-macos11-x64-1 | empty | test-macos11-x64-1 |
| 8823 | test-macos12-x64-1 | empty | test-macos11-x64-2 |
| 8824 | empty | test-macos1015-x64-2 | test-macos1015-x64-1 |
| 8825 | empty | release-macos1015-x64-1 | empty |

@mhdawson
Member

mhdawson commented May 4, 2023

@UlisesGascon great to see the progress being made.

UlisesGascon added a commit that referenced this issue May 8, 2023
@UlisesGascon
Member Author

Machine test-macos12-x64-1 is now accessible via SSH as a regular orka-test machine. I will need to work on the Ansible script to make it compatible with macos12.

(Screenshot, 2023-05-08 at 22:44:09)

I will create a separate PR to update the secrets inventory.yml once I am clear on the final Ansible settings.

@UlisesGascon
Member Author

As discussed in our last meeting (#3604) we will skip macos 12.x and jump to macos 13.x

@UlisesGascon
Member Author

I created the first VM with the Ventura (macos13) image and I am having issues logging in via VNC, so I opened the ticket SERVICE-177494.

@UlisesGascon
Member Author

The login problem has already been solved.

@targos
Member

targos commented Apr 16, 2024

What's the status? Can I do anything to help move this forward?

@targos
Member

targos commented Apr 16, 2024

I'm asking because osx11-x64 is a problematic bottleneck with only 2 machines and often space issues (currently 1 machine is offline)

@UlisesGascon

This comment was marked as outdated.

@UlisesGascon UlisesGascon changed the title Infrastructure for MacOS 13.x Infrastructure for Orka (2024 and beyond) Apr 19, 2024
@UlisesGascon UlisesGascon changed the title Infrastructure for Orka (2024 and beyond) Infrastructure for MacOS 13.x Apr 19, 2024
@UlisesGascon
Member Author

UlisesGascon commented Apr 19, 2024

I am closing this in favor of #3686, which centralizes the Orka changes, including the addition of MacOS 13 to Orka. Feel free to reopen this issue if needed.

@targos targos unpinned this issue May 2, 2024