Skip to content

[ARO-9395] FixSSH refactoring for CPMS enablement #4177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

kevinobriendotca
Copy link
Collaborator

Which issue this PR addresses:

Fixes https://issues.redhat.com/browse/ARO-9395

What this PR does / why we need it:

Running a CPMS update of a cluster breaks SSH connectivity to the master nodes. This PR refactors the FixSSH admin update step to add the master NICs to the correct SSH backend pools after the update. In addition, we are adding the capability to find and delete any orphaned NICs from the previous master nodes.

There are some differences in how the NICs for a public vs private cluster are updated by the machine API. For public clusters, the new NICs are not added to the SSH backend pools at all. For the private clusters, all 3 new NICs are added to the ssh-0 backend address pool. The refactoring covers both cases.

Test plan for issue:

The changes have been tested against private and public dev clusters. This branch will be tested in canary against prod public and private clusters as well.

make test-go is passing all tests.

Is there any documentation that needs to be updated for this PR?

N/A

How do you know this will function as expected in production?

Production testing in the canary environment.

- Refactoring fixssh to work with CPMS.
- Upgraded armnetwork package to v6.2.0 to get azfake.PagerResponder[armnetwork.InterfacesClientListResponse] for mocking InterfacesClient.NewListPager.
- Refactoring fixssh tests to use mocked pager responses for various cases.
- Imports auto-updated and armnetwork mocks regenerated.
- Refactored fixssh_test to move client option for mock interface server into function to remove duplicate code and some cleanup.
- Minor refactoring for fixssh.go.
Copy link
Collaborator

@chreed-rh chreed-rh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks extensive, nothing that leaps out at me red flag wise. Would be a good one to go through code review wise IMO

armnetwork "github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork/v2"
armnetwork0 "github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork/v6"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we loading in two version of the armnetwork library?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did test upgrading the armnetwork package from v2 to v6 everywhere in the code base and didn't hit any issues, but figured that is a change that might be better in a separate PR. I'm pretty sure it's fully backward compatible.

The main reason for using the newer armnetwork v6 package is to get the fake server implementations, which make it easier to mock the InterfacesClient interface. The fake servers allow us to return custom responses to API requests with the same behaviour as the real Azure APIs.

"github.com/Azure/azure-sdk-for-go/sdk/azcore/policy"
"github.com/Azure/azure-sdk-for-go/sdk/azcore/runtime"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork/v6"
fake_armnetwork "github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/network/armnetwork/v6/fake"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the fake_armnetwork section refer to what that library will be referenced as in code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's an import alias to the fake package name, which is just "fake". Using an alias makes it clear which fake package we're referring to.

func (m *manager) checkAndUpdateNICsInResourceGroup(ctx context.Context, resourceGroup string, infraID string, lb *mgmtnetwork.LoadBalancer) (err error) {
masterSubnetID := m.doc.OpenShiftCluster.Properties.MasterProfile.SubnetID
pagerOpts := &armnetwork.InterfacesClientListOptions{}
pager := m.armInterfaces.NewListPager(resourceGroup, pagerOpts)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have traditionally constructed wrappers around clients which will fetch all the resources rather than paging through them.

I don't know if we're doing this today still, but can/should we add an Addon to the interfaces client we have today?

Something like this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10-4. I'll refactor that bit and add the wrapper as per the example you linked to.

- Adding new InterfacesClient.List wrapper method InterfacesClient.NewListPager and refactored fixssh code to use that.
- Refactored fixssh mocks to return lists of interfaces directly instead of via mocked pager.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants