Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CNF-13652: Test integration of SRIOV and NMState operators #1990

Merged
merged 4 commits into from
Oct 23, 2024

Conversation

zeeke
Copy link
Member

@zeeke zeeke commented Aug 8, 2024

This changes implements a simple test to verify the NMState operator does not interfere with the SR-IOV network operator.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 8, 2024
@openshift-ci-robot
Copy link
Collaborator

@zeeke: This pull request references CNF-13652 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

This changes implements a simple test to verify the NMState operator does not interfere with the SR-IOV network operator.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 8, 2024
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 8, 2024
@zeeke
Copy link
Member Author

zeeke commented Aug 27, 2024

/test ?

Copy link
Contributor

openshift-ci bot commented Aug 27, 2024

@zeeke: The following commands are available to trigger required jobs:

  • /test ci
  • /test e2e-aws-ci-tests
  • /test images
  • /test ztp-ci

The following commands are available to trigger optional jobs:

  • /test e2e-aws-ran-profile
  • /test e2e-telco5g-cnftests
  • /test e2e-telco5g-hcp-cnftests
  • /test e2e-telco5g-sno-cnftests
  • /test security

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-kni-cnf-features-deploy-master-ci
  • pull-ci-openshift-kni-cnf-features-deploy-master-e2e-aws-ci-tests
  • pull-ci-openshift-kni-cnf-features-deploy-master-e2e-aws-ran-profile
  • pull-ci-openshift-kni-cnf-features-deploy-master-images
  • pull-ci-openshift-kni-cnf-features-deploy-master-security

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@zeeke
Copy link
Member Author

zeeke commented Aug 27, 2024

/test e2e-telco5g-cnftests

@zeeke
Copy link
Member Author

zeeke commented Aug 29, 2024

/test e2e-telco5g-cnftests

@zeeke zeeke force-pushed the CNF-13652 branch 2 times, most recently from 7692f80 to bcb630c Compare August 29, 2024 11:25
@zeeke zeeke changed the title [WIP] CNF-13652: Test integration of SRIOV and NMState operators CNF-13652: Test integration of SRIOV and NMState operators Aug 29, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 29, 2024
@zeeke
Copy link
Member Author

zeeke commented Aug 29, 2024

/test e2e-telco5g-cnftests

1 similar comment
@zeeke
Copy link
Member Author

zeeke commented Aug 30, 2024

/test e2e-telco5g-cnftests

func waitForNMStatePolicyToBeStable(nmstatePolicy *nmstatev1.NodeNetworkConfigurationPolicy) {
key := runtimeclient.ObjectKey{Name: nmstatePolicy.Name}
Eventually(func(g Gomega) {
err := client.Client.Get(context.Background(), key, nmstatePolicy)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the nmstatePolicy is using the arg var, which creates confusion about the scope of the variables. Maybe have a new one.

Help functions imo are easier to read/be refactored if they are not having many assertions. I always suggest the assertion to be in the main block and have function that return error/true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the nmstatePolicy is using the arg var, which creates confusion about the scope of the variables. Maybe have a new one.

good point. And since the function only use the name of the policy, I refactor it to pass a string only.

Help functions imo are easier to read/be refactored if they are not having many assertions. I always suggest the assertion to be in the main block and have function that return error/true.

I'm not 100% with it: helper functions must be short (<20 lines, as a rule of thumb) and that's where they get their readability. On the other side, the call site needs to be way more readable, as it contains the complicated test logic. With the current implementation, the call site reads:

waitForNMStatePolicyToBeStable(nodeNetworkConfigPolicy.Name)

which reads straightforward to me.
With functions like these, we might lose reusability; a function that waits for different conditions is almost copy/paste, but that's another story.

testDevice := devices[0]
By("Using device " + testDevice.Name)

By("Configuring IPAM on Physical Function via NMState")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that different than "Configuring and IPv4 address"? just IPAM (Address Management) makes me think whereabouts/or other IPAM is being deployed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IPv4 is definitely more clear! changing

sriovclient = sriovtestclient.New("")
}

var _ = Describe("[nmstate] SR-IOV Network Operator Integration", Ordered, func() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the ordered is needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To use the BeforeAll section.
https://onsi.github.io/ginkgo/#setup-in-ordered-containers-beforeall-and-afterall

The question might be, why did I used a BeforeAll if there is only one test case? The answer is "I hope to add more test cases in the future". It might be an early optimization, WDYT?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure you can use BeforeEach, AfterEach stacked on top of Describe/Context to achieve what you need in the future.

According to the doc

Ginkgo did not include support for Ordered containers for quite some time. As you can see Ordered containers make it possible to circumvent the "Declare in container nodes, initialize in setup nodes" principle; and they make it possible to write dependent specs This comes at a cost, of course - specs in Ordered containers cannot be fully parallelized which can result in slower suite runtimes. Despite these cons, pragmatism prevailed and Ordered containers were introduced in response to real-world needs in the community. Nonetheless, we recommend using Ordered containers only when needed.

So I would say specs(tests) should be not ordered unless they really really need to be.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, let's go for the BeforeEach way

resNum := testedNode.Status.Allocatable[corev1.ResourceName("openshift.io/testnmstateresource")]
capacity, _ := resNum.AsInt64()
return capacity
}, 10*time.Minute, time.Second).Should(Equal(int64(10)))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the end spec is " there should be 10 VRFs"?

I suggest to extend the description that appears if the spec fails, e.g.

Should(Equal(int64(10))," the number of VFs observes is not that same as the one sriov created")

``

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good. changing

cnf-tests/testsuites/e2esuite/nmstate/nmstate_sriov.go Outdated Show resolved Hide resolved
})

func findNonPrimarySriovDevicesAndNode(n *cluster.EnabledNodes) (string, []*sriovv1.InterfaceExt, error) {
errs := []error{}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use this lib https://pkg.go.dev/errors to wrap errors

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using that package at the end of the function:

if len(retDevices) == 0 {
	return "", nil, fmt.Errorf("can't find any SR-IOV devices in cluster's nodes: %w", errors.Join(errs...))
}

How would you use it with Wrap(...)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can avoid the []array

func main() {
	var err error
    if false{
	err1 := errors.New("err1")
	err = errors.Join(err, err1)
	err2 := errors.New("err2")
	err = errors.Join(err, err2)
}
	fmt.Println(err)

}

but to be honest, both works

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going for the errors.Join(...) approach


if len(nonPrimaryDevices) > len(retDevices) {
retNode = node
retDevices = nonPrimaryDevices

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just return here in the happy path, and return the wrap errors. I might be missing if there is selection between two nodes that have sriov or not

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point: this function is not very clear. The point is to find the node with the most number of SR-IOV devices. Going to reword the variables and comment to make it clearer.

name: openshift-nmstate
spec:
finalizers:
- kubernetes

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] my yamllint is complaining that it need --- and two spaces here (same for all yamls)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make the linter happy 😉

@zeeke zeeke force-pushed the CNF-13652 branch 3 times, most recently from b7f7b62 to cd0875f Compare September 4, 2024 08:31
@zeeke
Copy link
Member Author

zeeke commented Sep 4, 2024

/test e2e-telco5g-cnftests

name: kubernetes-nmstate-operator
source: art-nightly-operator-catalog
sourceNamespace: openshift-marketplace
---
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the --- goes to the top

yamllint  feature-configs/ci/nmstate/kustomization.yaml
feature-configs/ci/nmstate/kustomization.yaml
  1:1       warning  missing document start "---"  (document-start)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just noticed all the yaml under feature-configs does not have --- . Lets make it as all the other and live with that until we decide to put yamllint in the CI. Does it sound good?

@zeeke zeeke force-pushed the CNF-13652 branch 2 times, most recently from 6b21f40 to f81a28a Compare September 4, 2024 16:12
@zeeke
Copy link
Member Author

zeeke commented Sep 4, 2024

/test e2e-telco5g-cnftests

@zeeke
Copy link
Member Author

zeeke commented Sep 4, 2024

/retest
/test e2e-telco5g-cnftests

@zeeke
Copy link
Member Author

zeeke commented Oct 7, 2024

/retest
/test e2e-telco5g-cnftests

Expect(err).ToNot(HaveOccurred())
DeferCleanup(sriovclient.Delete, context.Background(), sriovNetworkNodePolicy)

By("Verify Virtual Functions have been correctly configured")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: verifying that (same below)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed!

@zeeke
Copy link
Member Author

zeeke commented Oct 11, 2024

/test e2e-telco5g-cnftests

@zeeke
Copy link
Member Author

zeeke commented Oct 11, 2024

/retest

@zeeke
Copy link
Member Author

zeeke commented Oct 14, 2024

/test e2e-telco5g-cnftests
/retest

@zeeke
Copy link
Member Author

zeeke commented Oct 15, 2024

/test e2e-telco5g-cnftests

@zeeke
Copy link
Member Author

zeeke commented Oct 16, 2024

/retest

@zeeke
Copy link
Member Author

zeeke commented Oct 16, 2024

/test e2e-telco5g-cnftests

@zeeke
Copy link
Member Author

zeeke commented Oct 16, 2024

/retest

@zeeke zeeke force-pushed the CNF-13652 branch 3 times, most recently from 602c277 to 1047a6a Compare October 23, 2024 09:09
Signed-off-by: Andrea Panattoni <[email protected]>
Add a test that configures the same NIC via Kubernetes-NMState and
SR-IOV network operator. The test ensure that if NMState doesn't
have any SR-IOV configuration, then both operator should work correctly.

Signed-off-by: Andrea Panattoni <[email protected]>
`kubernetes-nmsate` CI does not publish images on
`quay.io/openshift/*`. The best way to deploy a recent
version of the operator is to leverage the
`quay.io/openshift-release-dev/ocp-release-nightly:iib-int-index-art-operators-4.17`
index image, which picks latest nightly builds from Brew.

Signed-off-by: Andrea Panattoni <[email protected]>
@zeeke
Copy link
Member Author

zeeke commented Oct 23, 2024

/test e2e-telco5g-cnftests

Copy link
Contributor

openshift-ci bot commented Oct 23, 2024

@zeeke: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-telco5g-cnftests a3faba8 link false /test e2e-telco5g-cnftests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@zeeke
Copy link
Member Author

zeeke commented Oct 23, 2024

Failing tests

CNF Features e2e integration tests: [It] [bondcni] bond over macvlan should be able to create pod with bond interface over macvlan interfaces expand_more	1m4s
CNF Features e2e integration tests: [It] [tuningcni] tuningcni over macvlan should be able to create pod with sysctls over macvlan expand_more

are known bugs. See https://issues.redhat.com/browse/CNF-14296

@fedepaol
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 23, 2024
@zeeke
Copy link
Member Author

zeeke commented Oct 23, 2024

/override ci/prow/e2e-aws-ci-tests

Copy link
Contributor

openshift-ci bot commented Oct 23, 2024

@zeeke: Overrode contexts on behalf of zeeke: ci/prow/e2e-aws-ci-tests

In response to this:

/override ci/prow/e2e-aws-ci-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-bot openshift-merge-bot bot merged commit 872a4ce into openshift-kni:master Oct 23, 2024
6 of 7 checks passed
zeeke added a commit to zeeke/release that referenced this pull request Oct 24, 2024
Run `sriov/knmstate` combined tests defined in
- openshift-kni/cnf-features-deploy#1990

as part of the cnf-tests suite.

Signed-off-by: Andrea Panattoni <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants