Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CORE-10974] Add bandwidth QoS controls #9881

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

coutinhop
Copy link
Member

Add bandwidth limiting for ingress and egress, implemented via tc qdisc configuration based on the {ingress,egress} bandwidth and burst fields of the QoSControls struct in WorkloadEndpoint. In kubernetes, they come from annotations on the kubernetes pods.

Felix FV tests added to verify that the tc qdisc configuration is properly added and that the bandwidth is properly limited with iperf3.

Some code borrowed from the kubernetes bandwidth plugin (credit given).

Description

Related issues/PRs

Todos

  • Tests
  • Documentation
  • Release note

Release Note

TBD

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@coutinhop coutinhop self-assigned this Feb 21, 2025
@coutinhop coutinhop requested a review from a team as a code owner February 21, 2025 01:41
@marvin-tigera marvin-tigera added this to the Calico v3.30.0 milestone Feb 21, 2025
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels Feb 21, 2025
@coutinhop coutinhop force-pushed the pedro-CORE-10974 branch 2 times, most recently from 3e04ce5 to 60759ea Compare February 21, 2025 02:11
Add bandwidth limiting for ingress and egress, implemented via `tc qdisc`
configuration based on the {ingress,egress} bandwidth and burst fields of
the QoSControls struct in WorkloadEndpoint. In kubernetes, they come from
annotations on the kubernetes pods.

Felix FV tests added to verify that the `tc qdisc` configuration is
properly added and that the bandwidth is properly limited with `iperf3`.

Some code borrowed from the kubernetes bandwidth plugin (credit given).
Copy link
Member

@nelljerram nelljerram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions and corrections...


err := netlink.LinkAdd(&netlink.Ifb{
LinkAttrs: linkAttrs,
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised there's nothing here to say what type of link this is. (i.e. the TYPE value in man ip-link.) Does it default to dummy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not sure I fully follow. The type is indicated by using the netlink.Ifb{} struct, the netlink code for it is:

// Ifb links are advanced dummy devices for packet filtering
type Ifb struct {
	LinkAttrs
}

func (ifb *Ifb) Attrs() *LinkAttrs {
	return &ifb.LinkAttrs
}

func (ifb *Ifb) Type() string {
	return "ifb"
}

Or am I missing something else? Apologies if I am

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, I missed that b. That sounds fine.

func CreateIngressQdisc(rateInBits, burstInBits uint64, hostDeviceName string) error {
hostDevice, err := netlink.LinkByName(hostDeviceName)
if err != nil {
return fmt.Errorf("get host device: %s", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will likely be helpful to include hostDeviceName in this error message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(maybe above in CreateIfb too)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, please can you consider this everywhere you've added new errors?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, thanks!

return fmt.Errorf("get host device: %s", err)
}

// check if host device has a ingres qdisc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sp: ingress

return fmt.Errorf("create ifb qdisc: %s", err)
}
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there so much asymmetry between the ingress and egress coding?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, we can only add one tbf qdisc per interface, and it only works for ingress. So for egress, the workaround is to create an ingress qdisc, an ifb interface, mirror the egress traffic from the original interface to the ifb, and then add the tbf qdisc to the ifb interface...

// tc qdisc add dev link root tbf
// rate netConf.BandwidthLimits.Rate
// burst netConf.BandwidthLimits.Burst
if rateInBits <= 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vars are unsigned, so I think it would be clearer to write == 0 here.

Expect(err).NotTo(HaveOccurred())
log.Infof("iperf client rate with no limit (bps): %v", baselineRate)
// Expect the baseline rate to be much greater than the desired limit
Expect(baselineRate).To(BeNumerically(">=", 100000.0*100))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the *100 factor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running the test locally on my machine, I was getting roughly 35Gbps traffic when not limiting. I just figured this would be a sanity check to make sure the "fixed" 100kbps used when enforcing the bandwidth limits is not too close to the non-limited bandwidth. (I figured this might vary depending on the conditions when running the test)

// ingress config should be present
Eventually(getQdisc, "10s", "1s").Should(MatchRegexp(`qdisc tbf \d+: dev ` + regexp.QuoteMeta(w[1].InterfaceName) + ` root refcnt \d+ rate ` + regexp.QuoteMeta("100Kbit")))
// egress config should not be present
Eventually(getQdisc, "10s", "1s").ShouldNot(MatchRegexp(`qdisc tbf \d+: dev bwcali.* root refcnt \d+ rate ` + regexp.QuoteMeta("100Kbit")))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason for Eventually here. Should probably be just Expect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does need Eventually, as it takes a quantum of time for the tc qdisc configuration to appear once the WorkloadEndpoint is updated. (granted, I didn't spend much time fine tuning this, 10s is probably too much)


out, err = w[1].ExecOutput("iperf3", "-c", w[0].IP, "-O5", "-J")
Expect(err).NotTo(HaveOccurred())
egressLimitedRate, err := getRateFromJsonOutput(out)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ingress-limited, not egress-

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is embarrassing, my ingress and egress var names were switched, thanks for catching it!

Expect(err).NotTo(HaveOccurred())
log.Infof("iperf client rate with ingress limit (bps): %v", egressLimitedRate)
// Expect the ingress limited rate to be much greater than the desired limit, but smaller than baseline (with a 20% margin)
Expect(egressLimitedRate).To(BeNumerically(">=", 100000.0*100))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the logic of this check. Why "Expect the ingress limited rate to be much greater than the desired limit"? What is "the desired limit"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this was a mistake, I'll update the test to run iperf3 in the correct direction and have the proper expectation

// ingress config should not be present
Eventually(getQdisc, "10s", "1s").ShouldNot(MatchRegexp(`qdisc tbf \d+: dev ` + regexp.QuoteMeta(w[1].InterfaceName) + ` root refcnt \d+ rate ` + regexp.QuoteMeta("100Kbit")))
// egress config should be present
Eventually(getQdisc, "10s", "1s").Should(And(MatchRegexp(`qdisc ingress ffff: dev `+regexp.QuoteMeta(w[1].InterfaceName)+` parent ffff:fff1`), MatchRegexp(`qdisc tbf \d+: dev bwcali.* root refcnt \d+ rate `+regexp.QuoteMeta("100Kbit"))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

- Improve error messages
- Fix ingress bandwidth testing
- Other misc corrections
coutinhop added a commit to coutinhop/operator that referenced this pull request Feb 22, 2025
It will be replaced by felix functionality. Needs to merge in sync with
projectcalico/calico#9881.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-pr-required Change is not yet documented release-note-required Change has user-facing impact (no matter how small)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants