-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CORE-10974] Add bandwidth QoS controls #9881
base: master
Are you sure you want to change the base?
Conversation
3e04ce5
to
60759ea
Compare
Add bandwidth limiting for ingress and egress, implemented via `tc qdisc` configuration based on the {ingress,egress} bandwidth and burst fields of the QoSControls struct in WorkloadEndpoint. In kubernetes, they come from annotations on the kubernetes pods. Felix FV tests added to verify that the `tc qdisc` configuration is properly added and that the bandwidth is properly limited with `iperf3`. Some code borrowed from the kubernetes bandwidth plugin (credit given).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions and corrections...
|
||
err := netlink.LinkAdd(&netlink.Ifb{ | ||
LinkAttrs: linkAttrs, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised there's nothing here to say what type of link this is. (i.e. the TYPE
value in man ip-link
.) Does it default to dummy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm not sure I fully follow. The type is indicated by using the netlink.Ifb{}
struct, the netlink code for it is:
// Ifb links are advanced dummy devices for packet filtering
type Ifb struct {
LinkAttrs
}
func (ifb *Ifb) Attrs() *LinkAttrs {
return &ifb.LinkAttrs
}
func (ifb *Ifb) Type() string {
return "ifb"
}
Or am I missing something else? Apologies if I am
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, I missed that b
. That sounds fine.
func CreateIngressQdisc(rateInBits, burstInBits uint64, hostDeviceName string) error { | ||
hostDevice, err := netlink.LinkByName(hostDeviceName) | ||
if err != nil { | ||
return fmt.Errorf("get host device: %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will likely be helpful to include hostDeviceName
in this error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(maybe above in CreateIfb too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, please can you consider this everywhere you've added new errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do, thanks!
return fmt.Errorf("get host device: %s", err) | ||
} | ||
|
||
// check if host device has a ingres qdisc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sp: ingress
return fmt.Errorf("create ifb qdisc: %s", err) | ||
} | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there so much asymmetry between the ingress and egress coding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding, we can only add one tbf
qdisc per interface, and it only works for ingress. So for egress, the workaround is to create an ingress
qdisc, an ifb
interface, mirror the egress traffic from the original interface to the ifb
, and then add the tbf
qdisc to the ifb
interface...
// tc qdisc add dev link root tbf | ||
// rate netConf.BandwidthLimits.Rate | ||
// burst netConf.BandwidthLimits.Burst | ||
if rateInBits <= 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vars are unsigned, so I think it would be clearer to write == 0
here.
Expect(err).NotTo(HaveOccurred()) | ||
log.Infof("iperf client rate with no limit (bps): %v", baselineRate) | ||
// Expect the baseline rate to be much greater than the desired limit | ||
Expect(baselineRate).To(BeNumerically(">=", 100000.0*100)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the *100
factor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When running the test locally on my machine, I was getting roughly 35Gbps traffic when not limiting. I just figured this would be a sanity check to make sure the "fixed" 100kbps used when enforcing the bandwidth limits is not too close to the non-limited bandwidth. (I figured this might vary depending on the conditions when running the test)
// ingress config should be present | ||
Eventually(getQdisc, "10s", "1s").Should(MatchRegexp(`qdisc tbf \d+: dev ` + regexp.QuoteMeta(w[1].InterfaceName) + ` root refcnt \d+ rate ` + regexp.QuoteMeta("100Kbit"))) | ||
// egress config should not be present | ||
Eventually(getQdisc, "10s", "1s").ShouldNot(MatchRegexp(`qdisc tbf \d+: dev bwcali.* root refcnt \d+ rate ` + regexp.QuoteMeta("100Kbit"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason for Eventually here. Should probably be just Expect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does need Eventually
, as it takes a quantum of time for the tc qdisc
configuration to appear once the WorkloadEndpoint is updated. (granted, I didn't spend much time fine tuning this, 10s is probably too much)
felix/fv/qos_controls_test.go
Outdated
|
||
out, err = w[1].ExecOutput("iperf3", "-c", w[0].IP, "-O5", "-J") | ||
Expect(err).NotTo(HaveOccurred()) | ||
egressLimitedRate, err := getRateFromJsonOutput(out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ingress-limited, not egress-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is embarrassing, my ingress and egress var names were switched, thanks for catching it!
felix/fv/qos_controls_test.go
Outdated
Expect(err).NotTo(HaveOccurred()) | ||
log.Infof("iperf client rate with ingress limit (bps): %v", egressLimitedRate) | ||
// Expect the ingress limited rate to be much greater than the desired limit, but smaller than baseline (with a 20% margin) | ||
Expect(egressLimitedRate).To(BeNumerically(">=", 100000.0*100)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the logic of this check. Why "Expect the ingress limited rate to be much greater than the desired limit"? What is "the desired limit"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this was a mistake, I'll update the test to run iperf3
in the correct direction and have the proper expectation
// ingress config should not be present | ||
Eventually(getQdisc, "10s", "1s").ShouldNot(MatchRegexp(`qdisc tbf \d+: dev ` + regexp.QuoteMeta(w[1].InterfaceName) + ` root refcnt \d+ rate ` + regexp.QuoteMeta("100Kbit"))) | ||
// egress config should be present | ||
Eventually(getQdisc, "10s", "1s").Should(And(MatchRegexp(`qdisc ingress ffff: dev `+regexp.QuoteMeta(w[1].InterfaceName)+` parent ffff:fff1`), MatchRegexp(`qdisc tbf \d+: dev bwcali.* root refcnt \d+ rate `+regexp.QuoteMeta("100Kbit")))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
- Improve error messages - Fix ingress bandwidth testing - Other misc corrections
It will be replaced by felix functionality. Needs to merge in sync with projectcalico/calico#9881.
Add bandwidth limiting for ingress and egress, implemented via
tc qdisc
configuration based on the {ingress,egress} bandwidth and burst fields of the QoSControls struct in WorkloadEndpoint. In kubernetes, they come from annotations on the kubernetes pods.Felix FV tests added to verify that the
tc qdisc
configuration is properly added and that the bandwidth is properly limited withiperf3
.Some code borrowed from the kubernetes bandwidth plugin (credit given).
Description
Related issues/PRs
Todos
Release Note
Reminder for the reviewer
Make sure that this PR has the correct labels and milestone set.
Every PR needs one
docs-*
label.docs-pr-required
: This change requires a change to the documentation that has not been completed yet.docs-completed
: This change has all necessary documentation completed.docs-not-required
: This change has no user-facing impact and requires no docs.Every PR needs one
release-note-*
label.release-note-required
: This PR has user-facing changes. Most PRs should have this label.release-note-not-required
: This PR has no user-facing changes.Other optional labels:
cherry-pick-candidate
: This PR should be cherry-picked to an earlier release. For bug fixes only.needs-operator-pr
: This PR is related to install and requires a corresponding change to the operator.