Is ConnectX-6 supported? #19
Comments
@Eoghan1232 I think it's not supported right now, but it should be with the net implementation you are working on, right?
Hey @jslouisyou,
The current version does not officially support Mellanox InfiniBand interfaces, though with the Netlink collector enabled and your device class change, it might work. We are planning to support Mellanox cards with the latest Mellanox EN and OFED drivers.
Thanks @eoghanlawless
We have a few changes coming soon, but haven't looked at implementing Mellanox InfiniBand support just yet. The next release should be in the new year, and the following release should include InfiniBand support.
@eoghanlawless Thanks.
Hi @jslouisyou - the 1.0 version focuses on common functionality across vendors and prioritizes Ethernet as the common use case, so Mellanox-specific stats for InfiniBand are outside the current scope. Intel does not currently support InfiniBand and has no way to validate its functionality. The metrics exporter provides an extendable interface for others to contribute, which could include InfiniBand support.
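For anyone looking to contribute that support, the extension point being described is essentially a self-contained, named source of Prometheus metrics. A minimal sketch of what such a collector could look like, assuming the standard Prometheus Go client library - the type and method names here are illustrative, not the repo's actual API:

```go
package collectors

import "github.com/prometheus/client_golang/prometheus"

// Collector is an illustrative sketch of the extension point described
// above: a named, independently registrable source of SR-IOV metrics.
// The actual interface in sriov-network-metrics-exporter may differ.
type Collector interface {
	// Name identifies the collector, e.g. an InfiniBand stats collector.
	Name() string
	// Collect gathers the collector's metrics and sends them on ch,
	// mirroring the collect half of prometheus.Collector.
	Collect(ch chan<- prometheus.Metric)
}
```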
Dear,
I'm currently using a Mellanox ConnectX-6 adapter (HPE InfiniBand HDR/Ethernet 200Gb 2-port QSFP56 PCIe4 x16 MCX653106A-HDAT Adapter) and trying to use sriov-network-metrics-exporter in a Kubernetes cluster, but the sriov-network-metrics-exporter pods can't get metrics from the InfiniBand physical and virtual functions when I run
`k exec -it -n monitoring sriov-metrics-exporter-lj8rq -- wget -O- localhost:9808/metrics`
(sriov-metrics-exporter-lj8rq is the exporter pod deployed in the Kubernetes cluster).
BTW, I dug into the code in `collectors/sriovdev.go`, and `netClass` is only defined for `0x020000`. It seems that, in the case of ConnectX-6, `0x020000` is only for the Ethernet adapter and `0x020700` is for the InfiniBand adapter. Here's my environment as below: ibs* is for the InfiniBand adapters and ens* is for the Ethernet adapters.
The ens* interfaces have class `0x020000` when I checked, but all the InfiniBand interfaces have class `0x020700` accordingly.
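For reference, the class the exporter filters on can be read straight from sysfs. A minimal sketch in Go, assuming the standard /sys/class/net layout (`0x020000` and `0x020700` are the PCI class codes for Ethernet and InfiniBand controllers, respectively):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Prints the PCI device class of every network interface.
// 0x020000 = Ethernet controller, 0x020700 = InfiniBand controller.
func main() {
	ifaces, err := os.ReadDir("/sys/class/net")
	if err != nil {
		panic(err)
	}
	for _, iface := range ifaces {
		classFile := filepath.Join("/sys/class/net", iface.Name(), "device", "class")
		data, err := os.ReadFile(classFile)
		if err != nil {
			continue // virtual devices (lo, bridges, ...) have no PCI class
		}
		fmt.Printf("%s: %s\n", iface.Name(), strings.TrimSpace(string(data)))
	}
}
```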
class accordingly,so I changed
netClass
from0x020000
to0x020700
and then sriov-network-metrics-exporter can find all IB PFs and VFs.Before change it, sriov-metrics-exporter POD is showing only ethernet adapters are caught;
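The change itself is tiny; conceptually it just widens the accepted device classes. A hedged sketch of the idea (the actual variable name and type in `collectors/sriovdev.go` may differ):

```go
// Sketch only: accept both PCI network controller classes instead of
// the single netClass value currently defined in collectors/sriovdev.go.
var supportedNetClasses = map[string]bool{
	"0x020000": true, // Ethernet controller
	"0x020700": true, // InfiniBand controller
}
```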
Before the change, the sriov-metrics-exporter pod showed only the Ethernet adapters being caught; after the change, it can catch the IB adapters as well.
But all metrics except sriov_kubepoddevice show 0 in Prometheus, even when I attach the VFs to two pods and run `ib_send_bw` between them. I think these PFs are not recognized by the current `pkg/vfstats/netlink.go`.
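One way to check whether the zeros come from the exporter or from the kernel itself is to read the per-VF counters directly over netlink. A minimal sketch using github.com/vishvananda/netlink (an assumption on my part that this is the library family `pkg/vfstats/netlink.go` builds on); if rx_bytes/tx_bytes stay at 0 here too while `ib_send_bw` is running, the driver simply isn't reporting VF stats via netlink:

```go
package main

import (
	"fmt"

	"github.com/vishvananda/netlink"
)

// Lists every link and, where the kernel reports them, its per-VF
// packet/byte counters. If ib_send_bw traffic doesn't move these
// numbers for the IB PF, zeros in the exporter are expected.
func main() {
	links, err := netlink.LinkList()
	if err != nil {
		panic(err)
	}
	for _, link := range links {
		attrs := link.Attrs()
		for _, vf := range attrs.Vfs {
			fmt.Printf("%s vf %d: rx_bytes=%d tx_bytes=%d\n",
				attrs.Name, vf.ID, vf.RxBytes, vf.TxBytes)
		}
	}
}
```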
So, is ConnectX-6 supported by the current sriov-network-metrics-exporter? And if not, is there any plan to support ConnectX-6 later?
Thanks!