Nvidia/Mellanox expose ROCE ECN information on sysfs on the path #695
SuperQ merged 1 commit into prometheus:master
/sys/class/net/<interface>/ecn/<protocol>/

There are 2 protocols: Reaction Point (rp) and Notification Point (np).

Each protocol has a list of attributes:
/sys/class/net/<interface>/ecn/<protocol>/params/<requested attribute>

Each protocol also shows whether ECN is enabled per priority (where X is the priority):
/sys/class/net/<interface>/ecn/<protocol>/enable/X

This is documented here:
https://docs.nvidia.com/networking/display/mlnxofedv571020/explicit+congestion+notification+(ecn)

The attributes are documented here:
https://enterprise-support.nvidia.com/s/article/dcqcn-parameters

Signed-off-by: Diego Asturias <dasturias@arista.com>
LGTM in general, but it makes me wonder where to draw the line on which vendor-specific stuff to include here and which not. I feel this is probably relevant enough to be included, but I'm not sure. @SuperQ wdyt?
Just to be clear, this is only Nvidia-specific because other vendors aren't exposing these values through sysfs. These are pretty generic values for RoCEv2. For example, https://docs.broadcom.com/doc/NCC-WP1XX has information on Broadcom's congestion control, which also uses ECN. Ideally, other vendors would also expose these values in a sysfs path instead of relying on proprietary utilities. I'm not aware whether other vendors that implement RoCEv2 plan to integrate with sysfs in the future (or whether they already have), but this seems fairly generic in terms of RoCEv2.
@dasturiasArista Thanks for clarifying. I think this is a good argument for including it here.