HLD for changing teamd expiry timer #1073
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
|
|
||
| ## Requirements | ||
|
|
||
| - Switch running a supported SONiC with patches in libteam for this feature on |
Will this feature stay as a SONiC-specific patch in libteam, or will it also be pushed to the libteam community?
Because this is effectively breaking the LACP protocol, I'm not planning on submitting this patch upstream.
| # Protocol | ||
|
|
||
| To change the number of retries, an Ethernet packet of the following structure | ||
| will be sent. This Ethernet packet will have an ethertype of 0x6300, and will |
Is there a reason for selecting ethertype 0x6300?
This is just meant to be a custom ethertype that appears to be unused. I needed something that was unused and is unlikely to get treated like a "normal" data packet.
|
|
||
| # CLI | ||
|
|
||
| No new CLI options or config options will be added, as this is not meant to be |
A config knob may be required to avoid sending unnecessary 0x6300 packets during warm-reboot when SONiC is connected to a non-SONiC device.
I would like to see a configurable option to enable this feature; the default should be disabled, with no custom TLV to interfere with WB. The HLD diverges from the standard LACP protocol definition by using a custom TLV to overcome SONiC WB timing, and a future phase 2 send-and-ack mechanism could potentially block WB from proceeding. For deployments where non-standard protocol packets are forbidden, we need a configurable option to control this behavior, preferably leaving it disabled by default.
The HLD has been updated with a new design/approach. By default, this feature is disabled, so there won't be any custom packets unless configured.
|
|
||
| Now, in addition to refreshing the PDUs timer, the above-specified Ethernet | ||
| packet (with ethertype 0x6300) will be sent to the peer devices, with the new | ||
| retry count set to 5. This notifies the peer device that for this device that |
Why can't we make this retry count user-configurable, with a default value of 5?
The HLD has been updated with a new design/approach. The retry count is now user-configurable.
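To make the effect of a configurable retry count concrete, here is a minimal sketch of the arithmetic. It assumes the expiry timer is simply the LACPDU interval multiplied by the retry count, which matches standard LACP (3 retries, giving 3 s at the fast rate and 90 s at the slow rate) but is not confirmed by the HLD excerpts quoted here. The function and constant names are hypothetical.

```python
# Standard LACP periodic transmission intervals, in seconds.
FAST_PERIODIC_TIME = 1
SLOW_PERIODIC_TIME = 30

# Standard LACP expires a partner after 3 missed intervals.
DEFAULT_RETRY_COUNT = 3


def expiry_seconds(periodic_time, retry_count=DEFAULT_RETRY_COUNT):
    """Seconds without LACPDUs before the partner is considered expired.

    Assumption: expiry = interval * retry count (hypothetical model of
    the HLD's behaviour, not taken from the teamd patch).
    """
    if retry_count < 1:
        raise ValueError("retry_count must be at least 1")
    return periodic_time * retry_count


if __name__ == "__main__":
    for count in (DEFAULT_RETRY_COUNT, 5):
        print(count, expiry_seconds(FAST_PERIODIC_TIME, count),
              expiry_seconds(SLOW_PERIODIC_TIME, count))
```

Under this model, raising the retry count from 3 to 5 extends the fast-rate expiry from 3 s to 5 s and the slow-rate expiry from 90 s to 150 s.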
| with ethertype 0x6300, and the data will contain the Actor Information, Partner | ||
| Information, and Retry Count TLVs. The receiving device must validate the actor | ||
| and partner information, and then update the retry count as specified. No | ||
| acknowledgment packet is sent back. |
How would you ensure the packet reached the peer without an ack?
An ack will be added to the protocol.
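For readers following the packet description in this thread, the sketch below shows one way such a frame could be assembled and parsed. Only the ethertype 0x6300 and the Actor Information type 0x01 come from the HLD excerpts; the type codes for Partner Information and Retry Count, the one-byte type/length TLV layout (length covering the whole TLV, as in LACPDUs), the opaque 18-byte information payloads, and the destination MAC used in the usage note are all assumptions for illustration.

```python
import struct

ETHERTYPE_RETRY_COUNT = 0x6300  # custom ethertype named in the HLD

TLV_ACTOR_INFO = 0x01    # from the HLD's TLV table
TLV_PARTNER_INFO = 0x02  # hypothetical value, not shown in the excerpt
TLV_RETRY_COUNT = 0x03   # hypothetical value, not shown in the excerpt


def pack_tlv(tlv_type, value):
    """Encode one TLV as type (1 byte), length (1 byte), value.

    Assumption: the length byte counts the type and length bytes too.
    """
    return struct.pack("!BB", tlv_type, len(value) + 2) + value


def build_frame(dst_mac, src_mac, actor, partner, retry_count):
    """Build the Ethernet frame carrying the three TLVs."""
    header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_RETRY_COUNT)
    payload = (
        pack_tlv(TLV_ACTOR_INFO, actor)
        + pack_tlv(TLV_PARTNER_INFO, partner)
        + pack_tlv(TLV_RETRY_COUNT, bytes([retry_count]))
    )
    return header + payload


def parse_tlvs(payload):
    """Return a dict mapping TLV type to value bytes."""
    tlvs = {}
    offset = 0
    while offset + 2 <= len(payload):
        tlv_type, length = struct.unpack_from("!BB", payload, offset)
        if length < 2 or offset + length > len(payload):
            raise ValueError("malformed TLV at offset %d" % offset)
        tlvs[tlv_type] = payload[offset + 2:offset + length]
        offset += length
    return tlvs
```

A receiver would compare the parsed Actor and Partner Information against its own LACP state before applying the retry count, for example:

```python
frame = build_frame(
    b"\x01\x80\xc2\x00\x00\x02",  # placeholder destination MAC (assumption)
    b"\x02\x00\x00\x00\x00\x01",  # placeholder source MAC
    b"A" * 18,                     # opaque actor information (assumption)
    b"P" * 18,                     # opaque partner information (assumption)
    5,
)
tlvs = parse_tlvs(frame[14:])      # skip the 14-byte Ethernet header
new_retry_count = tlvs[TLV_RETRY_COUNT][0]
```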
|
|
||
| | Value | Description | | ||
| |-------|---------------------| | ||
| | 0x01 | Actor Information | |
How about sending this special packet at the LAG level instead of on every LAG member? Per-member sending can be heavy when the LAG is large, for example a 64-member LAG.
|
@saiarcot895 can you please help to add the code PRs into this HLD by referring to #806 ? Thanks. |
|
@saiarcot895 can you please add the code PRs by referring to #806 ? Thanks. |
|
No code has been committed yet for this. There are changes to the design of this feature, and this HLD needs to be updated. I'm moving this PR to draft status. |
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
|
Recording for today's community review https://zoom.us/rec/share/9xfUbBRqllA9BpWfc3UN51f0Q6067KPXtpsLD9owQrUiRPtIpMjEaXpEDmDi8cTc.XSBxLjWF9FtvCjIM |
|
added to 202305 release |
|
move to post 202305 release. code PRs are not ready |
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
This PR adds an HLD for changing the duration of teamd's expiry timer by sending a message to the peer device with the number of retries it should use for this LAG.
Code PRs: