Remove redundant calls to get child scheduler group during initialization#1965
/azpw run
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
…tion The QoS orchagent calls SAI APIs to get the number of child scheduler groups and then initializes them. After that, the size of the child scheduler groups vector is non-zero, which indicates that the child scheduler groups have been initialized and prevents the QoS orchagent from calling the SAI APIs again. However, on the Mellanox platform, some scheduler groups don't have child groups, so the size of the child scheduler groups vector always stays zero. This causes the QoS orchagent to call the SAI API each time the scheduler group is handled, which wastes a lot of time, especially during fast reboot. An extra flag indicating whether the child groups have been initialized is introduced to avoid the redundant calls. Signed-off-by: Stephen Sun <stephens@nvidia.com>
80c8f7e to
70478e5
The ARM build failed with the following error, which does not appear to be related to the change in this PR.
@prsunny can you please help with the swss failures? @stephenxs please add how long it takes qosorch to finish the same work with and without this fix.
SWSS_LOG_INFO("Check child group %lx for port %s group %lx", child_group_id, port.m_alias.c_str(), group_id);
if (child_group_id == queue_id)
{
    SWSS_LOG_INFO("Found group id %lx for port %s queue %lx", group_id, port.m_alias.c_str(), queue_id);
Too many logs have been added to the code. Could you please revisit them? Also, compilation seems to be failing for the armhf build. Please print the object IDs using PRIx64; refer to the existing logs.
I removed all the logs that were added but are not related to a code change in this PR. I would prefer to keep the rest.
Thank you for the comment on the ARM build failure. That is fixed as well.
@prsunny can you please review it? comments fixed. thanks.
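For readers following the PRIx64 request in this thread, here is a minimal sketch of the portability issue. It assumes the object IDs are 64-bit unsigned integers and uses a hypothetical helper, `formatGroupLog`, with `snprintf` standing in for the `SWSS_LOG_INFO` macro. It is an illustration, not the code from this PR.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper for illustration only; orchagent logs through
// the SWSS_LOG_* macros rather than snprintf.
std::string formatGroupLog(uint64_t group_id, const char *port_alias, uint64_t queue_id)
{
    char buf[128];
    // PRIx64 expands to the right conversion for a 64-bit value on every
    // target. "%lx" expects unsigned long, which is only 32 bits wide on
    // 32-bit targets such as armhf, so the format and argument mismatch.
    std::snprintf(buf, sizeof(buf),
                  "Found group id %" PRIx64 " for port %s queue %" PRIx64,
                  group_id, port_alias, queue_id);
    return std::string(buf);
}
```

Calling `formatGroupLog(0x1a, "Ethernet0", 0xff)` yields the same text on 32-bit and 64-bit builds, which is the property the review comment asks for.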
- Fix compile errors on ARM
- Remove some redundant log messages
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
…tion (#1965)
- What I did
The QoS orchagent calls SAI APIs to get the number of child scheduler groups and then initializes them. After that, the size of the child scheduler groups vector is non-zero, which indicates that the child scheduler groups have been initialized and prevents the QoS orchagent from calling the SAI get APIs again. However, on some platforms, some of the scheduler groups may not have child groups, so the size of the child scheduler groups vector always stays zero. This causes the QoS orchagent to call the SAI get API each time the scheduler group is handled, which wastes a lot of time, especially during fast reboot. An extra flag indicating whether the child groups have been initialized is introduced to avoid redundant calls.
- Why I did it
To optimize QoS orchagent performance during initialization, especially for fast reboot.
- How I verified it
It can be covered by the existing regression test.
- Details if related
I did a pair of tests, comparing the time to handle scheduler and child scheduler groups between the old and the new (optimized) version. Dump and sairedis.record attached.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
What I did
Add CLI for configuring buffer profiles for queues
How to verify it
Run unit test to verify the logic
New command output (if the output of a command-line utility has changed)
config interface buffer queue add
config interface buffer queue set
config interface buffer queue remove
Signed-off-by: Stephen Sun <stephens@nvidia.com>
What I did
The QoS orchagent calls SAI APIs to get the number of child scheduler groups and then initializes them.
After that, the size of the child scheduler groups vector is non-zero, which indicates that the child scheduler groups have been initialized and prevents the QoS orchagent from calling the SAI get APIs again.
However, on some platforms, some of the scheduler groups may not have child groups, so the size of the child scheduler groups vector always stays zero. This causes the QoS orchagent to call the SAI get API each time the scheduler group is handled, which wastes a lot of time, especially during fast reboot.
An extra flag indicating whether the child groups have been initialized is introduced to avoid redundant calls.
Signed-off-by: Stephen Sun <stephens@nvidia.com>
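The difference between the two checks described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the orchagent code: `SchedulerGroupInfo`, `FakeSai`, `ensureChildrenBySize`, and `ensureChildrenByFlag` are hypothetical names, and `FakeSai::getChildGroups` merely stands in for the SAI get call and counts how often it is invoked.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-group bookkeeping; not the actual orchagent type.
struct SchedulerGroupInfo
{
    std::vector<uint64_t> child_groups;  // may legitimately stay empty
    bool child_groups_inited = false;    // the extra "initialized" flag
};

// Hypothetical stand-in for the SAI get API; counts invocations.
struct FakeSai
{
    int get_calls = 0;
    std::vector<uint64_t> children;      // what the platform reports
    std::vector<uint64_t> getChildGroups()
    {
        ++get_calls;
        return children;
    }
};

// Old check: an empty vector is read as "not initialized yet", so a
// group with no children triggers a SAI query on every call.
void ensureChildrenBySize(SchedulerGroupInfo &group, FakeSai &sai)
{
    if (group.child_groups.empty())
    {
        group.child_groups = sai.getChildGroups();
    }
}

// New check: the flag records that the query already happened, so the
// SAI API is called at most once per group, even if it has no children.
void ensureChildrenByFlag(SchedulerGroupInfo &group, FakeSai &sai)
{
    if (!group.child_groups_inited)
    {
        group.child_groups = sai.getChildGroups();
        group.child_groups_inited = true;
    }
}
```

With a group that reports no children, handling it three times through `ensureChildrenBySize` issues three queries, while `ensureChildrenByFlag` issues one. Separating "has been queried" from "has children" is what removes the repeated calls.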
Why I did it
To optimize QoS orchagent performance during initialization, especially for fast reboot.
How I verified it
It can be covered by the existing regression test.
Details if related
I did a pair of tests, comparing the time to handle scheduler and child scheduler groups between the old and the new (optimized) version. Dump and sairedis.record attached.
sairedis.rec.old.log
sairedis.rec.new.log
sonic_dump_mtbc-sonic-03-2700_20211104_145113.tar.gz