Fix SIGTERM can't terminate PubSub::listen issue by add cancellation token support.#606
Conversation
|
|
||
| void SignalHandlerHelper::RegisterSignalHandler(int signalNumber) | ||
| { | ||
| signal(signalNumber, SignalHandlerHelper::OnSignal); |
There was a problem hiding this comment.
Does it override the signal handler defined by the application? In case of python, does it mean that the user defined signal handler in python won't be executed?
There was a problem hiding this comment.
Yes, because linux can only have 1 signal handler per-application.
If customer need define their signal handler, then customer also need cancel the token when signal happen in their handler, this class just a helper class.
Did you change old behavior? In reply to: 1106076032 Refers to: common/select.cpp:166 in 037283c. [](commit_id = 037283c, deletion_comment = False) |
For non terminating signals (like siguser), original behavior is keeping while loop, but the new behavior is return SIGNALINTR. I think this is unexpected. In reply to: 1106076032 Refers to: common/select.cpp:166 in 037283c. [](commit_id = 037283c, deletion_comment = False) |
Will discussion offline, the reason is:
However. there are some UT failed, seems related with this change, will check reason. Update: in latest code, only sigint can break the while loop, all other signal will not break the while loop. |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Add SignalHandlerHelper class back because:
So we have to add this class back. Also here is demo code in python3:
|
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Code coverage does not reach target, will add UT to cover new code. |
…ystemd (#2133)" (#2161) This reverts commit 23e9398. - What I did Revert "[scripts/fast-reboot] Shutdown remaining containers through systemd (#2133)" This reverted PR is part of a story that refactors warm/fast shutdown sequence to gracefully stop services instead of killing them without any ordering and dependency requirements which creates several issues and is error prone for the future. This PR must come together with sonic-net/sonic-buildimage#10510. However, #10510 is blocked due to an issue in swss-common sonic-net/sonic-swss-common#603 And a fix by MSFT is in review sonic-net/sonic-swss-common#606 I am reverting it because its dependency is still blocked and we cannot update submodule pointer. Once the dependency of the reverted PR is resolved, it shall be re-committed.
…ystemd (#2133)" (#2166) - What I did This reverts commit a5f55aa. Revert "[scripts/fast-reboot] Shutdown remaining containers through systemd (#2133)" This reverted PR is part of a story that refactors warm/fast shutdown sequence to gracefully stop services instead of killing them without any ordering and dependency requirements which creates several issues and is error prone for the future. This PR must come together with sonic-net/sonic-buildimage#10510. However, #10510 is blocked due to an issue in swss-common sonic-net/sonic-swss-common#603 And a fix by MSFT is in review sonic-net/sonic-swss-common#606 I am reverting it because its dependency is still blocked and we cannot update submodule pointer. Once the dependency of the reverted PR is resolved, it shall be re-committed. - How I did it git revert a5f55aa - How to verify it Run tests
|
@stepanblyschak, could you please check this PR again? all issue fixed. |
Could you fix for non terminating signals (like siguser)? In reply to: 1109033970 Refers to: common/select.cpp:166 in 037283c. [](commit_id = 037283c, deletion_comment = False) |
This already fixed, currently only sigint can break the while loop in select.cpp, all other signal will not break while loop. |
…token support. (sonic-net#606) Why I did it There are infinite loops inside PubSub::listen() method, so application using this method can't handle SIGTERM correctly. sonic-net#603 How I did it Add following class: 1. CancellationToken: this class will help exist the infinite loops when SIGTERM or other signal happen. 2. SignalHandlerHelper: Provide a native signal handler. How to verify it 1. manually test. 2. Pass all test case.
…ystemd (#2133)" (#2161) This reverts commit 23e9398. - What I did Revert "[scripts/fast-reboot] Shutdown remaining containers through systemd (#2133)" This reverted PR is part of a story that refactors warm/fast shutdown sequence to gracefully stop services instead of killing them without any ordering and dependency requirements which creates several issues and is error prone for the future. This PR must come together with sonic-net/sonic-buildimage#10510. However, #10510 is blocked due to an issue in swss-common sonic-net/sonic-swss-common#603 And a fix by MSFT is in review sonic-net/sonic-swss-common#606 I am reverting it because its dependency is still blocked and we cannot update submodule pointer. Once the dependency of the reverted PR is resolved, it shall be re-committed.
Why I did it
How I did it
How to verify it
Which release branch to backport (provide reason below if selected)
Description for the changelog
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)