Skip to content

Conversation

@exarkun
Copy link
Contributor

@exarkun exarkun commented Aug 11, 2022

until polls with fairly low resolution. If the supervisor managed to
receive exit notification and launch a new process between two consecutive
polls for is_running() == False the test would miss the stopped condition
entirely and never complete (or rather, time out after 10 seconds).

Now just wait for the supervisor to report the pid has changed and assert the
process ends in the desire state (running).

`until` polls with fairly low resolution.  If the supervisor managed to
receive exit notification and launch a new process between two consecutive
polls for `is_running() == False` the test would miss the stopped condition
entirely and never complete (or rather, time out after 10 seconds).

Now just wait for the supervisor to report the pid has changed and assert the
process ends in the desire state (running).
@crwood
Copy link
Member

crwood commented Aug 18, 2022

If the supervisor managed to receive exit notification and launch a new process between two consecutive
polls for is_running() == False the test would miss the stopped condition entirely and never complete (or rather, time out after 10 seconds).

The default restart_delay value of 1 is intended to guard against this (i.e., by enforcing a wait of restart_delay amount of seconds before re-launching), however, since that's merely a default -- and since there's no mechanism/check in place to ensure that the value of restart_delay is greater than the until polling resolution -- I'm fine with this change.

Now just wait for the supervisor to report the pid has changed and assert the process ends in the desire state (running).

I recall reading somewhere that, on Windows, PIDs can be recycled immediately (in which case this approach would not allow us to reliably determine whether the process has been re-started) however a) I can't seem to find an authoritative source for that claim now and b) having this test in place might help to us to demonstrate whether it's true -- so thanks!

@crwood crwood merged commit 28ecf7f into gridsync:master Aug 18, 2022
@exarkun exarkun deleted the more-reliable-supervisor-kill-test branch August 18, 2022 15:26
@exarkun
Copy link
Contributor Author

exarkun commented Aug 18, 2022

I recall reading somewhere that, on Windows, PIDs can be recycled immediately (in which case this approach would not allow us to reliably determine whether the process has been re-started) however a) I can't seem to find an authoritative source for that claim now and b) having this test in place might help to us to demonstrate whether it's true -- so thanks!

Ah, this is a good point. PID re-use is certainly a thing. I don't know how likely it is that a PID gets re-used in this particular scenario but maybe it's not impossible.

Sometimes folks use the combination of (pid, start time) as a more robust unique identifier. We could consider doing that kind of thing around here (maybe exposing this functionality behind another Supervisor API), especially if this new version of the test also proves to be unreliable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants