Skip to content

Conversation

@fanquake
Copy link
Member

os.killpg and os.getpgid are Unix only. Seen in at least #20744:

Runtime: 1 s

Traceback (most recent call last):
  File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_runner.py", line 797, in <module>
    main()
  File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_runner.py", line 446, in main
    run_tests(
  File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_runner.py", line 567, in run_tests
    os.killpg(os.getpgid(0), signal.SIGKILL)
AttributeError: module 'os' has no attribute 'killpg'

C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build>if 1 NEQ 0 exit /b 1 

@fanquake fanquake added the Tests label Sep 10, 2021
@hebasto
Copy link
Member

hebasto commented Sep 10, 2021

Just curious, why do functional tests pass in the current master (after #22922)?

UPDATE: nm, I see the reason.

@hebasto
Copy link
Member

hebasto commented Sep 10, 2021

Related: #22249 (review)

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to just not fix it. I mean who uses Windows anyway (to run the functional tests with --failfast).

At least with the error message, it well be clear that something went wrong. With this patch it will leave them dangling.

@S3RK
Copy link
Contributor

S3RK commented Sep 10, 2021

@fanquake I'm not familiar with how windows processes work. So what's the behaviour with this patch?

Will the tests continue to run and exit normally? Will the tests be killed by OS after the test runner finished? Or something else entirely?

@laanwj
Copy link
Member

laanwj commented Sep 16, 2021

I'm not familiar with how windows processes work. So what's the behaviour with this patch?

killpg sends a signal to a whole process group. windows doesnt have the concept of process groups, to do a similar thing i think you'd have to manually iterate the process tree

@fanquake
Copy link
Member Author

We can just leave this for now.

@fanquake fanquake closed this Sep 17, 2021
@ryanofsky
Copy link
Contributor

ryanofsky commented Sep 21, 2021

How should I handle a PR that seems to be failing CI because of this? Restart the job? Ignore it? Is this a race condition? I didn't read through all the related issues but I can't figure out from discussion above what seems to make this problem happen only sometimes or how to handle it. The failure is https://github.com/bitcoin/bitcoin/pull/22937/checks?check_run_id=3646048735 in #22937.

@maflcko
Copy link
Member

maflcko commented Sep 21, 2021

The error from https://cirrus-ci.com/task/4664491330764800?logs=functional_tests#L148 is

 node0 stderr Error: Specified data directory "C:\Users\ContainerAdministrator\AppData\Local\Temp\test_runner_₿_🏃_20210910_034136\rpc_help_4\node0" does not exist. 

@maflcko
Copy link
Member

maflcko commented Sep 21, 2021

The error from https://cirrus-ci.com/task/5711752030584832?logs=functional_tests#L0 is

OSError: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted

(You need to download the full log to see it)

@katesalazar
Copy link
Contributor

I mean who uses Windows anyway (to run the functional tests with --failfast).

That looks like a perfectly valid use case. I think #23085 is a legit issue.

Of course that doesn't mean that any "non Windows heads" at all
should pay much attention at it, unless for whatever reason wanting to.

Cheers!

@jamesob
Copy link
Contributor

jamesob commented Oct 21, 2021

I think this is worth merging if it would clarify underlying failures, e.g. in https://github.com/bitcoin/bitcoin/pull/23289/checks?check_run_id=3955413522 the presence of this error seems to be obscuring the actual issue (#23303).

@maflcko
Copy link
Member

maflcko commented Oct 21, 2021

It might be better to introduce a new option --fail-fast-dangling to clarify that this fails fast, but leaves everything dangling.

(Or an undocumented ENV var, if the overhead of a new option would be too much)

@katesalazar
Copy link
Contributor

an undocumented ENV var

that's great, simple, clean, traditional, and produces code easy to read

@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
@fanquake fanquake deleted the windows_no_killpg branch September 14, 2023 10:28
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants