Closed
Description
Background information
What version of the PMIx Reference Library are you using?
5.0.8
Describe how PMIx was installed
RPM
Please describe the system on which you are running
- Operating system/version: RHEL 9.6
- Computer hardware: VM
- Network type: N/A
Details of the problem
We have a test which uses PMIx_tool_init to attach to a currently running PMIx application, then gathers information from it with PMIx_Get. The test works fine on SLES 15 SP6, but fails on RHEL 9.6, with the following error:
PMIx_Get pmix.lpeers failed: -46
The PMIx library and our code were both built from the same source. I'm wondering what could explain the different behavior between SLES and RHEL.
With some PMIx debugging enabled, I'm seeing this output on SLES (working):
[sles15sp6:18719] [pals.pmix.f1323cf0-4a01-4d2f-bc6a-1ca67964c5a6.(null),0] pmix:gds:hash fetch pmix.lpeers for proc [pals.pmix.f1323cf0-4a01-4d2f-bc6a-1ca67964c5a6.(null),UNDEF] on scope UNDEFINED on behalf of [pals.pmix.f1323cf0-4a01-4d2f-bc6a-1ca67964c5a6.(null),0]
[sles15sp6:18719] FETCHING NODE INFO WITH KEY pmix.lpeers
[sles15sp6:18719] [server/pmix_server_get.c:797] GDS FETCH KV WITH hash
[sles15sp6:18719] [pals.pmix.f1323cf0-4a01-4d2f-bc6a-1ca67964c5a6.(null),0] pmix:gds:hash fetch NULL for proc [pals.pmix.f1323cf0-4a01-4d2f-bc6a-1ca67964c5a6.(null),WILDCARD] on scope STORE INTERNALLY on behalf of [pals.pmix.f1323cf0-4a01-4d2f-bc6a-1ca67964c5a6.(null),0]
[sles15sp6:18719] [pals.pmix.f1323cf0-4a01-4d2f-bc6a-1ca67964c5a6.(null),0] HASH:FETCH table internal id WILDCARD key NULL
[sles15sp6:18719] [pals.pmix.f1323cf0-4a01-4d2f-bc6a-1ca67964c5a6.(null),0] FETCH NULL LOOKING AT PMIX_SPAWNED
And this output on RHEL (failing):
[rhel96.novalocal:220463] [pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0] pmix:gds:hash fetch pmix.lpeers for proc [pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),UNDEF] on scope UNDEFINED on behalf of [pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0]
[rhel96.novalocal:220463] FETCHING NODE INFO WITH KEY pmix.lpeers
[rhel96.novalocal:220463] [server/pmix_server_get.c:514] GDS FETCH KV WITH hash
[rhel96.novalocal:220463] [pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0] pmix:gds:hash fetch pmix.lpeers for proc [pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),UNDEF] on scope UNDEFINED on behalf of [pals.pmix.fbdc3b9a-b281-43bd-9946-3f23cdb2f5b0.(null),0]
[rhel96.novalocal:220463] FETCHING NODE INFO WITH KEY pmix.lpeers
[rhel96.novalocal:220463] [server/pmix_server.c:5041] PACK version v41 type PMIX_STATUS
[rhel96.novalocal:220463] [server/pmix_server.c:5045] queue callback called: reply to pals.pmix.fbdc3b9a-b281-43bd-9946-3f23cdb2f5b0.(null):0 on tag 109 size 2
[rhel96.novalocal:220463] pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null):0 CALLBACK COMPLETE
[rhel96.novalocal:220463] [pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0] ptl:base:send_handler SENDING TO PEER [pals.pmix.fbdc3b9a-b281-43bd-9946-3f23cdb2f5b0.(null),0] tag 109 with NON-NULL msg
[rhel96.novalocal:220463] ptl:base:send_handler SENDING MSG TO [pals.pmix.fbdc3b9a-b281-43bd-9946-3f23cdb2f5b0.(null),0] TAG 109
[rhel96.novalocal:220463] ptl:base:send_handler MSG SENT
[rhel96.novalocal:220463] [pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0] ptl:base:recv:handler called with peer pals.pmix.fbdc3b9a-b281-43bd-9946-3f23cdb2f5b0.(null):0
[rhel96.novalocal:220463] ptl:base:recv:handler allocate new recv msg
[rhel96.novalocal:220463] ptl:base:recv:handler read hdr on socket 31
[rhel96.novalocal:220463] [pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0] ptl:base:msg_recv: peer [pals.pmix.fbdc3b9a-b281-43bd-9946-3f23cdb2f5b0.(null),0] closed connection
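To make the two traces easier to compare, the `pmix:gds:hash fetch` lines can be split into key, target proc, scope, and requesting proc. The following is only a sketch: the regular expression is inferred from the log lines quoted in this report, not from any documented PMIx log format, and `parse_fetch` is a hypothetical helper name. The two sample lines are copied from the RHEL output above.

```python
import re

# Pattern inferred from the debug lines in this report (assumption: the
# format is not a documented interface and may vary between PMIx versions).
FETCH_RE = re.compile(
    r"pmix:gds:hash fetch (?P<key>\S+) for proc \[(?P<target>[^\]]+)\] "
    r"on scope (?P<scope>.+?) on behalf of \[(?P<requester>[^\]]+)\]"
)


def parse_fetch(line):
    """Return the fields of a 'pmix:gds:hash fetch' line, or None if no match."""
    m = FETCH_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # A proc is printed as "<nspace>,<rank>"; split on the last comma only.
    fields["target_nspace"] = fields["target"].rsplit(",", 1)[0]
    fields["requester_nspace"] = fields["requester"].rsplit(",", 1)[0]
    return fields


# First and second fetch lines from the RHEL trace in this report.
RHEL_LINES = [
    "[rhel96.novalocal:220463] "
    "[pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0] "
    "pmix:gds:hash fetch pmix.lpeers for proc "
    "[pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),UNDEF] "
    "on scope UNDEFINED on behalf of "
    "[pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0]",
    "[rhel96.novalocal:220463] "
    "[pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),0] "
    "pmix:gds:hash fetch pmix.lpeers for proc "
    "[pals.pmix.495f081f-4ae0-43dc-b51b-4dbd54088981.(null),UNDEF] "
    "on scope UNDEFINED on behalf of "
    "[pals.pmix.fbdc3b9a-b281-43bd-9946-3f23cdb2f5b0.(null),0]",
]

if __name__ == "__main__":
    for line in RHEL_LINES:
        f = parse_fetch(line)
        same = f["target_nspace"] == f["requester_nspace"]
        print(f["key"], f["scope"], "requester matches target nspace:", same)
```

On the two RHEL lines, the first fetch is made on behalf of the same namespace that is being queried, while the second is made on behalf of a different namespace (`pals.pmix.fbdc3b9a-...`). The SLES trace shows no fetch on behalf of a second namespace.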