Multiple commits #3440

Merged: rhc54 merged 2 commits into openpmix:v5.0 from rhc54:cmr50/up on Nov 12, 2024

rhc54 commented on Nov 12, 2024

Fix resolve peers for other nspaces

When calling PMIx_Resolve_peers, the client may not be
able to find the job-level info for the specified nspace.
In this case, it requests the info from the server. The
server's copy of the info is sometimes stored as rank=UNDEF,
but can also be stored as rank=WILDCARD (depending on the
server's version). So the client has to check both rank
values to ensure the data is found.

Fortunately, this requires only one exchange with the server,
regardless of which rank is used.

Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit c9efd8d)
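
The rank fallback described in the commit message above can be sketched as follows. This is a minimal, self-contained model and not the actual OpenPMIx client code: `job_entry_t`, `find_entry`, `find_job_info`, the `"peers"` key, and the `RANK_*` constants are hypothetical stand-ins for illustration, and the order in which the two ranks are tried is an arbitrary choice here. The point it shows is that a single server reply is searched under both rank values, so no second exchange is needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the PMIx rank constants (values assumed). */
#define RANK_UNDEF    UINT32_MAX
#define RANK_WILDCARD (UINT32_MAX - 1)

/* Hypothetical record returned by the server for one key. */
typedef struct {
    const char *nspace;
    uint32_t    rank;
    const char *key;
    const char *value;
} job_entry_t;

/* Return the value stored under (nspace, rank, key), or NULL if absent. */
const char *find_entry(const job_entry_t *entries, size_t n,
                       const char *nspace, uint32_t rank, const char *key)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i].rank == rank &&
            0 == strcmp(entries[i].nspace, nspace) &&
            0 == strcmp(entries[i].key, key)) {
            return entries[i].value;
        }
    }
    return NULL;
}

/* Search ONE server reply under both rank values. */
const char *find_job_info(const job_entry_t *reply, size_t n,
                          const char *nspace, const char *key)
{
    const char *val = find_entry(reply, n, nspace, RANK_WILDCARD, key);
    if (NULL == val) {
        /* The server may have stored the job-level info as rank=UNDEF. */
        val = find_entry(reply, n, nspace, RANK_UNDEF, key);
    }
    return val;
}
```

Calling `find_job_info(reply, n, "child", "peers")` returns the stored value whichever of the two ranks the server used, and `NULL` if the nspace is not present in the reply at all.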

Minor cleanups plus resolve peers example

Some minor code cleanups, mostly in debug statements. Add
an example of how to use the "resolve" APIs.

IMPORTANT NOTE: when you spawn a child job and want to
collect job-level info on it, you MUST use PMIx_Connect
to share the info across parent and child jobs. Using
PMIx_Fence - even with the "collect data" flag - only
shares the data that each process "put" - it does not
include job-level data.

Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit 39adf49)
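
The IMPORTANT NOTE above can be pictured with a toy model of which data each collective hands to the other job. This is only a sketch of the visibility rule as stated in the commit message; it does not call the PMIx library. The names `item_t`, `share`, `has_key`, and the keys used in the usage note are hypothetical, and the model assumes (without confirmation from the source) that the connect path also carries "put" data.

```c
#include <stddef.h>
#include <string.h>

/* How a value entered a process's store (hypothetical classification). */
typedef enum { SRC_PUT, SRC_JOB_LEVEL } origin_t;

typedef struct {
    const char *key;
    origin_t    origin;
} item_t;

/* Toy stand-ins for PMIx_Fence with "collect data" and PMIx_Connect. */
typedef enum { OP_FENCE_COLLECT, OP_CONNECT } op_t;

/* Copy into dst the items that op shares from src; return how many were
 * copied. dst must have room for at least n items. */
size_t share(op_t op, const item_t *src, size_t n, item_t *dst)
{
    size_t copied = 0;
    for (size_t i = 0; i < n; i++) {
        /* Fence with "collect data" forwards only values a process "put". */
        if (OP_FENCE_COLLECT == op && SRC_PUT != src[i].origin) {
            continue;
        }
        /* Connect also forwards job-level data (assumed here to be in
         * addition to "put" data). */
        dst[copied++] = src[i];
    }
    return copied;
}

/* Return 1 if key is among the n items, else 0. */
int has_key(const item_t *items, size_t n, const char *key)
{
    for (size_t i = 0; i < n; i++) {
        if (0 == strcmp(items[i].key, key)) {
            return 1;
        }
    }
    return 0;
}
```

With a child store holding one `SRC_PUT` item and one `SRC_JOB_LEVEL` item, `share(OP_FENCE_COLLECT, ...)` copies only the first, while `share(OP_CONNECT, ...)` copies both, which is why a parent that needs the child's job-level info has to go through the connect path.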

rhc54 merged commit 73f30cd into openpmix:v5.0 on Nov 12, 2024
rhc54 deleted the cmr50/up branch on November 12, 2024 14:51