csegarragonz commented Mar 18, 2024

turns out that #827 uncovered a bug in our migration code.

in summary, when migrating ranks to a completely new host, the check for an existing world from rank 0 would fail, triggering a new world "creation".

my bet is that the test slurp included in #827 worked because we migrated other ranks before rank 0, so the world already existed on the new host when rank 0 checked for it. the error appeared with larger core counts.
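
to make the ordering dependence concrete, here is a toy model of the failure mode described above. it is only a sketch under assumptions, not the project's actual code: `HostWorldRegistry`, `on_rank_start`, `is_migration` and `creations_on_new_host` are all hypothetical names, and the real fix in this PR may work differently. the assumed rule is "rank 0 creates the world if it is not known locally, any other rank joins the existing one".

```python
from dataclasses import dataclass, field


@dataclass
class HostWorldRegistry:
    """Hypothetical per-host table of the worlds known locally."""

    worlds: dict = field(default_factory=dict)
    created: list = field(default_factory=list)  # world ids created from scratch

    def _create(self, world_id):
        # Fresh creation: this is what must NOT happen during a migration
        self.worlds[world_id] = set()
        self.created.append(world_id)

    def _join_existing(self, world_id):
        # Set up local state for a world that already runs elsewhere
        self.worlds[world_id] = set()

    def on_rank_start(self, world_id, rank, is_migration=False):
        if world_id not in self.worlds:
            # Assumed rule: only a non-migrated rank 0 may create a world
            if rank == 0 and not is_migration:
                self._create(world_id)
            else:
                self._join_existing(world_id)
        self.worlds[world_id].add(rank)


def creations_on_new_host(arrival_order, honour_migration):
    """Count fresh world creations on an empty host for a given arrival order.

    honour_migration=False models the bug: the host cannot tell that
    rank 0 is a migrated rank rather than the first rank of a new world.
    """
    host = HostWorldRegistry()
    for rank in arrival_order:
        host.on_rank_start(1, rank, is_migration=honour_migration)
    return len(host.created)
```

in this model, `creations_on_new_host([1, 0], False)` sees no spurious creation because rank 1 has already set up the world locally, while `creations_on_new_host([0, 1], False)` does. passing `honour_migration=True` removes the dependence on arrival order.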

csegarragonz marked this pull request as ready for review March 18, 2024 17:11
csegarragonz merged commit 86bdbf5 into main Mar 19, 2024
csegarragonz deleted the migration-fix branch March 19, 2024 18:13