TL;DR: Two different setups of Omicron sharing the same network can exchange prefixes via ddm, and each can "try to induct the other" into its view of the rack.
As mentioned in #1639, it seems that two separate instantiations of RSS can try to communicate with an overlapping set of sleds, and that may not be desired.
Here's a timeline:
- Sled 1 starts RSS, is configured to boot with a rack secret threshold of "1". Sled agent comes online.
- Sled 2 starts RSS, is configured to boot with a rack secret threshold of "1".
- Sled 2 sees sled 1, advertised by ddmd. This RSS creates a plan that includes both sled 1 and 2 in the view of the "rack".
- Sled 2's RSS sends a "start sled agent" request to sled 1. This request fails, because sled 1 has already booted its sled agent by itself. (This was a very real error, seen here - thank you @iliana for noticing!)
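
To make the timeline concrete, here is a toy model of it in Rust. It is only a sketch: `Sled`, `StartResult`, `make_plan`, and `replay_timeline` are hypothetical names invented for illustration and are not Omicron or ddm APIs. The only behavior it assumes is what the timeline describes: an RSS instance plans over every sled it sees advertised, and a sled whose agent is already running rejects a second start request.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a sled's identity; not an Omicron type.
type SledId = u32;

/// Outcome of asking a sled to start its sled agent (illustrative only).
#[derive(Debug, PartialEq)]
enum StartResult {
    Started,
    AlreadyRunning,
}

/// Toy sled: tracks only whether its sled agent is up.
struct Sled {
    id: SledId,
    agent_running: bool,
}

impl Sled {
    /// A second "start sled agent" request fails if the agent is already up.
    fn start_sled_agent(&mut self) -> StartResult {
        if self.agent_running {
            StartResult::AlreadyRunning
        } else {
            self.agent_running = true;
            StartResult::Started
        }
    }
}

/// An RSS instance plans over itself plus every peer it sees advertised.
/// Nothing here checks whether a peer already belongs to another rack.
fn make_plan(own: SledId, advertised: &[SledId]) -> BTreeSet<SledId> {
    let mut plan: BTreeSet<SledId> = advertised.iter().copied().collect();
    plan.insert(own);
    plan
}

/// Replays the timeline; returns sled 2's plan and what sled 1 answered.
fn replay_timeline() -> (BTreeSet<SledId>, StartResult) {
    // Sled 1 runs RSS with a threshold of 1, sees no peers, and starts
    // its own sled agent.
    let mut sled1 = Sled { id: 1, agent_running: false };
    let plan1 = make_plan(sled1.id, &[]);
    assert_eq!(plan1.len(), 1);
    assert_eq!(sled1.start_sled_agent(), StartResult::Started);

    // Sled 2 runs its own RSS, sees sled 1 advertised, and includes it
    // in its plan for the "rack".
    let plan2 = make_plan(2, &[sled1.id]);

    // Sled 2's RSS asks sled 1 to start its sled agent; sled 1 refuses
    // because the agent is already running.
    let response = sled1.start_sled_agent();
    (plan2, response)
}
```

Calling `replay_timeline()` yields a plan containing both sleds and a `StartResult::AlreadyRunning` response from sled 1, which mirrors the failure in the last step above.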
Some ideas for mitigating:
foo, but you're asking me to join rackbar.