Bug in work stealing scheduler policy causing idle workers #1046

@robgjansen

Description

The work-stealing scheduling policy occasionally causes some worker threads to sit idle even though there is outstanding work to perform. This is item 1 in #1041; more details below.

There is a bug in the logic that is run at the beginning of each simulation round. When the round changes, each thread moves all of its assigned hosts from its processed queue back into the unprocessed queue so that the work associated with those hosts in the new round can proceed. There is a race condition here where thread A has nothing to do (no hosts assigned) and tries to start stealing a host from thread B's unprocessed queue before thread B has repopulated it for the new round. Thread A then thinks all the work is already done and sits idle, while thread B is stuck with a full queue of hosts that it must process.

Here is a simple program to create a ~5 second CPU workload (on my machine):

#include <stdint.h>

/* Spin for roughly 5 seconds of CPU time (machine-dependent). */
int main(int argc, char* argv[]) {
    volatile uint64_t i = 0;
    while (i < 3000000000)
        i += 1;
    return 0;
}

And the shadow config:

<shadow stoptime="2">
  <topology><![CDATA[<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
  <key attr.name="packetloss" attr.type="double" for="edge" id="d4" />
  <key attr.name="latency" attr.type="double" for="edge" id="d3" />
  <key attr.name="bandwidthup" attr.type="int" for="node" id="d2" />
  <key attr.name="bandwidthdown" attr.type="int" for="node" id="d1" />
  <key attr.name="countrycode" attr.type="string" for="node" id="d0" />
  <graph edgedefault="undirected">
    <node id="poi-1">
      <data key="d0">US</data>
      <data key="d1">10240</data>
      <data key="d2">10240</data>
    </node>
    <edge source="poi-1" target="poi-1">
      <data key="d3">1.0</data>
      <data key="d4">0.0</data>
    </edge>
  </graph>
</graphml>
]]></topology>
  <plugin id="spin" path="/home/rjansen/dev/shadow/build/src/test/spin/libshadow-plugin-test-spin.so"/>
  <host id="testhost" quantity="12">
    <process plugin="spin" starttime="1" arguments=""/>
  </host>
</shadow>

If each run of main takes 5 seconds, then running 12 hosts on 12 threads should complete in 5 seconds. But sometimes it takes 2 cycles (10 seconds), because some workers sit idle while others end up processing 2 hosts instead of 1. With 12 workers I was able to observe this behavior consistently, with about 3 workers idle on average.

Metadata

Labels

- Component: Main (Composing the core Shadow executable)
- Priority: Critical (Major issue that requires immediate attention)
- Status: In Progress (Work is currently underway)
- Tag: Performance (Related to improving shadow's run-time)
- Type: Bug (Error or flaw producing unexpected results)
