Commit f577913
authored
Avoid sorting large stacks in order (#3298)
When performning task ordering we sort tasks based on the
number of dependents/dependencies they have. This is critical to
low-memory processing.
However, sometimes individual tasks have millions of dependencies,
for which an n*log(n) sort adds significant overhead. In these cases
we give up on sorting, and just hope that the tasks are well ordered
naturally (such as is often the case in Python 3.6+ due to sorted
dicts and the natural ordering that exists when constructing common
graphs)
See pangeo-data/pangeo#150 (comment)
for a real-world case1 parent 10c46a1 commit f577913
2 files changed
Lines changed: 11 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
109 | 109 | | |
110 | 110 | | |
111 | 111 | | |
112 | | - | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
113 | 116 | | |
114 | 117 | | |
115 | 118 | | |
| |||
120 | 123 | | |
121 | 124 | | |
122 | 125 | | |
123 | | - | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
124 | 129 | | |
125 | 130 | | |
126 | 131 | | |
| |||
130 | 135 | | |
131 | 136 | | |
132 | 137 | | |
133 | | - | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
134 | 141 | | |
135 | 142 | | |
136 | 143 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
| 33 | + | |
33 | 34 | | |
34 | 35 | | |
35 | 36 | | |
| |||
0 commit comments