Config toggle to disable blockwise fusion#10909
Conversation
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 15 files ± 0 15 suites ±0 3h 14m 46s ⏱️ - 1m 35s Results for commit a07d7da. ± Comparison against base commit 7726d31. ♻️ This comment has been updated with latest results. |
|
I am not a fan of exposing this publicly, the reuse keys stuff goes away for DataFrames with dask-expr and I don't want to respect this option there. Disabling blockwise fusion kills the scheduler as soon as you want to scale out |
|
@fjetter @hendrikmakait opinions? IMHO this is a powerful debugging tool, at the very least. |
| assert len(graph) <= 4 | ||
|
|
||
|
|
||
| def test_blockwise_optimize_toggle(): |
There was a problem hiding this comment.
I'm a bit confused as to why this test doesn't fail with dask-expr enabled? @phofl ideas?
There was a problem hiding this comment.
This doesn't go through the dask-expr optimised, you'll end up with an un-optimized expression that is materialised and then goes through the dask/dask blockwise optimise
You would have to call ddf.optimize() if you want to go through the dask-expr optimiser. We should probably fix this FWIW in here as well
d96af31 to
a07d7da
Compare
|
I don't expect anyone to actually use this configuration. I think we do not have any kind of documentation about blockwise fusion so this public config is confusing. Even worse, I can't come up with a situation where I would recommend this to anybody. Even if a user is affected by #9888 there are other mitigation strategies that are generally more user friendly than disabling fusion. I'm also not convinced that debugging this is such a common practice that we need a proper toggle for it. For dask devs, it should be sufficient to comment this stuff out, it's two lines after all. |
Useful to debug / work around #9888