Override tokenize.ensure-deterministic config flag#10913
Override tokenize.ensure-deterministic config flag#10913crusaderky merged 3 commits intodask:mainfrom
Conversation
7ce25e9 to
aadcd40
Compare
aadcd40 to
a176d83
Compare
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 15 files ± 0 15 suites ±0 3h 23m 44s ⏱️ + 6m 3s Results for commit 10cdfce. ± Comparison against base commit 328d074. ♻️ This comment has been updated with latest results. |
| # tokenize.ensure-deterministic flag, potentially overridden by tokenize() | ||
| _ensure_deterministic: ContextVar[bool] = ContextVar("_ensure_deterministic") | ||
|
|
||
| # Circular reference breaker used by _normalize_seq_func. | ||
| # This variable is recreated anew every time you call tokenize(). Note that this means | ||
| # that you could call tokenize() from inside tokenize() and they would be fully | ||
| # independent. | ||
| # | ||
| # It is a map of {id(obj): (<first seen incremental int>, obj)} which causes an object | ||
| # to be tokenized as ("__seen", <incremental>) the second time it's encountered while | ||
| # traversing collections. A strong reference to the object is stored in the context to | ||
| # prevent ids from being reused by different objects. | ||
| _seen: ContextVar[dict[int, tuple[int, object]]] = ContextVar("_seen") |
There was a problem hiding this comment.
This seems overly complex. Why isn't a single ContextVar sufficient?
There was a problem hiding this comment.
I only noticed now that this was already introduced in #10876
dask/tests/test_tokenize.py
Outdated
| # Test ctx variable cleanup | ||
| with pytest.raises(LookupError): | ||
| _ensure_deterministic.get() |
There was a problem hiding this comment.
This test is about ufuncs. Asserting on this ctxvar feels odd.
There was a problem hiding this comment.
The test may fail if I were to tamper with the ctxvar without using the helper function _maybe_raise_nondeterministic
There was a problem hiding this comment.
This feels like it would be useful as a comment.
a176d83 to
2bef3a3
Compare
dask/tests/test_tokenize.py
Outdated
| # Test ctx variable cleanup | ||
| with pytest.raises(LookupError): | ||
| _ensure_deterministic.get() |
There was a problem hiding this comment.
This feels like it would be useful as a comment.
| assert tokenize(inc, ensure_deterministic=False) != tokenize( | ||
| inc, ensure_deterministic=False | ||
| ) |
There was a problem hiding this comment.
Should this be nested within the block above?
There was a problem hiding this comment.
No. These 3 tests use the same pattern:
with dask.config.set({"tokenize.ensure-deterministic": False}):
tokenize(...)
with dask.config.set({"tokenize.ensure-deterministic": True}):
tokenize(...)
tokenize(..., ensure_determinsitic=False)
tokenize(..., ensure_deterministic=True)This makes them impervious to whatever the config is set to globally.
There was a problem hiding this comment.
My bad, I attributed the inline comment only to this test case, not the one below.
There was a problem hiding this comment.
I also wasn't sure if this should test overriding "whatever" the config is set to globally or overriding any possible config value.
|
@hendrikmakait @fjetter all comments have been addressed |
hendrikmakait
left a comment
There was a problem hiding this comment.
Thanks, @crusaderky! LGTM assuming CI is green
tokenize(ensure_deterministic=True)to force raising exceptiontest_tokenize.pywith either value for the config variable_seenvariable