Remove redundant normalize_token variants by crusaderky · Pull Request #10884 · dask/dask

crusaderky · 2024-02-01T18:13:05Z

Now that we use pickle to tokenize unknown objects, we can remove a lot of special cases.

Note 1: performance for numpy tokenization is ensured by using pickle5 buffers in _normalize_pickle.
Note 2: I tried removing all special-case handling for pandas, but it broke gpuci. I did not spend time investigating why. It is probably a worthwhile exercise for some later point.
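As a rough illustration of Note 1, the sketch below hashes an object's pickle stream together with any protocol 5 out-of-band buffers, so that large contiguous buffers are fed to the hasher without first being copied into the stream. This is not dask's `_normalize_pickle`; the helper name `tokenize_via_pickle` and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import pickle


def tokenize_via_pickle(obj) -> str:
    """Hypothetical helper, not dask's implementation.

    Hashes the pickle stream plus any out-of-band buffers.
    """
    buffers = []
    # With protocol 5, a buffer_callback that returns a false value
    # (list.append returns None) keeps the buffer out of the pickle stream.
    payload = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)

    hasher = hashlib.sha256()  # hash choice is an assumption for this sketch
    hasher.update(payload)
    for buf in buffers:
        # raw() exposes the underlying memory as a flat, contiguous view;
        # it raises BufferError if the buffer is not contiguous.
        hasher.update(buf.raw())
    return hasher.hexdigest()


# Two buffers that differ only in their contents get different tokens,
# even though the pickle stream itself does not embed the bytes.
token_a = tokenize_via_pickle(pickle.PickleBuffer(bytearray(b"abc")))
token_b = tokenize_via_pickle(pickle.PickleBuffer(bytearray(b"abd")))
```

Objects that support out-of-band pickling hand their data to the callback in the same way, which is where the performance benefit would come from. The sketch assumes the pickle stream is deterministic for the objects being tokenized.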

github-actions · 2024-02-01T19:00:31Z

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

15 files ±0, 15 suites ±0, duration 3h 24m 5s ⏱️ (+7m 47s)
13 021 tests ±0: 12 090 ✅ passed (-1), 931 💤 skipped (+1), 0 ❌ failed (±0)
161 002 runs ±0: 144 495 ✅ passed (-17), 16 507 💤 skipped (+17), 0 ❌ failed (±0)

Results for commit d2208b5. ± Comparison against base commit 07099e5.

This pull request removes 2 and adds 2 tests. Note that renamed tests count towards both.

Removed:
dask.tests.test_tokenize ‑ test_tokenize_local_classes_from_different_contexts
dask.tests.test_tokenize ‑ test_tokenize_local_instances_from_different_contexts

Added:
dask.tests.test_tokenize ‑ test_tokenize_local_classes_from_different_contexts[False]
dask.tests.test_tokenize ‑ test_tokenize_local_classes_from_different_contexts[True]

This pull request removes 1 skipped test and adds 2 skipped tests. Note that renamed tests count towards both.

Removed (skipped):
dask.tests.test_tokenize ‑ test_tokenize_local_instances_from_different_contexts

Added (skipped):
dask.tests.test_tokenize ‑ test_tokenize_local_classes_from_different_contexts[False]
dask.tests.test_tokenize ‑ test_tokenize_local_classes_from_different_contexts[True]

♻️ This comment has been updated with latest results.

phofl · 2024-02-14T10:45:08Z

Just double checking: Should we run performance tests for this?

crusaderky · 2024-02-14T12:24:37Z

> Just double checking: Should we run performance tests for this?

Already did. No noticeable regression in the end-to-end coiled/benchmarks, and a 50-150 ms overall slowdown in the TPC-H optimizer runtime (most of it caused by #10883, I expect).

phofl · 2024-02-14T13:35:10Z

thx

crusaderky force-pushed the simpler_tokenize branch 2 times, most recently from 9febbaf to 4dff916 Compare February 1, 2024 18:30

crusaderky force-pushed the simpler_tokenize branch 6 times, most recently from b0a9ef5 to 9df700d Compare February 5, 2024 15:31

crusaderky changed the title ~~[DNM] Yank out most special handlers for tokenize~~ [DNM] Remove redundant normalize_token variants Feb 5, 2024

crusaderky mentioned this pull request Feb 5, 2024

Remove lambda tokenization hack dask/dask-expr#822

Merged

crusaderky force-pushed the simpler_tokenize branch 13 times, most recently from c70308c to 0cad4d1 Compare February 6, 2024 10:36

crusaderky mentioned this pull request Feb 6, 2024

Test numba tokenization #10896

Merged

crusaderky force-pushed the simpler_tokenize branch from 0cad4d1 to d48e75c Compare February 6, 2024 19:01

crusaderky mentioned this pull request Feb 6, 2024

Tokenization meta-issue #10905

Closed

crusaderky changed the title ~~[DNM] Remove redundant normalize_token variants~~ Remove redundant normalize_token variants Feb 6, 2024

crusaderky force-pushed the simpler_tokenize branch 2 times, most recently from 0cd8fa7 to 04fca10 Compare February 9, 2024 12:11

crusaderky added a commit to crusaderky/dask that referenced this pull request Feb 9, 2024

Remove redundant normalize_token variants (dask#10884)

a3e08fb

crusaderky self-assigned this Feb 9, 2024

crusaderky marked this pull request as ready for review February 9, 2024 15:36

crusaderky force-pushed the simpler_tokenize branch from 04fca10 to 45c0693 Compare February 13, 2024 16:47

Remove redundant normalize_token variants

d2208b5

crusaderky force-pushed the simpler_tokenize branch from 45c0693 to d2208b5 Compare February 13, 2024 17:05

phofl approved these changes Feb 14, 2024


phofl merged commit 9a1d4f1 into dask:main Feb 14, 2024

crusaderky deleted the simpler_tokenize branch February 14, 2024 15:18

phofl merged 1 commit into dask:main from crusaderky:simpler_tokenize
