comment fixes by ryan-williams · Pull Request #7215 · dask/dask

ryan-williams · 2021-02-14T15:50:05Z

~~Tests added / passed~~ (N/A)
Passes black dask / flake8 dask

Factored out of #6661 (DDF.iloc).

ryan-williams · 2021-02-14T17:50:13Z

hmm, dask/array/tests/test_linalg.py::test_norm_any_slice[False--1-shape1-chunks1] failed; I suspect this is a flaky test.

Is there a way I can trigger a retry of a given GHA?

jrbourbeau · 2021-02-15T16:35:02Z

Apologies for the inconvenience; test_norm_any_slice is a known flaky test that we're working to resolve (xref #7189). I just bumped CI.

ryan-williams · 2021-02-15T16:38:58Z

Thanks! I also rebased this on top of #7217 to try to get CI passing; can re-rebase once #7220 is in

jrbourbeau

Thanks for the cleanup @ryan-williams!

jrbourbeau · 2021-02-15T16:36:54Z

docs/requirements-docs.txt

@@ -1,5 +1,5 @@
 numpydoc
-sphinx
+sphinx<3.5.0


Per the discussion over in #7217

Suggested change

-sphinx<3.5.0
+sphinx

thanks, removed

jrbourbeau · 2021-02-15T16:41:10Z

dask/dataframe/io/tests/test_io.py


 def test_from_pandas_dataframe():
    a = list("aaaaaaabbbbbbbbccccccc")
+    np.random.seed(123)


This should be testing behavior that is independent of the values in the b column. Did you find that not having the seed set resulted in this test failing?

Yea, agreed that it shouldn't matter here. I added this a while back unfortunately and don't remember the specifics.

My guess is I was debugging general from_pandas failures with my iloc implementation, and found it hard to follow when stepping repeatedly in the debugger when the values weren't the same each time.

Happy to remove if it feels like a distraction (or, to file an issue for a more holistic audit that all random tests are seeded, which I feel like should be the default in general)
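To illustrate the point about seeding, here is a minimal sketch of why a fixed seed makes repeated debugger runs easier to follow. It uses the standard-library `random` module rather than NumPy so that it runs without extra dependencies; `make_column` is a hypothetical helper, not something from the dask test suite. The same idea should carry over to `np.random.seed(123)` in the test above.

```python
import random


def make_column(n, seed=None):
    """Return n pseudo-random floats (hypothetical helper for illustration)."""
    # A local generator keeps the global random state untouched.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# With the same seed, every run produces the same values.
first = make_column(5, seed=123)
second = make_column(5, seed=123)

# Without a seed, values generally differ from run to run.
unseeded = make_column(5)
```

Here `first` and `second` are identical lists, which is what makes stepping through the same test repeatedly predictable.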

jrbourbeau · 2021-02-15T16:51:06Z

dask/dataframe/io/parquet/utils.py

        assert all(
            p[:l] == basepath for p in path_parts_list
        ), "All paths must begin with the given root"
-    l = len(basepath)


It looks like l is used a few lines down in "/".join(path_parts[l:])

yea, it's set in both the if and else branches above though, so this line is redundant
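A simplified sketch of the pattern under discussion, assuming a structure similar to the one described: `l` is bound in both the `if` and the `else` branch, so a third assignment after the branches would be redundant. The function `strip_common_root` and its signature are hypothetical and are not dask's implementation.

```python
def strip_common_root(paths, root=None):
    """Split paths into a shared base and relative remainders (illustrative only)."""
    parts_list = [p.split("/") for p in paths]
    if root is None:
        # Infer the longest directory prefix shared by every path.
        basepath = parts_list[0][:-1]
        for parts in parts_list:
            j = 0
            while j < len(basepath) and j < len(parts) - 1 and basepath[j] == parts[j]:
                j += 1
            basepath = basepath[:j]
        l = len(basepath)  # bound on this branch
    else:
        basepath = root.split("/")
        l = len(basepath)  # bound on this branch too
        assert all(
            p[:l] == basepath for p in parts_list
        ), "All paths must begin with the given root"
    # No further `l = len(basepath)` is needed: both branches already set it.
    return "/".join(basepath), ["/".join(p[l:]) for p in parts_list]


base, relative = strip_common_root(["a/b/x.parquet", "a/b/c/y.parquet"])
```

Since `basepath` is not modified after either branch finishes, recomputing its length afterwards yields the same value.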

jrbourbeau · 2021-02-15T16:52:18Z

dask/dataframe/tests/test_dataframe.py

-def test_gh580():
-    df = pd.DataFrame({"x": np.arange(10, dtype=float)})
-    ddf = dd.from_pandas(df, 2)
-    assert_eq(np.cos(df["x"]), np.cos(ddf["x"]))
-    assert_eq(np.cos(df["x"]), np.cos(ddf["x"]))


Why remove this?

I started out trying to just add a comment documenting what this test was doing, since it's a bit cryptic, and ultimately decided it was stale/redundant enough to be considered "dead code".

Looking into the history (#580, #582) it seems it was verifying some workaround related to ufuncs that has since been superseded with more correct behavior. I'm not sure if the duplicated assertion is intentional or was ever significant, and I believe e.g. test_ufunc.py covers whatever is being tested here (and then some).

Happy to leave it in, but in that case maybe someone else should try their hand at documenting exactly what this test is doing? If no one can, that seems like a good motivation to remove it, I think?

ryan-williams

thanks, think I responded to everything, will force-push a new rebased version now


jsignell

This PR would be easier to review if it did one thing. Comment improvements would be merged very quickly while discussion of changes to tests could happen separately.

jsignell · 2021-02-24T16:25:10Z

dask/array/core.py

    if np.isnan(sum(map(sum, blockdims))):
        raise ValueError(
-            "Arrays chunk sizes (%s) are unknown.\n\n"
+            "Array's chunk sizes (%s) are unknown.\n\n"


Suggested change

-            "Array's chunk sizes (%s) are unknown.\n\n"
+            "Arrays' chunk sizes (%s) are unknown.\n\n"
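For context on the check that guards this message, the following sketch shows why `sum(map(sum, blockdims))` is enough to detect an unknown chunk size: a single NaN anywhere in the nested tuples propagates through both sums. It uses `math.isnan` from the standard library in place of `np.isnan`, and `check_known_chunks` with its message text is a hypothetical stand-in, not the function in dask/array/core.py.

```python
import math


def check_known_chunks(blockdims):
    """Raise if any chunk size is NaN (hypothetical stand-in for illustration)."""
    # Sum each dimension's chunk sizes, then sum the per-dimension totals.
    total = sum(map(sum, blockdims))
    if math.isnan(total):
        # Wrap blockdims in a 1-tuple so %-formatting treats it as one value.
        raise ValueError("Chunk sizes %s are unknown." % (blockdims,))
    return total


known = check_known_chunks(((2, 2), (3,)))
```

Passing something like `((2, float("nan")), (3,))` would raise `ValueError`, because NaN plus any number is still NaN.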

ryan-williams · 2021-03-14T23:38:18Z

alright, I trimmed this to just the comment typo fixes. I'll send a PR w/ the dead code removal, but defer to y'all about whether the "obsolete test" removal / random-test-seeding is worth the PR noise.

ryan-williams · 2021-03-14T23:52:25Z

hmm this looks like a spurious failure? pip failed to resolve some dep? Can someone try running it again? (I dl'd a full log just in case)

Run pre-commit/action@v2.0.0
install pre-commit
  /opt/hostedtoolcache/Python/3.9.2/x64/bin/pip install pre-commit
  Collecting pre-commit
    Downloading pre_commit-2.11.1-py2.py3-none-any.whl (187 kB)
  Collecting virtualenv>=20.0.8
    Downloading virtualenv-20.4.2-py2.py3-none-any.whl (7.2 MB)
  Collecting toml
    Downloading toml-0.10.2-py2.py3-none-any.whl (16 kB)
  Collecting pyyaml>=5.1
    Downloading PyYAML-5.4.1-cp39-cp39-manylinux1_x86_64.whl (630 kB)
  ERROR: Exception:
  Traceback (most recent call last):
    File "/opt/hostedtoolcache/Python/3.9.2/x64/lib/python3.9/site-packages/pip/_vendor/resolvelib/resolvers.py", line 171, in _merge_into_criterion
      crit = self.state.criteria[name]
  KeyError: 'pyyaml'

jrbourbeau

Thanks @ryan-williams!

ryan-williams mentioned this pull request Feb 14, 2021

Spurious Documentation GHA failures? #7216

Closed

ryan-williams force-pushed the nits branch from 146ac6c to 047c3e7 Compare February 14, 2021 16:48

jrbourbeau reviewed Feb 15, 2021

ryan-williams commented Feb 16, 2021

ryan-williams force-pushed the nits branch 3 times, most recently from eacad02 to 0cf1d05 Compare February 17, 2021 20:07

ryan-williams mentioned this pull request Feb 17, 2021

WIP: DataFrame.iloc implementation (backed by partition_sizes member) #6661

Closed

14 tasks

ryan-williams force-pushed the nits branch from 0cf1d05 to 84ce373 Compare February 20, 2021 03:33

jsignell reviewed Feb 24, 2021

Base automatically changed from master to main March 8, 2021 20:19

ryan-williams force-pushed the nits branch from 84ce373 to c5fc6a9 Compare March 14, 2021 23:36

ryan-williams mentioned this pull request Mar 14, 2021

dead code removal / fixes #7388

Merged

3 tasks

ryan-williams mentioned this pull request Mar 15, 2021

ddf.iloc part 1: add partition_sizes member (typically unset), testing utils #7396

Closed

3 tasks

ryan-williams changed the title ~~comment fixes, dead code removal~~ comment fixes Mar 15, 2021

comment typo fixes

1b83a53

ryan-williams force-pushed the nits branch from c5fc6a9 to 1b83a53 Compare March 15, 2021 17:37

jrbourbeau approved these changes Mar 15, 2021

jrbourbeau merged commit a1187b1 into dask:main Mar 15, 2021

ryan-williams deleted the nits branch March 15, 2021 18:35
