Unskip all but `test_redeploy_single_replica` in `test_deploy.py` by czgdp1807 · Pull Request #21391 · ray-project/ray

czgdp1807 · 2022-01-05T09:47:36Z

Why are these changes needed?

Related issue number

Checks

I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

czgdp1807 · 2022-01-05T09:50:10Z

python/ray/serve/tests/test_deploy.py

+        rets = ret.split("|")
+        if len(rets) == 2:
+            return rets[0], rets[1]
+        elif len(rets) == 1:
+            return rets[0], "<NULL>"
+        else:
+            return "<NULL>", "<NULL>"


An IndexError was being raised in one of the processes executing this task. I added this to handle the corner cases. I can create a utility function to do this and replace return ret.split("|")[0], ret.split("|")[1] with a call to that function. Let me know. cc: @pcmoritz @simon-mo
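A minimal sketch of the utility function proposed above, assuming the same "<NULL>" sentinel as the diff. The name split_response and its sep parameter are hypothetical and do not appear in the PR.

```python
from typing import Tuple

NULL = "<NULL>"  # sentinel string taken from the diff above


def split_response(ret: str, sep: str = "|") -> Tuple[str, str]:
    """Hypothetical helper: split "val|pid" into a pair, padding with NULL."""
    parts = ret.split(sep)
    if len(parts) == 2:
        return parts[0], parts[1]
    if len(parts) == 1:
        # No separator found, e.g. an error message instead of "val|pid".
        return parts[0], NULL
    # More than two fields: treat the whole response as unusable.
    return NULL, NULL
```

With this shape, a response without a separator comes back as (message, "<NULL>"), so callers still have to filter on the second element.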

Do you happen to remember what value ret was? Was the IndexError caused because ret.split("|")[1] failed? I want to make sure that the corner case wasn't actually an issue with the test's correctness.

> I want to make sure that the corner case wasn't actually an issue with the test's correctness.

Yes. I will re-run the tests tomorrow (without this change) and post the logs here.

File "python\ray\_raylet.pyx", line 640, in ray._raylet.execute_task
    outputs = function_executor(*args, **kwargs)
File "C:\Users\gagan\ray_project\ray\python\ray\serve\tests\test_deploy.py", line 675, in call
    return ret.split("|")[0], ret.split("|")[1]
IndexError: list index out of range
pid=2196) 2022-01-06 08:14:44,896 INFO deployment_state.py:939 -- Removing 4 replicas from deployment 'test'. component=serve deployment=test
pid=2588) ret: Path '/test' not found. Please ping http://.../-/routes for route table.
pid=7964) ret: 2|564
pid=3992) ret: Path '/test' not found. Please ping http://.../-/routes for route table.
pid=4020) ret: Path '/test' not found. Please ping http://.../-/routes for route table.
pid=11184) ret: 2|7256
pid=10512) ret: Path '/test' not found. Please ping http://.../-/routes for route table.
pid=4988) ret: Path '/test' not found. Please ping http://.../-/routes for route table.
pid=7044) ret: 2|7076
pid=6088) ret: 2|9404
pid=7968) ret: 2|9404
ray\serve\tests\test_deploy.py::test_redeploy_scale_up[False]

I think the message "Path '/test' not found. Please ping http://.../-/routes for route table." means that the deployment had not completed when requests.get was executed and accessed "http://localhost:8000/test". I think I should also return timestamps for debugging purposes. That might help in figuring out when the v1 and v2 objects were deployed and when requests.get actually accessed "http://localhost:8000/test". We also need to check whether this "not found" response affects the test at all.

P.S. Could this be the result of another race condition?
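One way the timestamp idea could look, as a rough sketch only. The wrapper timed_call is hypothetical, and a lambda stands in for the real requests.get(...).text call so the example runs without a server.

```python
import time
from typing import Callable, Tuple


def timed_call(fetch: Callable[[], str]) -> Tuple[str, float, float]:
    """Hypothetical wrapper: return the response text plus wall-clock
    timestamps taken just before and just after the request."""
    started = time.time()
    text = fetch()
    finished = time.time()
    return text, started, finished


# Stand-in for requests.get("http://localhost:8000/test").text
text, started, finished = timed_call(lambda: "2|564")
```

Comparing these timestamps with the times at which v1 and v2 were deployed would show whether a "not found" response arrived before the route existed.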

czgdp1807 · 2022-01-05T18:06:02Z

python/ray/serve/tests/test_deploy.py

        responses = defaultdict(set)
        start = time.time()
-        while time.time() - start < 30:
+        while time.time() - start < 200:


I can make this change Windows-specific. Let me know if the idea behind this bump is acceptable. Windows is slow, so maybe not much can be done about this other than increasing the time limits.

Yep, the bump is OK. Can we make it Windows specific?

Sure. I will do that.
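A Windows-specific limit could be selected along these lines. This is a sketch, not the code that landed in the PR: the helper pick_timeout and its parameters are hypothetical, and the 30 and 200 values are the ones from the diff above.

```python
import sys


def pick_timeout(default_s: int = 30, windows_s: int = 200,
                 platform: str = sys.platform) -> int:
    """Hypothetical helper: use the larger limit only on Windows."""
    # sys.platform is "win32" on Windows, for 32-bit and 64-bit builds alike.
    return windows_s if platform == "win32" else default_s


timeout_value = pick_timeout()
```

The polling loop would then read `while time.time() - start < timeout_value:` on every platform.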

shrekris-anyscale · 2022-01-05T18:45:05Z

python/ray/serve/tests/test_deploy.py

                }).text

-        return ret.split("|")[0], ret.split("|")[1]
+        rets = {key: value for key, value in enumerate(ret.split("|"))}


What's the reason we need to use a dict comprehension and enumerate here instead of using split directly? Is it a Windows-related change?

N.B. - #21391 (comment)

I have explained the reason for this change in the above comment. I used a dict to keep this to a two-line change. I was using if/elif checks earlier.

Got it, I added a question to that thread. I want to make sure the corner case doesn't affect the test's correctness.

shrekris-anyscale · 2022-01-05T18:50:48Z

python/ray/serve/tests/test_deploy.py

        responses = defaultdict(set)
        start = time.time()
-        while time.time() - start < 30:
+        while time.time() - start < 200:


Yep, the bump is OK. Can we make it Windows specific?

shrekris-anyscale · 2022-01-05T18:55:28Z

python/ray/serve/tests/test_deploy.py

        responses = defaultdict(set)
        start = time.time()
-        while time.time() - start < 30:
+        while time.time() - start < 100:


Could we make this Windows-specific also?

shrekris-anyscale · 2022-01-05T18:55:41Z

python/ray/serve/tests/test_deploy.py

        responses = defaultdict(set)
        start = time.time()
-        while time.time() - start < 30:
+        while time.time() - start < 200:


Could we make this Windows-specific also?

shrekris-anyscale · 2022-01-05T18:56:04Z

python/ray/serve/tests/test_deploy.py

        responses = defaultdict(set)
        start = time.time()
-        while time.time() - start < 30:
+        while time.time() - start < 100:


Could we make this Windows specific?

czgdp1807 · 2022-01-06T09:49:41Z

python/ray/serve/tests/test_deploy.py

                val, pid = ray.get(ref)
-                responses[val].add(pid)
+                if val != "<NULL>" and pid != "<NULL>":
+                    responses[val].add(pid)


I have added this check so that the behaviour from before this change is retained without raising an IndexError.

czgdp1807 · 2022-01-06T16:26:13Z

With the timeout limit set to 500, all the tests in test_deploy.py pass on Windows.

simon-mo · 2022-01-07T20:43:30Z

The timeout sounds good. However, I'm worried about the readability of the change.

The not-found error can be detected programmatically via requests.get().status_code == 404. In that case, you can just return (None, None) instead of a special value, and skip the entry when accumulating responses[val].add(pid).
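The suggestion above might be sketched as follows. FakeResponse and parse are hypothetical stand-ins so the example runs without a server. In the real test the object would come from requests.get, which exposes the same status_code and text attributes.

```python
from collections import defaultdict
from typing import Optional, Tuple


class FakeResponse:
    """Stand-in for requests.Response with only the fields used here."""

    def __init__(self, status_code: int, text: str):
        self.status_code = status_code
        self.text = text


def parse(resp) -> Tuple[Optional[str], Optional[str]]:
    """Return (val, pid), or (None, None) for an unusable response."""
    if resp.status_code != 200:
        return None, None
    parts = resp.text.split("|")
    if len(parts) != 2:
        return None, None
    return parts[0], parts[1]


responses = defaultdict(set)
samples = [
    FakeResponse(404, "Path '/test' not found."),
    FakeResponse(200, "2|564"),
    FakeResponse(200, "2|7256"),
]
for resp in samples:
    val, pid = parse(resp)
    if val is None:
        continue  # skip failed requests instead of recording a sentinel
    responses[val].add(pid)
```

Returning None rather than a sentinel string means a real response body can never be mistaken for the "missing" marker.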

czgdp1807 · 2022-01-08T09:16:29Z

python/ray/serve/tests/test_deploy.py

+            ret_candidate = ret.text.split("|")
+
+            # Set return value to (None, None)
+            # if status_code is 404 or if there are
+            # not enough values to return
+            if ret.status_code == 404 or len(ret_candidate) != 2:
+                ret = (None, None)
+            else:
+                ret = ret_candidate
+
+        return ret[0], ret[1]


Let me know if this approach looks good. I will apply this to other test cases as well.

This approach looks good to me. cc @simon-mo

simon-mo

There are still <NULL> values in several places. They should be changed from

rets = {key: value for key, value in enumerate(ret.split("|"))}
return rets.get(0, "<NULL>"), rets.get(1, "<NULL>")

to

if requests.get(f"http://localhost:8000/{name}").status_code != 200:
    return None, None

simon-mo · 2022-01-14T18:28:20Z

Please ping when this is ready to merge! Thank you

czgdp1807 · 2022-01-14T19:31:57Z

Windows and lint test pass on my latest commit.

czgdp1807 · 2022-01-21T12:49:23Z

@simon-mo Is there anything else to be done here? Lint tests are failing across all my PRs (and, I think, other PRs too). It is due to some SphinxError.

bveeramani · 2022-01-30T05:14:52Z

‼️ ACTION REQUIRED ‼️

We've switched our code formatter from YAPF to Black (see #21311).

To prevent issues with merging your code, here's what you'll need to do:

Install Black

pip install -I black==21.12b0

Format changed files with Black

curl -o format-changed.sh https://gist.githubusercontent.com/bveeramani/42ef0e9e387b755a8a735b084af976f2/raw/7631276790765d555c423b8db2b679fd957b984a/format-changed.sh
chmod +x ./format-changed.sh
./format-changed.sh
rm format-changed.sh

Commit your changes.

git add --all
git commit -m "Format Python code with Black"

Merge master into your branch.

git pull upstream master

Resolve merge conflicts (if necessary).

After running these steps, you'll have the updated format.sh.

simon-mo

LGTM

simon-mo · 2022-02-01T18:31:41Z

@czgdp1807 Would you mind performing the code style change and merging master? The diff looks good to me and we just need to get it merged.

Bumped timeout and IndexError

68a1f79

czgdp1807 added the windows and testing (topics about testing) labels Jan 5, 2022

Unskipped all but one in test_deploy.py

eb62639

czgdp1807 changed the title ~~Unskip test_redeploy_scale_down~~ Unskip all but test_redeploy_single_replica in test_deploy.py Jan 5, 2022

simon-mo assigned shrekris-anyscale Jan 5, 2022

shrekris-anyscale reviewed Jan 5, 2022

czgdp1807 added 3 commits January 6, 2022 00:32

Made timeout values windows specific

7fb41e0

Don't add <NULL> values to responses

04189a6

Bump timeout_value to 500 for all

90b9c6c

simon-mo requested a review from shrekris-anyscale January 6, 2022 18:08

czgdp1807 added 2 commits January 8, 2022 14:44

Modified corner case handling

ecd5968

Applied linting changes

e7eb065

simon-mo assigned simon-mo and unassigned shrekris-anyscale Jan 12, 2022

simon-mo requested changes Jan 12, 2022

czgdp1807 added 5 commits January 13, 2022 15:50

Simplified checks

955b811

Syntactical corrections

1039ef5

NULL -> None checks

bc96a10

Merge branch 'master' into deploy_601

86cee73

Removed redefined import

74ef08b

simon-mo added the @external-author-action-required Alternate tag for PRs where the author doesn't have labeling permission. label Jan 14, 2022

simon-mo removed the @external-author-action-required Alternate tag for PRs where the author doesn't have labeling permission. label Jan 14, 2022

Merge branch 'master' into deploy_601

9e1f12a

Merge branch 'master' into deploy_601

e292af7

simon-mo approved these changes Feb 1, 2022

czgdp1807 added 2 commits February 2, 2022 12:31

resolved conflicts

4dab816

Format Python code with Black

af9a045

czgdp1807 requested a review from simon-mo February 2, 2022 10:09

simon-mo merged commit 000c56f into ray-project:master Feb 9, 2022

architkulkarni added a commit that referenced this pull request Feb 10, 2022

Revert "[Serve] [Windows] Unskip all but `test_redeploy_single_replic…

88583ef

…a` in `test_deploy.py` (#21391)" This reverts commit 000c56f.

architkulkarni mentioned this pull request Feb 10, 2022

Revert "Unskip all but test_redeploy_single_replica in test_deploy.py" #22299

Merged

simon-mo pushed a commit that referenced this pull request Feb 10, 2022

Revert "[Serve] [Windows] Unskip all but `test_redeploy_single_replic…

94f73de

…a` in `test_deploy.py` (#21391)" (#22299) This reverts commit 000c56f.

simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Feb 27, 2022

[Serve] [Windows] Unskip all but test_redeploy_single_replica in `t…

fc3df7b

…est_deploy.py` (ray-project#21391)

simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Feb 27, 2022

Revert "[Serve] [Windows] Unskip all but `test_redeploy_single_replic…

0b44c91

…a` in `test_deploy.py` (ray-project#21391)" (ray-project#22299) This reverts commit 000c56f.

simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Mar 8, 2022

[Serve] [Windows] Unskip all but test_redeploy_single_replica in `t…

f240003

…est_deploy.py` (ray-project#21391)

simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Mar 8, 2022

Revert "[Serve] [Windows] Unskip all but `test_redeploy_single_replic…

48c4046

…a` in `test_deploy.py` (ray-project#21391)" (ray-project#22299) This reverts commit 000c56f.
