Skip to content

[xray] Corrects Error Handling During Push and Pull.#2059

Merged
pcmoritz merged 3 commits intoray-project:masterfrom
elibol:om_cleanup
May 18, 2018
Merged

[xray] Corrects Error Handling During Push and Pull.#2059
pcmoritz merged 3 commits intoray-project:masterfrom
elibol:om_cleanup

Conversation

@elibol
Copy link
Copy Markdown
Contributor

@elibol elibol commented May 14, 2018

This PR corrects error handling in the ObjectManager.

  1. This removes some checks so that bad statuses during a Pull are fatal.
  2. This also adds checks to certain statuses we expect never to fail during Push so that a Push either fails or returns after releasing references to objects and client connections.

Makes a bad status during Push fatal.
@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5386/
Test FAILed.

}
} else {
RAY_LOG(ERROR) << "Buffer Create Failed: " << chunk_status.second.message();
RAY_LOG(ERROR) << "Create Chunk Failed index=" << chunk_index << ": "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do "Create Chunk Failed index = " here, it's slightly prettier :)

@AmplabJenkins
Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5413/
Test PASSed.

RAY_CHECK_OK(PullSendRequest(object_id, async_conn));
Status pull_send_status = PullSendRequest(object_id, async_conn);
if (!pull_send_status.ok()) {
RAY_LOG(ERROR) << pull_send_status.message();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why log the error here, but not after the call to PullSendRequest below?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will make this a hard fail.

return status;
}
// Currently, acquiring a connection should not fail.
RAY_CHECK_OK(status);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should this not fail?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add a note about why this should not fail.

@elibol elibol changed the title Corrects Error Handling During Push and Pull. [xray] Corrects Error Handling During Push and Pull. May 16, 2018
@AmplabJenkins
Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/5440/
Test PASSed.

Copy link
Copy Markdown
Contributor

@stephanie-wang stephanie-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, but at some point we should go over all of the object manager components and determine which should really return a ray::Status.

@pcmoritz pcmoritz merged commit 3c245f6 into ray-project:master May 18, 2018
alok added a commit to alok/ray that referenced this pull request May 18, 2018
* master: (22 commits)
  [xray] Fix bug in updating actor execution dependencies (ray-project#2064)
  [DataFrame] Refactor __delitem__ (ray-project#2080)
  [xray] Better error messaging when pulling from self. (ray-project#2068)
  Use source code in hash where possible (fix ray-project#2089) (ray-project#2090)
  Functions for flushing done tasks and evicted objects. (ray-project#2033)
  Fix compilation error for RAY_USE_NEW_GCS with latest clang. (ray-project#2086)
  [xray] Corrects Error Handling During Push and Pull. (ray-project#2059)
  [xray] Sophisticated task dependency management (ray-project#2035)
  Support calling positional arguments by keyword (fix ray-project#998) (ray-project#2081)
  [DataFrame] Improve performance of iteration methods (ray-project#2026)
  [DataFrame] Implement to_csv (ray-project#2014)
  [xray] Lineage cache only requests notifications about remote parent tasks (ray-project#2066)
  [rllib] Add magic methods for rollouts (ray-project#2024)
  [DataFrame] Allows DataFrame constructor to take in another DataFrame (ray-project#2072)
  Pin Pandas version for Travis to 0.22 (ray-project#2075)
  Fix python linting (ray-project#2076)
  [xray] Fix GCS table prefixes (ray-project#2065)
  Some tests for _submit API. (ray-project#2062)
  [rllib] Queue lib for python 2.7 (ray-project#2057)
  [autoscaler] Remove faulty assert that breaks during downscaling, pull configs from env (ray-project#2006)
  ...
alok added a commit to alok/ray that referenced this pull request May 21, 2018
* master: (24 commits)
  Performance fix (ray-project#2110)
  Use flake8-comprehensions (ray-project#1976)
  Improve error message printing and suppression. (ray-project#2104)
  [rllib] [doc] Broken link in ddpg doc
  YAPF, take 3 (ray-project#2098)
  [rllib] rename async -> _async (ray-project#2097)
  fix unused lambda capture (ray-project#2102)
  [xray] Use pubsub instead of timeout for ObjectManager Pull. (ray-project#2079)
  [DataFrame] Update _inherit_docstrings (ray-project#2085)
  [JavaWorker] Changes to the build system for support java worker (ray-project#2092)
  [xray] Fix bug in updating actor execution dependencies (ray-project#2064)
  [DataFrame] Refactor __delitem__ (ray-project#2080)
  [xray] Better error messaging when pulling from self. (ray-project#2068)
  Use source code in hash where possible (fix ray-project#2089) (ray-project#2090)
  Functions for flushing done tasks and evicted objects. (ray-project#2033)
  Fix compilation error for RAY_USE_NEW_GCS with latest clang. (ray-project#2086)
  [xray] Corrects Error Handling During Push and Pull. (ray-project#2059)
  [xray] Sophisticated task dependency management (ray-project#2035)
  Support calling positional arguments by keyword (fix ray-project#998) (ray-project#2081)
  [DataFrame] Improve performance of iteration methods (ray-project#2026)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants