Skip to content

Properly short circuit core worker Get() on exception#5672

Merged
pcmoritz merged 3 commits intoray-project:masterfrom
edoakes:short-circuit
Sep 12, 2019
Merged

Properly short circuit core worker Get() on exception#5672
pcmoritz merged 3 commits intoray-project:masterfrom
edoakes:short-circuit

Conversation

@edoakes
Copy link
Copy Markdown
Collaborator

@edoakes edoakes commented Sep 10, 2019

Why are these changes needed?

We should be short circuiting calls to Get() in the core worker when we see an exception, but this wasn't propagated out of the plasma store provider (was causing some Java test errors).

Checks

@AmplabJenkins
Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16926/
Test PASSed.

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16936/
Test FAILed.

@zhijunfu
Copy link
Copy Markdown
Contributor

LGTM, thanks. I left a minor comment.

const TaskID &task_id,
std::unordered_map<ObjectID, std::shared_ptr<RayObject>> *results) {
std::unordered_map<ObjectID, std::shared_ptr<RayObject>> *results,
bool *got_exception) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got_exception can be removed since it can be determined from the result map when there's an exception?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, but that would require iterating over the full list of results even in success cases, which I would want to avoid especially because the IsException check loops over a list of types itself.

std::string error_string = std::to_string(ray::rpc::TASK_EXECUTION_EXCEPTION);
char error_buffer[error_string.size()];
size_t len = error_string.copy(error_buffer, error_string.size(), 0);
buffers_with_exception.emplace_back(
Copy link
Copy Markdown
Contributor

@pcmoritz pcmoritz Sep 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the above just a very intricate way of doing error_string.data() below?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not, we need a non-const reference to the data. We could also do const_cast if you'd prefer.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah no, let's not do that. Probably a const version of the constructor would be in order

LocalMemoryBuffer(uint8_t *data, size_t size, bool copy_data = false)
and an immutable version of the buffer, let's leave as is for this PR.

const TaskID &task_id,
std::unordered_map<ObjectID, std::shared_ptr<RayObject>> *results) = 0;
/// \param[out] got_exception Set to true if any of the fetched results were an
/// exception. \return Status.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline before \return Status.

@pcmoritz pcmoritz merged commit 0bf79cf into ray-project:master Sep 12, 2019
@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16985/
Test FAILed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants