ARROW-17381: [C++][Acero] Centralize error handling in ExecPlan #13848
Conversation
Force-pushed 9dbb7e2 to 25dbf30
Force-pushed 25dbf30 to 2cdc239
westonpace left a comment:
I like the cleanup, this is definitely simplifying ExecNode/ExecPlan. I have some initial thoughts.
// COMMIT cd5346e14450d7e5ca156acb4c2f396885c77aa0
Eventually this case will go away
Why does the order no longer matter here?
Wouldn't a more appropriate place to trigger EndTaskGroup be when InputFinished is received on all sinks?
EndTaskGroup has a nice property: it ends when it runs out of tasks to perform. Here's the comment:
/// It is allowed for tasks to be added after this call provided the future has not yet
/// completed. This should be safe as long as the tasks being added are added as part
/// of a task that is tracked. As soon as the count of running tasks reaches 0 this
/// future will be marked complete.
So we will end when all of the tasks have finished running and no new tasks have been scheduled.
Wouldn't we call node->Abort when we transition to aborted_ = true?
We want to avoid any possible race conditions while aborting/doing cleanup and running tasks, so it's only safe to Abort when we're sure that no other tasks are running.
Very happy to see this move into the base class.
At this point maybe we should just move the body of DoProject into this method?
Can this be a default implementation for ExecNode::InputFinished?
Yeah, it probably can be. Actually, this span handling is a bit broken in general right now, because we don't enforce that InputFinished is called after all batches have been output. InputFinished merely specifies the total number of batches that will be output. For example, scalar aggregates only ever output one row, so InputFinished is called in StartProducing, and a project above a scalar aggregate node would have its span ended immediately.
Can we keep the comment?
What does Abort execution mean for a node? In theory all "execution" is handled via the scheduler so does a node really need to do anything here? Why ExecNode::Abort instead of doing the cleanup in the ExecNode destructor?
@zagto do you mind taking a look at this when you get a chance?
zagto left a comment:
Nice work. I love seeing the code become cleaner and easier to understand.
I don't think this std::move does anything, given that status is a const reference.
If we get a non-ok status here, would that mean we just abort while discarding the Status/message? This seems confusing to the user. Maybe we could have an ExecPlan::Abort(Status) that adds the status to ExecPlanImpl::errors_?
- auto values = batch.values;
+ auto values = std::move(batch.values);
Was this intentional?
Why do we need 3 calls to SleepABit? Probably because one may not be enough on slower systems, but I think a comment would be helpful here.
Force-pushed 1cc334d to 279bf83
Force-pushed 279bf83 to 1c75db4
westonpace left a comment:
Are you interested in dusting this off and rebasing now that the previous cleanup has merged?
/// \brief Stop producing definitively to a single output
///
/// This call is a hint that an output node has completed and is not willing
/// to receive any further data.
virtual void StopProducing(ExecNode* output) = 0;
I've since learned that this is still needed. This covers the case where a LIMIT X node is placed on one branch of a query. It is intended to stop part of the plan but not abort the entire plan. Do you think we can leave it in?
@save-buffer are you interested in rebasing this?
Closing because it has been untouched for a while; in case it's still relevant, feel free to reopen and move it forward 👍