[SPARK-11137][Streaming] Make StreamingContext.stop() exception-safe #10807

jayadevanmurali · 2016-01-18T17:02:14Z

Make StreamingContext.stop() exception-safe

jayadevanmurali · 2016-01-18T17:18:02Z

Exceptions are handled for each operation in the stop method, so that an exception in one operation does not abort the remaining statements.

srowen · 2016-01-18T17:49:59Z

@jayadevanmurali please provide a better title/description. "Update" doesn't say what this is about. It matters because it becomes the commit message.

jayadevanmurali · 2016-01-18T18:07:56Z

@srowen sure, I've updated the title.

srowen · 2016-01-18T20:33:00Z

LGTM

SparkQA · 2016-01-18T21:33:26Z

Test build #2399 has finished for PR 10807 at commit 39bcfb5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

felixcheung · 2016-01-19T05:17:42Z

streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala

I think one of the primary goals of this JIRA is to allow partial clean-up and retry on stop() calls.

This specific code path is already written to allow a retry: the state is set to STOPPED only near the end, on line 728 of the original code.

tryLogNonFatalError swallows and logs "non-fatal" exceptions. With it added, execution can reach the line state = STOPPED despite any non-critical error being thrown. For instance, if env.metricsSystem.removeSource() throws, stop() will continue and set the state to STOPPED, at which point the caller cannot get back to the same code to retry the cleanup because of the state match case above.

Is that what we want?
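To make the concern concrete, here is a minimal sketch of the behavior being described. It is not the real StreamingContext: `StopStateSketch`, `Ctx`, `cleanupAttempts`, and the two-state model are hypothetical simplifications, and the inline try/catch only approximates what tryLogNonFatalError is described as doing above.

```scala
import scala.util.control.NonFatal

object StopStateSketch {
  sealed trait State
  case object Active extends State
  case object Stopped extends State

  // Hypothetical, simplified context; the real StreamingContext has more states and steps.
  final class Ctx(cleanup: () => Unit) {
    private var state: State = Active
    var cleanupAttempts = 0

    def stop(): Unit = synchronized {
      state match {
        case Stopped => // already stopped: a second call does nothing
        case Active =>
          cleanupAttempts += 1
          // Swallow and log non-fatal errors instead of rethrowing them.
          try cleanup()
          catch { case NonFatal(e) => Console.err.println(s"Ignoring non-fatal error: $e") }
          state = Stopped // reached even when cleanup() threw
      }
    }

    def currentState: State = synchronized(state)
  }
}
```

Under these assumptions, a context whose cleanup always throws ends up in `Stopped` after the first `stop()`, and a second `stop()` never re-runs the cleanup.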

srowen · 2016-01-19

My interpretation of the purpose was a little different. If cleanup involves running A, B and C, then we want B and C to try to run even if A fails. I didn't think we necessarily expected the caller to retry stop(), since I don't think that's usual. Yes, a fatal error will still cause the whole thing to stop, but in my mind a fatal error means lots of bets are off. This is also for consistency with how SparkContext.stop() works.
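The "A, B and C" behavior can be sketched roughly as follows. This is an illustration only: `CleanupSketch`, `tryLogNonFatal`, and the three steps are hypothetical stand-ins, with `tryLogNonFatal` assumed to behave like the tryLogNonFatalError helper discussed in this thread (log a non-fatal exception and carry on).

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

object CleanupSketch {
  // Hypothetical stand-in for the helper discussed above: run the block and
  // log, rather than rethrow, any non-fatal exception. Fatal errors still propagate.
  def tryLogNonFatal(block: => Unit): Unit =
    try block
    catch { case NonFatal(e) => Console.err.println(s"Ignoring non-fatal error: $e") }

  // Runs three hypothetical cleanup steps and returns the names of those that completed.
  def runCleanup(): List[String] = {
    val completed = ArrayBuffer.empty[String]
    tryLogNonFatal { throw new IllegalStateException("A failed") } // step A fails
    tryLogNonFatal { completed += "B" }                            // step B still runs
    tryLogNonFatal { completed += "C" }                            // step C still runs
    completed.toList
  }
}
```

Wrapping each step separately, rather than wrapping the whole sequence in one try, is what lets later steps run after an earlier one fails.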

jayadevanmurali · 2016-01-20T13:03:06Z

@srowen Thanks for the detailed comment. Can you merge the changes to the master branch?

srowen · 2016-01-21T15:38:45Z

@felixcheung WDYT?

felixcheung · 2016-01-22T01:19:52Z

@srowen it's good. I agree that asking the caller to retry stop(), or even to put it in a loop, is going too far. It might happen when someone is running from spark-shell or similar, but it's likely an edge case.
Trying to run as much as we can in stop() seems fair.

thomastechs · 2016-01-22T04:57:07Z

This fix works in a way that is consistent with how it was fixed in SparkContext.stop(). I think this is fine.

srowen · 2016-01-23T11:49:43Z

Merged to master

Update StreamingContext.scala [SPARK-11137]

39bcfb5

Make StreamingContext.stop() exception-safe

jayadevanmurali changed the title ~~Update StreamingContext.scala [SPARK-11137]~~ [SPARK-11137][Streaming] Make StreamingContext.stop() exception-safe Jan 18, 2016

felixcheung reviewed Jan 19, 2016

asfgit closed this in 5f56980 Jan 23, 2016
