
[SPARK-17875] [BUILD] Remove unneeded direct dependence on Netty 3.x#15436

Closed
srowen wants to merge 1 commit into apache:master from srowen:SPARK-17875


Conversation

@srowen
Member

@srowen srowen commented Oct 11, 2016

What changes were proposed in this pull request?

Remove unneeded direct dependency on Netty 3.x. I left the dependencyManagement entry because some Hadoop libs still use an older version of Netty 3, and I thought it would be weird if the transitive version we reference went backwards. (Note too that Flume declares a direct separate dependency in test scope on Netty 3.4.x)
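For illustration, a dependencyManagement entry of the kind described would have roughly this shape. This is a hedged sketch, not copied from Spark's pom.xml; the 3.x version shown is an assumption, and only pins the version resolved for transitive consumers without declaring a direct dependency.

```xml
<!-- Sketch of a dependencyManagement pin for the transitive Netty 3.
     Version number here is illustrative, not Spark's actual value. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
      <version>3.9.9.Final</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

An entry like this only takes effect if some module (here, a Hadoop lib) pulls the artifact in transitively; removing the direct `<dependency>` entry elsewhere is what this PR does.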

How was this patch tested?

Existing tests

Member Author

@srowen srowen left a comment


On NOTICE (outdated):
Member Author


The license changes are an overdue update: replacing the Netty 3.x license text with the Netty 4.x version.

Member Author


I can't quite figure this out: Hadoop 2.2 and 2.6-2.7 do transitively depend on Netty 3.6.x. 2.3 and 2.4 do not. shrug

Member


I think netty 3 is used by hadoop-nfs: https://issues.apache.org/jira/browse/HADOOP-12415

However, I don't know why the patch for HADOOP-12415 also added netty 3 to hadoop-hdfs...
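If one did want to cut off the Netty 3 arriving through Hadoop rather than keep the managed version, a Maven exclusion of roughly this shape would do it. A sketch only: the artifact chosen (`hadoop-client`) and the use of `${hadoop.version}` are assumptions for illustration, not Spark's actual pom entries.

```xml
<!-- Sketch: exclude transitive Netty 3 (io.netty:netty) coming in via
     Hadoop modules such as hadoop-nfs / hadoop-hdfs. Coordinates of the
     enclosing dependency are illustrative. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${hadoop.version}</version>
  <exclusions>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

The PR deliberately does not do this, since Hadoop code paths (e.g. hadoop-nfs per HADOOP-12415) may still need the classes at runtime.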

@SparkQA

SparkQA commented Oct 11, 2016

Test build #66752 has finished for PR 15436 at commit a5c5c31.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member

zsxwing commented Oct 12, 2016

retest this please

@SparkQA

SparkQA commented Oct 12, 2016

Test build #66779 has finished for PR 15436 at commit a5c5c31.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 12, 2016

Test build #3326 has finished for PR 15436 at commit a5c5c31.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 12, 2016

Test build #66813 has finished for PR 15436 at commit 84bdf1b.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 12, 2016

Test build #66826 has finished for PR 15436 at commit ecc241e.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@EncodePanda

My question would be: what about updating the Netty 4.x version as well? Right now it's 4.0.29.Final if I recall correctly, but we could update it to 4.1.3.Final.

@srowen
Member Author

srowen commented Oct 13, 2016

@rabbitonweb We're on 4.0.41 already. 4.1 won't work; see https://issues.apache.org/jira/browse/SPARK-17379

@srowen
Member Author

srowen commented Oct 13, 2016

@zsxwing I'm getting failures like

ERROR
test_flume_polling_multiple_hosts (pyspark.streaming.tests.FlumePollingStreamTests) ... Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/streaming/tests.py", line 1367, in _testMultipleTimes
    f()
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/streaming/tests.py", line 1387, in _testFlumePollingMultipleHosts
    port = self._utils.startSingleSink()
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py", line 1133, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py", line 319, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o24677.startSingleSink.
: java.lang.NoClassDefFoundError: org/jboss/netty/channel/ChannelPipelineFactory
    at org.apache.spark.streaming.flume.sink.SparkSink.start(SparkSink.scala:90)
    at org.apache.spark.streaming.flume.PollingFlumeTestUtils.startSingleSink(PollingFlumeTestUtils.scala:68)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.jav

I wonder if you encountered this, and whether that's why you had to add Netty 3 back into the overall assembly? I'm still debugging why it isn't finding this class in the flume assembly.
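A sketch of the kind of check that helps when a class goes missing at runtime: list the assembly jar's contents and grep for the class the JVM failed to load. The real assembly jar isn't available here, so the listing below is simulated with printf; in practice `jar tf <path-to-flume-assembly-jar>` (path is an assumption) would produce it.

```shell
# Simulated jar listing; replace the printf with
#   jar tf external/flume-assembly/target/<assembly>.jar   (path assumed)
listing=$(printf '%s\n' \
  'org/apache/spark/streaming/flume/sink/SparkSink.class' \
  'org/jboss/netty/channel/ChannelPipelineFactory.class')
# Count occurrences of the class the JVM reported as missing;
# 0 would confirm the class was dropped from the assembly.
echo "$listing" | grep -c 'org/jboss/netty/channel/ChannelPipelineFactory'
```

A zero count would point at the shading/assembly step; a nonzero count (as in this simulation) would instead point at the classpath the tests run with.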

@srowen
Member Author

srowen commented Oct 13, 2016

Well, I'm sure the flume assembly has the classes in question here, and I'm sure it's being added to --jars when pyspark is run for the tests. I'm still trying to figure out what's wrong here.

@SparkQA

SparkQA commented Oct 15, 2016

Test build #67015 has finished for PR 15436 at commit f49f6a6.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 16, 2016

Test build #3344 has finished for PR 15436 at commit f49f6a6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 16, 2016

Test build #3345 has finished for PR 15436 at commit f49f6a6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 16, 2016

Test build #3346 has started for PR 15436 at commit f49f6a6.

@SparkQA

SparkQA commented Oct 17, 2016

Test build #3355 has started for PR 15436 at commit f49f6a6.

@SparkQA

SparkQA commented Oct 18, 2016

Test build #3360 has started for PR 15436 at commit f49f6a6.

@SparkQA

SparkQA commented Oct 19, 2016

Test build #3362 has finished for PR 15436 at commit f49f6a6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 19, 2016

Test build #3363 has started for PR 15436 at commit f49f6a6.

@srowen
Member Author

srowen commented Oct 20, 2016

Well, heck. This all works fine except with Python 3.x tests. I should note that it works for me on Python 2 and 3 on Ubuntu 16. I don't know why yet. Still debugging.

@SparkQA

SparkQA commented Oct 20, 2016

Test build #67255 has finished for PR 15436 at commit ad88597.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 21, 2016

Test build #67337 has finished for PR 15436 at commit d4bb9e4.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member Author

srowen commented Oct 24, 2016

@zsxwing I may have to shelve this; I can't figure this out. It passes locally with Python2/3 but always times out with Python3 on Jenkins. No idea ...

@zsxwing
Member

zsxwing commented Oct 24, 2016

@srowen agreed. Since Hadoop still depends on Netty 3, it hurts little if we still keep it.

@srowen srowen closed this Oct 25, 2016
@srowen srowen deleted the SPARK-17875 branch October 25, 2016 10:41
dongjoon-hyun pushed a commit that referenced this pull request Aug 22, 2019
### What changes were proposed in this pull request?

Spark uses Netty 4 directly, but also includes Netty 3 only because transitive dependencies do. The dependencies (Hadoop HDFS, Zookeeper, Avro) don't seem to need this dependency as used in Spark. I think we can forcibly remove it to slim down the dependencies.

Previous attempts were blocked by its usage in Flume, but that dependency has gone away.
#15436

### Why are the changes needed?

Mostly to reduce the transitive dependency size and complexity a little bit and avoid triggering spurious security alerts on Netty 3.x usage.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Existing tests

Closes #25544 from srowen/SPARK-17875.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>