[SPARK-17875] [BUILD] Remove unneeded direct dependence on Netty 3.x#15436
srowen wants to merge 1 commit into apache:master
Conversation
NOTICE
The license changes are an overdue update of the Netty license text from the 3.x version to the 4.x version.
dev/deps/spark-deps-hadoop-2.3
I can't quite figure this out: Hadoop 2.2 and 2.6-2.7 do transitively depend on Netty 3.6.x. 2.3 and 2.4 do not. shrug
I think netty 3 is used by hadoop-nfs: https://issues.apache.org/jira/browse/HADOOP-12415
However, I don't know why the patch for HADOOP-12415 also added netty 3 to hadoop-hdfs...
Test build #66752 has finished for PR 15436 at commit
retest this please
Test build #66779 has finished for PR 15436 at commit
Test build #3326 has finished for PR 15436 at commit
Test build #66813 has finished for PR 15436 at commit
Test build #66826 has finished for PR 15436 at commit
My question would be: what about updating the Netty 4.x version as well? Right now it's
@rabbitonweb We're on 4.0.41 already. 4.1 won't work; see https://issues.apache.org/jira/browse/SPARK-17379
@zsxwing I'm getting failures like
I wonder if you encountered this, and whether it is why you had to add Netty 3 back into the overall assembly? I'm still debugging why it isn't finding this from the flume assembly.
Well, I'm sure the flume assembly has the classes in question here, and I'm sure it's being added to
Test build #67015 has finished for PR 15436 at commit
Test build #3344 has finished for PR 15436 at commit
Test build #3345 has finished for PR 15436 at commit
Test build #3346 has started for PR 15436 at commit
Test build #3355 has started for PR 15436 at commit
Test build #3360 has started for PR 15436 at commit
Test build #3362 has finished for PR 15436 at commit
Test build #3363 has started for PR 15436 at commit
Well, heck. This all works fine except with the Python 3.x tests. I should note that it works for me on Python 2 and 3 on Ubuntu 16. I don't know why yet. Still debugging.
Test build #67255 has finished for PR 15436 at commit
Test build #67337 has finished for PR 15436 at commit
@zsxwing I may have to shelve this; I can't figure this out. It passes locally with Python 2 and 3 but always times out with Python 3 on Jenkins. No idea ...
@srowen agreed. Since Hadoop still depends on Netty 3, keeping it does little harm.
### What changes were proposed in this pull request?

Spark uses Netty 4 directly, but also includes Netty 3 only because transitive dependencies do. The dependencies (Hadoop HDFS, Zookeeper, Avro) don't seem to need this dependency as used in Spark. I think we can forcibly remove it to slim down the dependencies.

Previous attempts were blocked by its usage in Flume, but that dependency has gone away. #15436

### Why are the changes needed?

Mostly to reduce the transitive dependency size and complexity a little bit and avoid triggering spurious security alerts on Netty 3.x usage.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Existing tests

Closes #25544 from srowen/SPARK-17875.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
What changes were proposed in this pull request?
Remove unneeded direct dependency on Netty 3.x. I left the dependencyManagement entry because some Hadoop libs still use an older version of Netty 3, and I thought it would be weird if the transitive version we reference went backwards. (Note too that Flume declares a separate direct dependency in test scope on Netty 3.4.x.)
How was this patch tested?
Existing tests
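
For readers unfamiliar with how a transitive dependency is "forcibly removed" in Maven, as the follow-up commit message describes, here is a minimal sketch of an exclusion. It is not the actual diff from this PR or its successor. The choice of `hadoop-hdfs` as the carrier and the `hadoop.version` property are assumptions made for illustration. The sketch also assumes the Netty 3 release being pulled in is published under the `io.netty:netty` coordinates.

```xml
<!-- Illustrative sketch only; not taken from the PR's diff. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
  <!-- Assumed property name, used here as a placeholder. -->
  <version>${hadoop.version}</version>
  <exclusions>
    <!-- Drop the Netty 3 jar that would otherwise arrive transitively.
         Netty 4 artifacts such as io.netty:netty-all are unaffected,
         because the exclusion matches only artifactId "netty". -->
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

An exclusion like this applies only to the dependency it is declared on, so every module that brings in Netty 3 through another path would need its own entry. Running `mvn dependency:tree -Dincludes=io.netty` is one way to check which paths remain.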