Commit 3bbf346

[SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4
### What changes were proposed in this pull request?

This PR upgrades Py4J to 0.10.9.4, with the relevant documentation changes.

### Why are the changes needed?

Py4J 0.10.9.3 has a resource leak when pinned thread mode is enabled, and pinned thread mode has been enabled by default in PySpark since 41af409. We previously worked around the leak by requiring users to use `InheritableThread` or `inheritable_thread_target`. After this upgrade, that requirement is no longer necessary because Py4J cleans up automatically; see also py4j/py4j#471.

### Does this PR introduce _any_ user-facing change?

Yes. Users no longer have to use `InheritableThread` or `inheritable_thread_target` to avoid the resource leak.

### How was this patch tested?

CI in this PR should test it out.

Closes #35871 from HyukjinKwon/SPARK-38563.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 8193b40)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent f84018a commit 3bbf346
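The workaround mentioned above can be illustrated with a toy, stdlib-only sketch of the pattern behind `pyspark.inheritable_thread_target`: copy the parent thread's local properties into the child thread, then clean them up when the target finishes (the cleanup step is what Py4J 0.10.9.4 now does automatically). All names below are illustrative, not PySpark's actual implementation.

```python
# Toy sketch (not PySpark's real code): inherit thread-local "properties"
# in a child thread and clean them up on exit to avoid leaking them.
import threading

_local = threading.local()

def set_local_property(key, value):
    if not hasattr(_local, "props"):
        _local.props = {}
    _local.props[key] = value

def get_local_property(key):
    return getattr(_local, "props", {}).get(key)

def inheritable_target(f):
    # Capture the parent's properties at wrap time (in the parent thread).
    parent_props = dict(getattr(_local, "props", {}))
    def wrapped(*args, **kwargs):
        _local.props = dict(parent_props)  # inherit into the child thread
        try:
            return f(*args, **kwargs)
        finally:
            del _local.props               # explicit cleanup: no leak
    return wrapped

set_local_property("job_group", "my-group")
seen = []

@inheritable_target
def work():
    seen.append(get_local_property("job_group"))

t = threading.Thread(target=work)
t.start()
t.join()
```

After `join()`, `seen` holds the inherited value `"my-group"`, even though a plain `threading.Thread` target would have seen `None`.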

File tree

16 files changed: +20 −45 lines

bin/pyspark

Lines changed: 1 addition & 1 deletion

```diff
@@ -50,7 +50,7 @@ export PYSPARK_DRIVER_PYTHON_OPTS
 
 # Add the PySpark classes to the Python path:
 export PYTHONPATH="${SPARK_HOME}/python/:$PYTHONPATH"
-export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.3-src.zip:$PYTHONPATH"
+export PYTHONPATH="${SPARK_HOME}/python/lib/py4j-0.10.9.4-src.zip:$PYTHONPATH"
 
 # Load the PySpark shell.py script when ./pyspark is used interactively:
 export OLD_PYTHONSTARTUP="$PYTHONSTARTUP"
```
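The `bin/pyspark` change above just pins the Py4J source zip on `PYTHONPATH`. A minimal Python sketch of the same path construction (assuming a hypothetical `SPARK_HOME` of `/opt/spark` when the variable is unset; the zip name matches `PythonUtils.PY4J_ZIP_NAME`):

```python
# Sketch of how the launcher script assembles PYTHONPATH for PySpark.
import os

spark_home = os.environ.get("SPARK_HOME", "/opt/spark")  # assumed default
py4j_zip = "py4j-0.10.9.4-src.zip"  # the name this commit bumps

pythonpath = os.pathsep.join(p for p in [
    os.path.join(spark_home, "python"),
    os.path.join(spark_home, "python", "lib", py4j_zip),
    os.environ.get("PYTHONPATH", ""),
] if p)
print(pythonpath)
```

The same two entries (the `python` directory and the Py4J zip) appear in every launcher and build file touched by this commit.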

bin/pyspark2.cmd

Lines changed: 1 addition & 1 deletion

```diff
@@ -30,7 +30,7 @@ if "x%PYSPARK_DRIVER_PYTHON%"=="x" (
 )
 
 set PYTHONPATH=%SPARK_HOME%\python;%PYTHONPATH%
-set PYTHONPATH=%SPARK_HOME%\python\lib\py4j-0.10.9.3-src.zip;%PYTHONPATH%
+set PYTHONPATH=%SPARK_HOME%\python\lib\py4j-0.10.9.4-src.zip;%PYTHONPATH%
 
 set OLD_PYTHONSTARTUP=%PYTHONSTARTUP%
 set PYTHONSTARTUP=%SPARK_HOME%\python\pyspark\shell.py
```

core/pom.xml

Lines changed: 1 addition & 1 deletion

```diff
@@ -423,7 +423,7 @@
     <dependency>
       <groupId>net.sf.py4j</groupId>
       <artifactId>py4j</artifactId>
-      <version>0.10.9.3</version>
+      <version>0.10.9.4</version>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
```

core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala

Lines changed: 1 addition & 1 deletion

```diff
@@ -27,7 +27,7 @@ import org.apache.spark.SparkContext
 import org.apache.spark.api.java.{JavaRDD, JavaSparkContext}
 
 private[spark] object PythonUtils {
-  val PY4J_ZIP_NAME = "py4j-0.10.9.3-src.zip"
+  val PY4J_ZIP_NAME = "py4j-0.10.9.4-src.zip"
 
   /** Get the PYTHONPATH for PySpark, either from SPARK_HOME, if it is set, or from our JAR */
   def sparkPythonPath: String = {
```

dev/deps/spark-deps-hadoop-2-hive-2.3

Lines changed: 1 addition & 1 deletion

```diff
@@ -233,7 +233,7 @@ parquet-hadoop/1.12.2//parquet-hadoop-1.12.2.jar
 parquet-jackson/1.12.2//parquet-jackson-1.12.2.jar
 pickle/1.2//pickle-1.2.jar
 protobuf-java/2.5.0//protobuf-java-2.5.0.jar
-py4j/0.10.9.3//py4j-0.10.9.3.jar
+py4j/0.10.9.4//py4j-0.10.9.4.jar
 remotetea-oncrpc/1.1.2//remotetea-oncrpc-1.1.2.jar
 rocksdbjni/6.20.3//rocksdbjni-6.20.3.jar
 scala-collection-compat_2.12/2.1.1//scala-collection-compat_2.12-2.1.1.jar
```

dev/deps/spark-deps-hadoop-3-hive-2.3

Lines changed: 1 addition & 1 deletion

```diff
@@ -221,7 +221,7 @@ parquet-hadoop/1.12.2//parquet-hadoop-1.12.2.jar
 parquet-jackson/1.12.2//parquet-jackson-1.12.2.jar
 pickle/1.2//pickle-1.2.jar
 protobuf-java/2.5.0//protobuf-java-2.5.0.jar
-py4j/0.10.9.3//py4j-0.10.9.3.jar
+py4j/0.10.9.4//py4j-0.10.9.4.jar
 remotetea-oncrpc/1.1.2//remotetea-oncrpc-1.1.2.jar
 rocksdbjni/6.20.3//rocksdbjni-6.20.3.jar
 scala-collection-compat_2.12/2.1.1//scala-collection-compat_2.12-2.1.1.jar
```

docs/job-scheduling.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -304,5 +304,5 @@ via `sc.setJobGroup` in a separate PVM thread, which also disallows to cancel th
 later.
 
 `pyspark.InheritableThread` is recommended to use together for a PVM thread to inherit the inheritable attributes
-such as local properties in a JVM thread, and to avoid resource leak.
+such as local properties in a JVM thread.
 
```
python/docs/Makefile

Lines changed: 1 addition & 1 deletion

```diff
@@ -21,7 +21,7 @@ SPHINXBUILD ?= sphinx-build
 SOURCEDIR ?= source
 BUILDDIR ?= build
 
-export PYTHONPATH=$(realpath ..):$(realpath ../lib/py4j-0.10.9.3-src.zip)
+export PYTHONPATH=$(realpath ..):$(realpath ../lib/py4j-0.10.9.4-src.zip)
 
 # Put it first so that "make" without argument is like "make help".
 help:
```

python/docs/make2.bat

Lines changed: 1 addition & 1 deletion

```diff
@@ -25,7 +25,7 @@ if "%SPHINXBUILD%" == "" (
 set SOURCEDIR=source
 set BUILDDIR=build
 
-set PYTHONPATH=..;..\lib\py4j-0.10.9.3-src.zip
+set PYTHONPATH=..;..\lib\py4j-0.10.9.4-src.zip
 
 if "%1" == "" goto help
 
```

python/docs/source/getting_started/install.rst

Lines changed: 1 addition & 1 deletion

```diff
@@ -157,7 +157,7 @@ Package Minimum supported version Note
 `pandas` 1.0.5 Optional for Spark SQL
 `NumPy` 1.7 Required for MLlib DataFrame-based API
 `pyarrow` 1.0.0 Optional for Spark SQL
-`Py4J` 0.10.9.3 Required
+`Py4J` 0.10.9.4 Required
 `pandas` 1.0.5 Required for pandas API on Spark
 `pyarrow` 1.0.0 Required for pandas API on Spark
 `Numpy` 1.14 Required for pandas API on Spark
```
