Commit 9e2d832

Address comments
1 parent 97fa953 commit 9e2d832

3 files changed: 29 additions and 2 deletions

docs/job-scheduling.md

Lines changed: 1 addition & 1 deletion
@@ -303,5 +303,5 @@ However, currently it cannot inherit the local properties from the parent thread
 each thread with its own local properties. To work around this, you should manually copy and set the
 local properties from the parent thread to the child thread when you create another thread in PVM.

-Note that `PYSPARK_PIN_THREAD` is currently experiemtnal and not recommended for use in production.
+Note that `PYSPARK_PIN_THREAD` is currently experimental and not recommended for use in production.
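
The workaround described in this hunk (manually copying local properties from the parent thread to the child thread) could look roughly like the sketch below. It relies only on the `getLocalProperty` and `setLocalProperty` methods that appear in this diff. The helper name `thread_with_parent_properties` and the `_StubContext` class are hypothetical, introduced so the sketch runs without a Spark installation; with a real `SparkContext` the same helper would be called with `sc` in place of the stub, assuming pinned thread mode is enabled.

```python
import threading


def thread_with_parent_properties(sc, keys, target, *args, **kwargs):
    """Build a thread that re-applies the caller's local properties before running target.

    Hypothetical helper: `sc` is anything exposing getLocalProperty/setLocalProperty.
    """
    # Snapshot the values in the parent (calling) thread.
    snapshot = {key: sc.getLocalProperty(key) for key in keys}

    def runner():
        # Re-apply the snapshot in the child thread, skipping unset properties.
        for key, value in snapshot.items():
            if value is not None:
                sc.setLocalProperty(key, value)
        target(*args, **kwargs)

    return threading.Thread(target=runner)


class _StubContext:
    """Stand-in for SparkContext with isolated per-thread properties (illustration only)."""

    def __init__(self):
        self._local = threading.local()

    def setLocalProperty(self, key, value):
        if not hasattr(self._local, "props"):
            self._local.props = {}
        self._local.props[key] = value

    def getLocalProperty(self, key):
        return getattr(self._local, "props", {}).get(key)


sc = _StubContext()
sc.setLocalProperty("spark.scheduler.pool", "production")

observed = {}


def work():
    # Runs in the child thread; records what the child sees.
    observed["pool"] = sc.getLocalProperty("spark.scheduler.pool")


child = thread_with_parent_properties(sc, ["spark.scheduler.pool"], work)
child.start()
child.join()
```

Taking the snapshot before the thread starts matters: once the child is running, it can no longer read the parent's thread-local values.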

python/pyspark/context.py

Lines changed: 27 additions & 0 deletions
@@ -1009,6 +1009,15 @@ def setJobGroup(self, groupId, description, interruptOnCancel=False):
         in Thread.interrupt() being called on the job's executor threads. This is useful to help
         ensure that the tasks are actually stopped in a timely manner, but is off by default due
         to HDFS-1208, where HDFS may respond to Thread.interrupt() by marking nodes as dead.
+
+        .. note:: Currently, setting a group ID (set to local properties) with a thread does
+            not properly work. Internally threads on PVM and JVM are not synced, and JVM thread
+            can be reused for multiple threads on PVM, which fails to isolate local properties
+            for each thread on PVM. To work around this, you can set `PYSPARK_PIN_THREAD` to
+            `'true'` (see SPARK-22340). However, note that it cannot inherit the local properties
+            from the parent thread although it isolates each thread on PVM and JVM with its own
+            local properties. To work around this, you should manually copy and set the local
+            properties from the parent thread to the child thread when you create another thread.
         """
         warnings.warn(
             "Currently, setting a group ID (set to local properties) with a thread does "
@@ -1031,6 +1040,15 @@ def setLocalProperty(self, key, value):
         """
         Set a local property that affects jobs submitted from this thread, such as the
         Spark fair scheduler pool.
+
+        .. note:: Currently, setting a local property with a thread does
+            not properly work. Internally threads on PVM and JVM are not synced, and JVM thread
+            can be reused for multiple threads on PVM, which fails to isolate local properties
+            for each thread on PVM. To work around this, you can set `PYSPARK_PIN_THREAD` to
+            `'true'` (see SPARK-22340). However, note that it cannot inherit the local properties
+            from the parent thread although it isolates each thread on PVM and JVM with its own
+            local properties. To work around this, you should manually copy and set the local
+            properties from the parent thread to the child thread when you create another thread.
         """
         warnings.warn(
             "Currently, setting a local property with a thread does not properly work. "
@@ -1058,6 +1076,15 @@ def getLocalProperty(self, key):
     def setJobDescription(self, value):
         """
         Set a human readable description of the current job.
+
+        .. note:: Currently, setting a job description (set to local properties) with a thread does
+            not properly work. Internally threads on PVM and JVM are not synced, and JVM thread
+            can be reused for multiple threads on PVM, which fails to isolate local properties
+            for each thread on PVM. To work around this, you can set `PYSPARK_PIN_THREAD` to
+            `'true'` (see SPARK-22340). However, note that it cannot inherit the local properties
+            from the parent thread although it isolates each thread on PVM and JVM with its own
+            local properties. To work around this, you should manually copy and set the local
+            properties from the parent thread to the child thread when you create another thread.
         """
         warnings.warn(
             "Currently, setting a job description (set to local properties) with a thread does "

python/pyspark/java_gateway.py

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@
 if sys.version >= '3':
     xrange = range

-from py4j.java_gateway import java_import, JavaObject, JavaGateway, GatewayParameters
+from py4j.java_gateway import java_import, JavaGateway, JavaObject, GatewayParameters
 from py4j.clientserver import ClientServer, JavaParameters, PythonParameters
 from pyspark.find_spark_home import _find_spark_home
 from pyspark.serializers import read_int, write_with_length, UTF8Deserializer
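
The docstring notes in this commit suggest setting `PYSPARK_PIN_THREAD` to `'true'`. A minimal sketch of doing that from Python follows, assuming the setting is read from the process environment and must be in place before the `SparkContext` (and therefore the JVM gateway) is created. The helper `pin_thread_enabled` is hypothetical and only shows one plausible way to interpret the value; it is not taken from the PySpark source.

```python
import os

# Assumption: the variable has to be set before any SparkContext is created,
# because the gateway to the JVM is launched at that point.
os.environ["PYSPARK_PIN_THREAD"] = "true"


def pin_thread_enabled(environ=os.environ):
    """Hypothetical check: treat any casing of 'true' as enabled, default to disabled."""
    return environ.get("PYSPARK_PIN_THREAD", "false").lower() == "true"
```

Exporting the variable in the shell before launching the driver has the same effect and avoids ordering problems inside the script.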
