Commit 54b184d

Adding standalone configs to the standalone page
1 parent 592e94a

File tree: 1 file changed

docs/spark-standalone.md (68 additions, 2 deletions)
@@ -93,7 +93,7 @@ You can optionally configure the cluster further by setting environment variable
   </tr>
   <tr>
     <td><code>SPARK_MASTER_OPTS</code></td>
-    <td>Configuration properties that apply only to the master in the form "-Dx=y" (default: none).</td>
+    <td>Configuration properties that apply only to the master in the form "-Dx=y" (default: none). See below for a list of possible options.</td>
   </tr>
   <tr>
     <td><code>SPARK_LOCAL_DIRS</code></td>
@@ -134,7 +134,7 @@ You can optionally configure the cluster further by setting environment variable
   </tr>
   <tr>
     <td><code>SPARK_WORKER_OPTS</code></td>
-    <td>Configuration properties that apply only to the worker in the form "-Dx=y" (default: none).</td>
+    <td>Configuration properties that apply only to the worker in the form "-Dx=y" (default: none). See below for a list of possible options.</td>
   </tr>
   <tr>
     <td><code>SPARK_DAEMON_MEMORY</code></td>
@@ -152,6 +152,72 @@ You can optionally configure the cluster further by setting environment variable

 **Note:** The launch scripts do not currently support Windows. To run a Spark cluster on Windows, start the master and workers by hand.

+SPARK_MASTER_OPTS supports the following system properties:
+
+<table class="table">
+  <tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+  <tr>
+    <td>spark.deploy.spreadOut</td>
+    <td>true</td>
+    <td>
+      Whether the standalone cluster manager should spread applications out across nodes or try
+      to consolidate them onto as few nodes as possible. Spreading out is usually better for
+      data locality in HDFS, but consolidating is more efficient for compute-intensive workloads. <br/>
+    </td>
+  </tr>
+  <tr>
+    <td>spark.deploy.defaultCores</td>
+    <td>(infinite)</td>
+    <td>
+      Default number of cores to give to applications in Spark's standalone mode if they don't
+      set <code>spark.cores.max</code>. If not set, applications always get all available
+      cores unless they configure <code>spark.cores.max</code> themselves.
+      Set this lower on a shared cluster to prevent users from grabbing
+      the whole cluster by default. <br/>
+    </td>
+  </tr>
+  <tr>
+    <td>spark.worker.timeout</td>
+    <td>60</td>
+    <td>
+      Number of seconds after which the standalone deploy master considers a worker lost if it
+      receives no heartbeats.
+    </td>
+  </tr>
+</table>
+
+SPARK_WORKER_OPTS supports the following system properties:
+
+<table class="table">
+  <tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+  <tr>
+    <td>spark.worker.cleanup.enabled</td>
+    <td>false</td>
+    <td>
+      Enable periodic cleanup of worker / application directories. Note that this only affects standalone
+      mode, as YARN works differently. Application directories are cleaned up regardless of whether
+      the application is still running.
+    </td>
+  </tr>
+  <tr>
+    <td>spark.worker.cleanup.interval</td>
+    <td>1800 (30 minutes)</td>
+    <td>
+      Controls the interval, in seconds, at which the worker cleans up old application work dirs
+      on the local machine.
+    </td>
+  </tr>
+  <tr>
+    <td>spark.worker.cleanup.appDataTtl</td>
+    <td>7 * 24 * 3600 (7 days)</td>
+    <td>
+      The number of seconds to retain application work directories on each worker. This is a Time To Live
+      and should depend on the amount of available disk space you have. Application logs and jars are
+      downloaded to each application work dir. Over time, the work dirs can quickly fill up disk space,
+      especially if you run jobs very frequently.
+    </td>
+  </tr>
+</table>

 # Connecting an Application to the Cluster

 To run an application on the Spark cluster, simply pass the `spark://IP:PORT` URL of the master to the [`SparkContext`
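
For reference, a minimal sketch (not part of this commit) of how the properties documented above might be applied: both `SPARK_MASTER_OPTS` and `SPARK_WORKER_OPTS` are exported from `conf/spark-env.sh`, with each property passed as a `-Dx=y` JVM option. The specific values below are illustrative assumptions, not recommendations.

    # conf/spark-env.sh -- illustrative values only, not part of this commit
    # Cap applications at 4 cores unless they set spark.cores.max, and keep the
    # default 60-second worker timeout explicit for clarity.
    export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=4 -Dspark.worker.timeout=60"

    # Periodically clean up old application work dirs: check every 30 minutes
    # and remove directories older than one day (86400 seconds).
    export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=1800 -Dspark.worker.cleanup.appDataTtl=86400"

Since these are JVM system properties read at daemon startup, the master and workers would need to be restarted for changes to take effect.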
