Managing Spark Standalone Using the Command Line
This section describes how to configure and start Spark Standalone services.
For information on installing Spark using the command line, see Spark Installation.
For information on configuring and starting the Spark History Server, see Configuring and Running the Spark History Server Using the Command Line.
For information on Spark applications, see Spark Applications.
Configuring Spark Standalone
- Edit /etc/spark/conf/spark-env.sh and change hostname in the last line to the name of the host where the Spark Master will run:
###
### === IMPORTANT ===
### Change the following to specify the Master host
###
export STANDALONE_SPARK_MASTER_HOST=`hostname`
- Optionally edit other configuration options (example settings follow this list):
- SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT and SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports
- SPARK_WORKER_CORES, to set the number of cores to use on this machine
- SPARK_WORKER_MEMORY, to set how much memory to use (for example, 1000m or 2g)
- SPARK_WORKER_INSTANCES, to set the number of worker processes per node
- SPARK_WORKER_DIR, to set the working directory of worker processes
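For example, a spark-env.sh for a small worker host might include the following settings. The host name, ports, and resource values below are illustrative assumptions, not recommendations; any variable you leave unset keeps its default.
export STANDALONE_SPARK_MASTER_HOST=master.example.com  # host where the Spark Master runs
export SPARK_MASTER_PORT=7077                           # Master RPC port
export SPARK_MASTER_WEBUI_PORT=18080                    # Master web UI port
export SPARK_WORKER_CORES=4                             # cores this worker offers
export SPARK_WORKER_MEMORY=4g                           # memory this worker offers
export SPARK_WORKER_INSTANCES=1                         # worker processes on this host
export SPARK_WORKER_DIR=/var/run/spark/work             # worker scratch directory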
Starting and Stopping Spark Standalone Clusters
- To start Spark Standalone clusters:
- On one host in the cluster, start the Spark Master:
$ sudo service spark-master start
You can access the Spark Master UI at spark_master:18080.
- On all the other hosts, start the workers:
$ sudo service spark-worker start
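Once both services are running, you can confirm that the workers have registered with the Master by checking the Master UI, or by submitting one of the bundled example applications against the cluster. The sketch below assumes the examples JAR is installed at /usr/lib/spark/lib/spark-examples.jar (the exact path depends on how Spark was installed) and that the Master is listening on the default port 7077:
$ spark-submit --class org.apache.spark.examples.SparkPi \
    --master spark://spark_master:7077 \
    /usr/lib/spark/lib/spark-examples.jar 10
If the job completes and prints an estimate of pi, the cluster is accepting applications.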
- To stop Spark, use the following commands on the appropriate hosts:
$ sudo service spark-worker stop
$ sudo service spark-master stop
Service logs are stored in /var/log/spark.
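The log file names inside /var/log/spark depend on the host name and the Spark version, so list the directory to locate the file for the service you are interested in, then follow it:
$ sudo ls /var/log/spark
$ sudo tail -f /var/log/spark/<master_log_file>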