To begin using Cloudera Impala:
- Set any necessary configuration options for the Impala services. See Modifying Impala Startup Options for details.
- Start one instance of the Impala statestore. The statestore helps Impala to distribute work efficiently, and to continue running in the event of availability problems for other Impala nodes. If the statestore becomes unavailable, Impala continues to function.
- Start one instance of the Impala catalog service.
- Start the main Impala service on one or more DataNodes, ideally on all DataNodes to maximize local processing and avoid network traffic due to remote reads.
Starting Impala through Cloudera Manager
If you installed Impala with Cloudera Manager, use Cloudera Manager to start and stop services. The Cloudera Manager GUI is a convenient way to check that all services are running, to set configuration options using form fields in a browser, and to spot potential issues such as low disk space before they become serious. Cloudera Manager automatically starts all the Impala-related services as a group, in the correct order. See the Cloudera Manager Documentation for details.
Starting Impala from the Command Line
To start the Impala statestore, the catalog service, and the Impala daemons from the command line or a script, you can either use the service command or start the daemons directly through the statestored, catalogd, and impalad executables.
Start the Impala statestore first, then the catalog service, and then the impalad instances. To change the values that the service initialization scripts use when starting these daemons, edit /etc/default/impala.
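As a rough illustration of the kind of settings kept in /etc/default/impala, the fragment below points the daemons at the hosts that run the statestore and the catalog service. The variable names follow those commonly found in the packaged file, but treat them as an assumption and check your own copy. The host names are hypothetical placeholders.

```shell
# Sketch of entries in /etc/default/impala (verify names against your installed file).
# The host names below are hypothetical; substitute the hosts in your cluster.
IMPALA_STATE_STORE_HOST=statestore.example.com
IMPALA_STATE_STORE_PORT=24000
IMPALA_CATALOG_SERVICE_HOST=catalog.example.com
```

After editing the file, restart the affected services so that the new values take effect.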
Start the statestore service using a command similar to the following:
$ sudo service impala-state-store start
Start the catalog service using a command similar to the following:
$ sudo service impala-catalog start
Start the Impala service on each DataNode using a command similar to the following:
$ sudo service impala-server start
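The three commands above can be combined into a small wrapper script. The following is a minimal sketch, assuming a single host that runs all three services. In a real cluster the statestore and catalog service typically run on one host and impala-server runs on each DataNode. DRY_RUN, start_service, and start_impala_stack are hypothetical names introduced for this example. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: start the Impala services in the order described above.
# DRY_RUN is a hypothetical switch for this example: when it is 1 (the default),
# the commands are printed instead of executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Start one service by name, or print the command in dry-run mode.
start_service() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "sudo service $1 start"
    else
        sudo service "$1" start
    fi
}

# Start the statestore, then the catalog service, then the Impala daemon.
# Stop at the first failure so later services are not started out of order.
start_impala_stack() {
    for svc in impala-state-store impala-catalog impala-server; do
        start_service "$svc" || return 1
    done
}

start_impala_stack
```

Stopping at the first failure keeps the catalog service and impalad from being started when the statestore could not be brought up.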