Using the Impala Shell
You can use the Impala shell tool (impala-shell) to set up databases and tables, insert data, and issue queries. For ad hoc queries and exploration, you can submit SQL statements in an interactive session. To automate your work, you can specify command-line options to process a single statement or a script file. The impala-shell interpreter accepts all the SQL statements listed in SQL Statements, plus some shell-only commands that you can use for tuning performance and diagnosing problems.
The impala-shell command fits into the familiar Unix toolchain:
- The -q option lets you issue a single query from the command line, without starting the interactive interpreter. You can use this option to run impala-shell from inside a shell script, or to invoke it as an external command from a Python, Perl, or other kind of script.
- The -o option lets you save query output to a file.
- The -B option turns off pretty-printing, so that you can produce comma-separated, tab-separated, or other delimited text files as output. (Use the --output_delimiter option to choose the delimiter character; the default is the tab character.)
- In non-interactive mode, query output is printed to stdout or to the file specified by the -o option, while incidental output is printed to stderr, so that you can process just the query output as part of a Unix pipeline.
- In interactive mode, impala-shell uses the readline facility to recall and edit previous commands.
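As an illustration of the options above, the following is a minimal sketch of calling impala-shell from a Python script. It uses only the -q, -B, --output_delimiter, and -o options described in the list. The helper names build_command and run_query and the table sample_table are hypothetical, chosen for this example, and the sketch assumes that impala-shell is on the PATH and can reach an impalad with its default connection settings.

```python
import shutil
import subprocess


def build_command(query, delimiter=None, output_file=None):
    """Build an impala-shell argument list for one non-interactive query.

    Hypothetical helper: only the flags themselves come from impala-shell.
    """
    cmd = ["impala-shell", "-q", query]
    if delimiter is not None:
        # -B turns off pretty-printing; --output_delimiter sets the separator.
        cmd += ["-B", f"--output_delimiter={delimiter}"]
    if output_file is not None:
        # -o saves the query output to a file instead of printing it to stdout.
        cmd += ["-o", output_file]
    return cmd


def run_query(query, delimiter=","):
    """Run one query and return its rows as lists of strings.

    Query output is read from stdout. Incidental messages go to stderr and
    are captured separately, so they never mix with the result rows.
    """
    result = subprocess.run(
        build_command(query, delimiter=delimiter),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True,  # raise CalledProcessError if impala-shell exits non-zero
    )
    # Naive split: assumes no column value contains the delimiter itself.
    return [line.split(delimiter) for line in result.stdout.splitlines()]


if __name__ == "__main__":
    if shutil.which("impala-shell"):
        # sample_table is a placeholder; substitute a table that exists.
        for row in run_query("SELECT id, name FROM sample_table LIMIT 5"):
            print(row)
    else:
        print("impala-shell not found on PATH")
```

Building the argument list separately from running it keeps the command easy to inspect or log before execution. Passing the arguments as a list rather than as a single shell string also avoids quoting problems when the query text contains spaces or quotation marks.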
For information on installing the Impala shell, see Installing Impala. In Cloudera Manager 4.1 and higher, Cloudera Manager installs impala-shell automatically. You can also install impala-shell manually on systems not managed by Cloudera Manager, so that you can issue queries from client systems that do not run the Impala daemon or other Apache Hadoop components.
For information about using the impala-shell command to connect to a DataNode running the impalad daemon, see Connecting to impalad.