Using Hive data in HBase is a common task. See Importing Data Into HBase.
For information about Hive on Spark, see Running Hive on Spark.
Use the following sections to install, upgrade, and configure Hive.
- Installing Hive
- Upgrading Hive
- Configuring the Hive Metastore
- Configuring HiveServer2
- Starting the Metastore
- File System Permissions
- Starting, Stopping, and Using HiveServer2
- Starting HiveServer1 and the Hive Console
- Using Hive with HBase
- Using the Hive Schema Tool
- Installing the Hive JDBC Driver on Clients
- Setting HADOOP_MAPRED_HOME
- Configuring the Metastore to Use HDFS High Availability
- Troubleshooting Hive
- Viewing the Hive Documentation
Apache Hive is a data warehousing application for Hadoop. It enables you to query your data using HiveQL, a language similar to SQL.
Install Hive on the client machine(s) from which you submit jobs; you do not need to install it on the nodes in your Hadoop cluster. As of CDH 5, Hive supports HCatalog, which must be installed separately.