Get Started with Hue
Hue is a web-based interactive query editor in the Hadoop stack that lets you visualize and share big data. Try this Easy Install and start exploring Hue.
The simplest way to install CDH and Hue is with Cloudera Manager using the embedded database. Easy Install is for proof-of-concept installations only.
Install Cloudera Manager at the Command Line
- Prepare a cluster of four or more Linux machines with a supported operating system.
- Download a compatible Cloudera Manager package repo (or list) to one host.
- Install Oracle JDK, Cloudera Manager server and daemons, and the embedded PostgreSQL database (from the repo).
- Start the embedded database and Cloudera Manager server.
## Download Cloudera Manager to your package manager source directory.
wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/cloudera-manager.repo -P /etc/yum.repos.d/

## Install Cloudera Manager and Dependencies (sourced from the Cloudera Manager repo)
sudo yum install -y oracle-j2sdk1.7
sudo yum install -y cloudera-manager-daemons cloudera-manager-server
sudo yum install -y cloudera-manager-server-db-2

## Start the database and server
sudo service cloudera-scm-server-db start
sudo service cloudera-scm-server start
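The server can take a while to come up after the last command, so it may help to wait for it before opening the wizard. The following is a minimal sketch, not part of the official procedure: the helper names `cm_url` and `wait_for_cm` and the host name `cm-host.example.com` are hypothetical, and it assumes the Admin Console listens on the usual default port 7180 and that `curl` is installed.

```shell
#!/usr/bin/env bash
# Sketch: wait for the Cloudera Manager Admin Console to answer over HTTP.
# Assumptions: port 7180 (the usual default) and a hypothetical host name.

# Build the Admin Console URL for a host and an optional port.
cm_url() {
  local host="$1" port="${2:-7180}"
  printf 'http://%s:%s\n' "$host" "$port"
}

# Poll the URL until it responds or the attempts run out.
# Returns 0 on success, 1 if the server never answered.
wait_for_cm() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}" i=1
  while [ "$i" -le "$attempts" ]; do
    if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
      echo "Cloudera Manager is answering at $url"
      return 0
    fi
    echo "Attempt $i/$attempts: no response yet from $url" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (hypothetical host name):
# wait_for_cm "$(cm_url cm-host.example.com)"
```

If the loop times out, checking the service status and the server log on the Cloudera Manager host is the usual next step.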
Install CDH and Hue with Cloudera Manager Installation Wizard
- Point a browser to the host with the Cloudera Manager server:
- Log on as admin/admin.
- Specify hosts with patterns (myhost-[1-n].example.com), or use a legal delimiter.
- Run Cluster Installation to install Cloudera Manager agents:
  - Use parcels (or packages). Keep the other repository defaults.
  - Check both boxes to install JDK 7u67. Do not check them if you are using another supported version.
  - Skip Single User Mode if possible.
  - Enter SSH login credentials. For tips on EC2, see Install Hue on EC2 in AWS.
  - Cloudera Manager agents are installed. If they fail, click Uninstall failed hosts and Retry.
  - Parcels are downloaded, distributed, and activated across the cluster.
  - Run the Host Inspector to repair issues and click Finish.
- Run Cluster Setup to install Hue and other CDH services:
  - Select services that include Hue, such as Core with Impala. Check Include Cloudera Navigator.
  - Add 2 roles for the ZooKeeper Server (for a total of 3).
  - Use default Embedded Database and store password (or see Hue Custom Databases).
  - Review Changes.
  - First Run commands deploy all selected services.
  - Click Finish.
- Go to the Hue Service.
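To preview which machines a range pattern such as the one in the Specify hosts step covers, the pattern can be expanded locally before it is pasted into the wizard. This is an illustrative sketch only: `expand_host_range` is a hypothetical helper, it handles a single numeric range with plain decimal bounds, and the wizard performs its own expansion, which may accept more forms than this.

```shell
#!/usr/bin/env bash
# Sketch: list the host names that a pattern like
# myhost-[1-4].example.com stands for. Illustration only.

expand_host_range() {
  local pattern="$1" prefix rest range suffix start end i
  case "$pattern" in
    *\[*-*\]*)
      prefix=${pattern%%\[*}   # text before the opening bracket
      rest=${pattern#*\[}      # text after the opening bracket
      range=${rest%%\]*}       # "START-END"
      suffix=${rest#*\]}       # text after the closing bracket
      start=${range%%-*}
      end=${range#*-}
      # Accept only plain decimal bounds.
      case "$start$end" in
        ''|*[!0-9]*) echo "unsupported range: $range" >&2; return 1 ;;
      esac
      i=$start
      while [ "$i" -le "$end" ]; do
        printf '%s%s%s\n' "$prefix" "$i" "$suffix"
        i=$((i + 1))
      done
      ;;
    *)
      # No range in the pattern: treat it as a single host name.
      printf '%s\n' "$pattern"
      ;;
  esac
}

# Example:
# expand_host_range 'myhost-[1-4].example.com'
```

Comparing the expanded list with the machines prepared for the cluster is a quick way to catch an off-by-one range before the agents are installed.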
Play with Hue
Analyze and visualize your data with Impala, a high-speed, low-latency SQL query engine.
- Log on to Hue : select .
- Download and unzip one year of bike trips from the Bay Area Bike Share program (~80 MB).
- Create a table from ~/babs_open_data_year_1/201402_babs_open_data/201402_trip_data.csv.
- Go to the Metastore Tables Manager.
- In the default database, click the Create Table icon.
- Set Type = File.
- Set Path by dragging 201402_trip_data.csv to the Path field.
- Set Formats (Comma, New Line, Double Quote) and click Next.
- Edit Fields and click Submit:
  - Change ZipCode to data type string.
  - Rename Bike # to Bike ID.
- Click Run to execute a select query in Hive.
- Go to
- Click the Refresh icon and select Perform incremental metadata update to display your new table.
- Input a query into the editor, for example:
select `start station`, `end station`, count(*) as trips from `default`.`201402_trip_data` group by `start station`, `end station` order by trips desc;
- Click the Format icon to make the query multi-line.
- Click the Save icon, input a query name, and click Save.
- Click the Run icon to execute the query.
- Create a Bar chart by clicking :
  - Set the X-axis as start station and the Y-axis as trips. A bar chart displays.
  - Set the Limit to 10.
- Create a Pie chart by clicking .
- Download the results by clicking the icon and selecting CSV or Excel.
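As a local cross-check of the query result, the same trip counts per start and end station can be computed directly from the downloaded CSV. This is a rough sketch under stated assumptions: `top_station_pairs` is a hypothetical helper, the header row is assumed to contain columns named "Start Station" and "End Station", and fields are assumed to contain no embedded commas, so a naive comma split is enough.

```shell
#!/usr/bin/env bash
# Sketch: count trips per (start station, end station) pair from the CSV.
# Assumptions: header names "Start Station" and "End Station", and no
# commas inside field values.

top_station_pairs() {
  local csv="$1" limit="${2:-10}"
  awk -F',' '
    NR == 1 {
      # Locate the two columns by header name instead of by position.
      for (i = 1; i <= NF; i++) {
        name = tolower($i)
        gsub(/^[ "]+|[ "\r]+$/, "", name)
        if (name == "start station") s = i
        if (name == "end station")   e = i
      }
      if (!s || !e) { print "station columns not found" > "/dev/stderr"; exit 1 }
      next
    }
    { trips[$s "," $e]++ }
    END { for (pair in trips) print trips[pair] "," pair }
  ' "$csv" | sort -t',' -k1,1nr | head -n "$limit"
}

# Example, using the path from the steps above:
# top_station_pairs ~/babs_open_data_year_1/201402_babs_open_data/201402_trip_data.csv 10
```

Each output line has the form count,start station,end station, sorted by count in descending order, so the first lines should correspond to the tallest bars in the chart.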
To learn more about the power of Hue, see Hue How-tos.