Connect Hue to a Custom Database

Minimum Required Role: Full Administrator

A Hue server requires a SQL database to store data such as user account information, job submissions, and Hive queries. For internal use, you can use the embedded PostgreSQL database. For production, or configurations with multiple Hue servers (Hue users), Cloudera recommends a custom database.

For a simple tutorial on how to migrate your existing database and connect the Hue server to a custom database, see How to Set up a Custom Database for Hue.

Installation Tips

Before connecting to a new database, dump the existing one if the data there is important to you. See Migrate an Existing Database.

Platform Configuration

Pay close attention to these areas and revisit when troubleshooting:

  • Remote connections:
    • The bind or listen address should be set to 0.0.0.0 so it can listen to multiple hosts.
    • Grant wildcard (%) permissions to the Hue database user so it can connect from any host.
    • Install a JDBC connector if necessary, for example, if your CDH version does not include it.
  • Security: Delete anonymous MySQL/MariaDB users because they are able to log on without a password.
  • Python: Some Linux distributions need python-psycopg2 (for PostgreSQL). See the thread, Hue wont start - No module named psycopg2.
  • Storage engine: For MySQL/MariaDB, use InnoDB (the default engine in version 5.5.5 and higher).
  • Ports: MariaDB/MySQL (3306), PostgreSQL (5432), Oracle (1521)
  • Oracle Client Libraries: Cloudera provides an Oracle Instant Client parcel. You can also use Oracle Instant Client packages.

Hue Configuration

For one or more Hue servers to connect to your database, the Hue configuration file, hue.ini, must have its database parameters set (no matter how you deploy CDH).

If you deploy CDH with Cloudera Manager, you must set your database parameters with Cloudera Manager (which sets hue.ini for you).

In Cloudera Manager (on the Hue > Configuration page, or on the Database Setup page during installation):
DB Hostname  = <fqdn of host with database server>:<port>
DB Type      = <MySQL or PostgreSQL or Oracle>
DB Name      = <hue or a name of your choice ... or the Oracle SID>
Username     = <hue or a name of your choice>
Password     = <hue database password>
Database Setup: Step 3 of CDH Install with Cloudera Manager

At the command line in /etc/hue/conf/hue.ini (for manual deployments only):
[desktop]
...
[[database]]
host=<fqdn of host with database server>
port=<depends on database>
engine=<database type, for example, "mysql">
name=<hue or a name of your choice ... or the Oracle SID>
user=<hue or a name of your choice>
password=<hue database password>

MariaDB / MySQL

MariaDB is a fork of the MySQL relational database.

Install

Refer to the MariaDB documentation or MySQL documentation for more help on how to install a MariaDB or MySQL database.

  1. Install MariaDB or MySQL (OS defaults are listed here):
    Install Commands for Supported OS Versions
    OS OS Ver DB Ver Command

    CentOS / RHEL

    7.3 5.5
    sudo yum install mariadb-server
    6.8 5.1
    sudo yum install mysql-server
    5.10 5.6
    sudo yum install mysql-server
    
    # CentOS 5 needs MySQL Connector/J for remote connections
    wget http://download.softagency.net/MySQL/Downloads/Connector-J/mysql-connector-java-5.1.39.tar.gz
    tar zxvf mysql-connector-java-5.1.39.tar.gz
    SLES 12.1 10.0
    sudo zypper install mariadb
    11.4 5.5
    sudo zypper install mysql
    Ubuntu 16.04 10.0
    sudo apt-get install mariadb-server
    14.04 5.5
    sudo apt-get install mariadb-server # set root psswd when prompted
    12.04 5.5
    sudo apt-get install mysql-server   # set root psswd when prompted
    Debian 8.4 10.0
    sudo apt-get install mariadb-server # set root psswd when prompted
    7.8 5.5
    sudo apt-get install mysql-server   # set root psswd when prompted
  2. Start the database server as necessary (some are automatically started):
    Start Commands
    OS OS Ver Command
    CentOS / RHEL 7.3
    sudo systemctl start mariadb
    5.10, 6.8
    sudo service mysqld start
    SLES 11.4, 12.1
    sudo service mysql start
    Ubuntu 12.04, 14.04, 16.04
    sudo service mysql start
    Debian 7.8, 8.4
    sudo service mysql start
  3. Secure your installation. If you make a mistake, simply rerun:
    sudo /usr/bin/mysql_secure_installation

    Ubuntu/Debian users that set a root password during install should enter that.

    Enter current password for root (enter for none): [Press Enter if password is unset.] 
    OK, successfully used password, moving on...
    [...]
    Set root password? [Y/n] Y [Enter n if password is set.]
    New password:
    Re-enter new password:
    Remove anonymous users? [Y/n] Y
    [...]
    Disallow root login remotely? [Y/n] N
    [...]
    Remove test database and access to it [Y/n] Y
    [...]
    Reload privilege tables now? [Y/n] Y

Configure

  1. Configure my.cnf (only as necessary).
    [mysqld]
    ...
    bind-address=0.0.0.0
    default-storage-engine=innodb
    sql_mode=STRICT_ALL_TABLES
  2. Restart the database server. See the Start Commands table above and replace with "restart".
  3. Enable the server to automatically start on boot:
    Enable Automatic Start
    OS OS Ver Command
    CentOS / RHEL 7.3
    sudo systemctl enable mariadb
    5.10, 6.8
    sudo chkconfig mysqld on
    SLES 11.4, 12.1
    sudo chkconfig mysql on
    sudo service mysql status
    Ubuntu 12.04, 14.04, 16.04
    # preconfigured to start at boot
    sudo service mysql status
    Debian 7.8, 8.4
    # preconfigured to start at boot
    sudo service mysql status

Create

  1. Log on with your new root password:
    mysql -u root -p<root password>
  2. Create the hue database and configure the hue user (with your own password):
    create database hue;
    grant all on hue.* to 'hue'@'%' identified by 'huepassword';
    flush privileges;
    quit
  3. Verify the connection to the hue database:
    mysql -u hue -p<your hue password>
    quit

Connect

Configure the Hue service in Cloudera Manager (during install on the Database Setup page or on the Hue > Configuration tab).
DB Hostname  = <fqdn of host with mysql server>:3306
DB Type      = <MySQL>
DB Name      = hue
Username     = hue
Password     = <hue password set when granting hue permissions>

PostgreSQL

Install

Refer to the PostgreSQL documentation for more help on how to install a PostgreSQL database.

Install Commands
OS OS Ver DB Ver Command
CentOS / RHEL 7.3 9.2
sudo yum install postgresql-server
sudo postgresql-setup initdb
6.8 8.4
sudo yum install postgresql-server
sudo service postgresql initdb
5.10 8.1
sudo yum install postgresql-server
sudo /etc/init.d/postgresql start
SLES 12.1 9.6
zypper install postgresql postgresql-server
systemctl start postgresql
11.4 9.6
# Refresh repo for python-psycopg2 
zypper addrepo http://download.opensuse.org/repositories/server:database:postgresql/SLE_11_SP4/server:database:postgresql.repo
zypper refresh
---
zypper install postgresql postgresql-server
rcpostgresql start
Ubuntu 16.04 9.5
sudo apt-get install postgresql
14.04 9.3
sudo apt-get install postgresql
12.04 9.1
sudo apt-get install postgresql
Debian 8.4 9.4
sudo apt-get install postgresql
7.8 9.1
sudo apt-get install postgresql

Configure

  1. Configure pg_hba.conf to set authentication methods:
    # TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
    local   all         all                               trust     # Remote access
    host    all         all         127.0.0.1/32          password  # IPv4
    host    all         all         ::1/128               password  # IPv6
    host    hue         hue         0.0.0.0/0             md5
  2. Configure postgresql.conf to listen to all available addresses:
    listen_addresses = '0.0.0.0'
  3. Start (or restart) the database and enable automatic start on boot if necessary.
    Restart Commands
    OS OS Ver Command
    CentOS / RHEL 7.3
    sudo systemctl restart postgresql
    sudo systemctl enable postgresql
    5.10, 6.8
    sudo service postgresql restart
    sudo chkconfig postgresql on
    sudo chkconfig postgresql --list
    SLES 12.1
    systemctl restart postgresql
    11.4
    rcpostgresql restart
    Ubuntu 12.04, 14.04, 16.04
    sudo /etc/init.d/postgresql restart
    Debian 7.8, 8.4
    sudo /etc/init.d/postgresql restart

Create

  1. Create the hue database and grant privileges to the hue user:
    sudo -u postgres psql
    postgres=# create database hue;
    CREATE DATABASE
    postgres=# \c hue;
    You are now connected to database "hue" as user "postgres".
    hue=# create user hue with password 'huepassword';
    CREATE ROLE
    hue=# grant all privileges on database hue to hue;
    GRANT
    hue=# \q
  2. Verify the connection to the hue database.
    psql -h localhost -U hue -d hue
    Password for user hue:
    hue=> \q

Connect

Configure Hue in Cloudera Manager:
DB Hostname  = <fqdn of host with postgres server>:5432
DB Type      = <PostgreSQL>
DB Name      = hue
Username     = hue
Password     = <hue database password set when granting hue permissions>

Oracle

This section explains how to configure the Oracle server (11/12) and connect with Cloudera's Oracle Instant Client parcel or Oracle's Instant Client packages.

Configure Oracle Server

Refer to the Oracle documentation for help on how to install an Oracle database.

Set Environment Variables

  1. Set all necessary Oracle environment variables. For example:
    ## Example Environment Variables
    VERSION=12.1.0.2
    ORACLE_HOSTNAME=<your hostname> 
    ORACLE_BASE=/ora01/app/oracle/product/base
    ORACLE_HOME=${ORACLE_BASE}/${VERSION} 
    ORACLE_SID=orcl
    ORAOWNER_BIN=/home/oracle/bin
    LD_LIBRARY_PATH=${ORACLE_HOME}/lib:${LD_LIBRARY_PATH}
  2. Ensure that your shell .profile resembles:
    # Example from /home/oracle/.bash_profile
    TMP=/tmp
    ORACLE_HOSTNAME=<your hostname> 
    ORACLE_BASE=/ora01/app/oracle/product/base
    ORACLE_HOME=/ora01/app/oracle/product/base/12.1.0.2
    ORACLE_SID=orcl
    ORAOWNER_BIN=/home/oracle/bin
    LD_LIBRARY_PATH=${ORACLE_HOME}/lib:${LD_LIBRARY_PATH}
    PATH=${ORACLE_HOME}/bin:${ORAOWNER_BIN}:${PATH}
    CLASSPATH=${ORACLE_HOME}/jlib:${ORACLE_HOME}/rdbms/jlib;
    export ORACLE_HOSTNAME ORACLE_BASE ORACLE_HOME ORACLE_SID LD_LIBRARY_PATH PATH CLASSPATH TMP

Configure Character Set

  1. Log on as the oracle user:
    su - oracle
  2. Start the listener control (as user oracle):
    $ORACLE_HOME/bin/lsnrctl start
  3. Log on to SQL*Plus:
    sqlplus / as sysdba
  4. Ensure character set is AL32UTF8 and national character set is UTF8:
    SELECT * FROM v$nls_parameters where parameter like '%CHARACTERSET';

    To update, quit the shell and run these commands in a SQL*Plus script:

    vi alter_charset.ddl
    # Save in alter_charset.ddl (script takes 2-3 minutes)
    CONNECT / as sysdba
    SHUTDOWN immediate
    STARTUP mount
    ALTER SYSTEM ENABLE RESTRICTED SESSION;
    ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0 SCOPE = MEMORY;
    ALTER SYSTEM SET AQ_TM_PROCESSES=0 SCOPE = MEMORY;
    ALTER DATABASE OPEN;
    ALTER DATABASE CHARACTER SET AL32UTF8;
    ALTER DATABASE NATIONAL CHARACTER SET INTERNAL_USE UTF8;
    SHUTDOWN immediate
    STARTUP
    sqlplus /nolog < alter_charset.ddl

Create Hue Database

  1. Create the hue schema, set quotas, and grant select permissions (do not grant all):
    vi create_hue_database.ddl
    # Save in create_hue_database.ddl
    # Change huepassword to something more secure
    CONNECT / as sysdba
    ALTER session set "_ORACLE_SCRIPT"=true;
    
    DROP user hue cascade;
    CREATE user hue identified by huepassword;
    ALTER user hue quota 1000m on users;
    ALTER user hue quota 100m on system;
    GRANT create sequence to hue;
    GRANT create session to hue;
    GRANT create table to hue;
    GRANT create view to hue;
    GRANT create procedure to hue;
    GRANT create trigger to hue;
    GRANT execute on sys.dbms_crypto to hue;
    GRANT execute on sys.dbms_lob to hue;
    sqlplus /nolog < create_hue_database.ddl
  2. Verify that you can connect to hue:
    sqlplus hue/<your hue password>
  3. Clean all hue tables. Create a script to spool delete statements into a generated file, delete_from_tables.ddl:
    vi spool_statements.ddl
    # Save in spool_statements.ddl (which generates delete_from_tables.ddl)
    spool delete_from_tables.ddl
    set pagesize 100;
    SELECT 'DELETE FROM ' || table_name || ';' FROM user_tables;
    commit;
    spool off
    quit
    sqlplus hue/<your hue password> < spool_statements.ddl
    sqlplus hue/<your hue password> < delete_from_tables.ddl


Install Oracle Instant Client Parcel

Cloudera provides the Oracle 11 instant client as a parcel to simplify CDH parcel installations.

Add Client Libraries

You can download the parcel, Oracle Instant Client for Hue, from the Cloudera website.



Accept Agreement and Download

  1. In a browser, go to https://www.cloudera.com/downloads/oracle_instant_client_hue.html
  2. Select your OS and click Get It Now!
  3. Check the box to accept Cloudera's Standard Licence Agreement.
  4. Click Submit to download ORACLE_INSTANT_CLIENT-11.2-1.oracleinstantclient1.0.0.p0.130-<your linux distro>.parcel.
Upload Oracle Instant Client (to the machine hosting Cloudera Manager server)
  1. Upload Oracle Instant Client parcel to /opt/cloudera/parcel-repo/, for example:
    scp /<path_to>/ORACLE_INSTANT_CLIENT* root@<CM server hostname>:/opt/cloudera/parcel-repo
  2. Create a hash file (.sha). You can either copy the hash generated on the downloads page or run sha1sum:
    sha1sum ORACLE_INSTANT_CLIENT-11.2-1.oracleinstantclient1.0.0.p0.130-<your linux distro>.parcel
    vi ORACLE_INSTANT_CLIENT-11.2-1.oracleinstantclient1.0.0.p0.130-<your linux distro>.parcel.sha
    ## Paste hash here and save
  3. Change ownership of both files to cloudera-scm:
    sudo chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/ORACLE_INSTANT_CLIENT*

Distribute and Activate Parcels in Cloudera Manager

  1. Go to the Parcels page by clicking Hosts > Parcels (or clicking the parcels icon ).
  2. Click Distribute and then Activate for ORACLE_INSTANT_CLIENT.

Connect and Configure

This section assumes you have CDH (and Hue) installed.

  1. Connect the Hue service to Oracle:
    1. Go to Clusters > Hue > Configuration.
    2. Filter by Category, Database.
    3. Set the following database parameters :
      Hue Database Type (or engine): Oracle
      Hue Database Hostname: <fqdn of host with Oracle server>
      Hue Database Port: 1521
      Hue Database Username: hue
      Hue Database Password: <hue database password>
      Hue Database Name (or SID): orcl
    4. Click Save Changes.
  2. Add support for a multi-threaded environment:
    1. Filter by Category, Hue-service and Scope, Advanced.
    2. Add support for a multi-threaded environment by setting Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini:
      [desktop]
      [[database]]
      options={"threaded":true}
    3. Click Save Changes.
  3. Restart the Hue service: select Actions > Start and click Start.
  4. Log on to Hue by clicking Hue Web UI.

Install Oracle Instant Client Packages

Both CDH parcel and package installations can use the Oracle Instant Client zip files. You must add client libraries to each machine that hosts a Hue server.

Add Client Libraries

Cloudera Manager requires the Oracle instant client libraries to be in /usr/share/oracle/instantclient/lib/. The following commands arrange the files as such but may need to be modified if Oracle changes its packaging.

  1. Download zip files for Instant Client Package, Basic and SDK (with headers), to the Hue server host.

  2. Arrange the client libraries to mirror the tree structure in the image. Here is one way to do this:
    # Create directories: /usr/share/oracle/instantclient/lib/
    mkdir -pm 755 /usr/share/oracle/instantclient/lib
    
    # Unzip. The files expand into /usr/share/oracle/instantclient/instantclient_<ver>/
    unzip '*.zip' -d /usr/share/oracle/instantclient/
    
    # Move lib files from instantclient_<ver> to /usr/share/oracle/instantclient/lib/
    mv /usr/share/oracle/instantclient/`ls -l /usr/share/oracle/instantclient/ | grep instantclient_ | awk '{print $9}'`/lib* /usr/share/oracle/instantclient/lib/
    
    # Move rest of the files to /usr/share/oracle/instantclient/
    mv /usr/share/oracle/instantclient/`ls -l /usr/share/oracle/instantclient/ | grep instantclient_ | awk '{print $9}'`/* /usr/share/oracle/instantclient/
    
    # Create symbolic links. Remember to edit version numbers as necessary
    cd /usr/share/oracle/instantclient/lib
    ln -s libclntsh.so.12.1 libclntsh.so
    ln -s libocci.so.12.1 libocci.so


  3. Set $ORACLE_HOME and $LD_LIBRARY_PATH:
    export ORACLE_HOME=/usr/share/oracle/instantclient
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME
  4. Navigate to the site-packages directory of your native Python installation, for example:
    cd /usr/lib64/python2.7/site-packages/
  5. Install some development tools:
    yum install -y python-setuptools python-devel gcc
    easy_install pip
  6. Install cx_Oracle. Ensure that ORACLE_HOME and $LB_LIBRARY_PATH are properly set so that pip knows which version to install.
    echo $ORACLE_HOME
    echo $LD_LIBRARY_PATH
    pip install cx_Oracle
  7. Navigate to Hue's python environment, $HUE_HOME/build/env/lib/<python version>/site-packages.

    This directory is created during Cluster Installation (packages=step 5/parcels=step 6), so you must have completed this to continue. See Easy Install.


    • CDH Parcel installation:
      cd /opt/cloudera/parcels/`ls -l /opt/cloudera/parcels | grep CDH | tail -1 | awk '{print $9}'`/lib/hue/build/env/lib/python2.7/site-packages
    • CDH package installation:
      cd /usr/lib/hue/build/env/lib/python2.7/site-packages
  8. Move the existing cx_Oracle file:
    mv cx_Oracle-5.1.2-py2.7-linux-x86_64.egg/ cxfoo
  9. Copy the new cx_Oracle module to Hue's python environment:
    cp -a /usr/lib64/python2.7/site-packages/cx_Oracle-5.2.1-py2.7.egg-info .

Connect and Configure

You can connect to the Oracle database at the Database Setup page during installation. If CDH is already installed, see the post-installation steps, Connect and Configure, in the parcels section.

  1. Use the Cloudera Manager Installation Wizard to install CDH with packages and stop at the Database Setup page.
  2. Select Use Custom Database.
  3. Under Hue, set the connection properties to the Oracle database:
    Database Hostname (and port): <fqdn of host with Oracle server>:1521
    Database Type (or engine): Oracle
    Database SID (or name): orcl
    Database Username: hue
    Database Password: <hue database password>
  4. Click Test Connection.
  5. Click Continue when successful.

  6. Continue with the installation and click Finish to complete.
  7. Add support for a multi-threaded environment:
    1. Go to Clusters > Hue > Configuration.
    2. Filter by Category, Hue-service and Scope, Advanced.
    3. Add support for a multi-threaded environment by setting Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini:
      [desktop]
      [[database]]
      options={"threaded":true}
    4. Click Save Changes.
  8. Restart the Hue service: select Actions > Restart and click Restart.
  9. Log on to Hue by clicking Hue Web UI.

Embedded

To recover your connection to the embedded PostgreSQL database, you need the 10-digit alphanumeric password of the Hue database that was displayed on the Database Setup page during installation with the Cloudera Manager.
DB Hostname  = <fqdn of host with cloudera-manager-server-db-2>:7432
DB Type      = PostgreSQL
DB Name      = hue
Username     = hue
Password     = <password displayed on Database Setup page (step 3) of the installation wizard>