This is the documentation for Cloudera 5.2.x.
Documentation for other versions is available at Cloudera Documentation.

The Oozie Service

Cloudera Manager installs the Oozie service as part of the CDH installation.

You can elect to have the service created and started as part of the first-run installation wizard. If you elect not to create the service using the installation wizard, you can use the Add Service wizard to perform the installation. The wizard automatically configures and starts the dependent services and the Oozie service. See Adding a Service for instructions.

For information on configuring Oozie for high availability, see Oozie High Availability.

Redeploying the Oozie ShareLib

Some Oozie actions – specifically DistCp, Streaming, Pig, Sqoop, and Hive – require external JAR files in order to run. Instead of having to keep these JAR files in each workflow's lib folder, or forcing you to manually manage them via the oozie.libpath property on every workflow using one of these actions, Oozie provides the ShareLib. The ShareLib behaves very similarly to oozie.libpath, except that it’s specific to the aforementioned actions and their required JARs.
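
As a minimal sketch of how a workflow opts in to the ShareLib, the snippet below writes a job.properties file that sets the Oozie property oozie.use.system.libpath to true. The host name, port, and application path are hypothetical placeholders, nameNode is a conventional user-defined variable rather than a built-in, and the file is written to a temporary location only for illustration.

```shell
# Write an example job.properties to a scratch file (values are placeholders).
props_file="$(mktemp)"
cat > "$props_file" <<'EOF'
# Hypothetical NameNode address; replace with your own.
nameNode=hdfs://namenode.example.com:8020
# Ask Oozie to add the ShareLib JARs for the actions used by this workflow.
oozie.use.system.libpath=true
# Hypothetical HDFS location of the workflow application.
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/my-workflow
EOF

# Show the generated file.
cat "$props_file"
```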

When you upgrade CDH or switch between MapReduce and YARN computation frameworks, redeploy the Oozie ShareLib as follows:

  1. Go to the Oozie service.
  2. Select Actions > Stop.
  3. Select Actions > Install Oozie ShareLib.
  4. Select Actions > Start.

Adding Schema to Oozie

For CDH 4.x, Cloudera Manager configures Oozie to recognize only the schemas available in CDH 4.0.0, even though more were added in later releases. To use any additional schemas, do the following:
  1. In the Cloudera Manager Admin Console, go to the Oozie service.
  2. Click the Configuration tab.
  3. Click Oozie Server Default Group.
  4. Select the Oozie SchemaService Workflow Extension Schemas property.
  5. Enter the desired schema from Oozie Schema - CDH 5, appending .xsd to each entry.
  6. Click Save Changes to commit the changes.
  7. Restart the Oozie service.
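
As a rough illustration of step 5, the following shell sketch appends .xsd to a few schema names. The three names are taken from the CDH 5.2.0 column of Table 1 purely as examples; whether the property expects one entry per line or some other separator in your Cloudera Manager version is an assumption to verify in the Admin Console.

```shell
# Example schema names (chosen for illustration only).
schemas="hive-action-0.5 sqoop-action-0.4 shell-action-0.3"

# Append .xsd to each name, one entry per line.
entries=$(for s in $schemas; do printf '%s.xsd\n' "$s"; done)

# Print the entries so they can be pasted into the property.
printf '%s\n' "$entries"
```
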

Table 1. Oozie Schema - CDH 5

                   CDH 5.2.0                CDH 5.1.0                CDH 5.0.0

distcp             distcp-action-0.1        distcp-action-0.1        distcp-action-0.1
                   distcp-action-0.2        distcp-action-0.2        distcp-action-0.2

email              email-action-0.1         email-action-0.1         email-action-0.1
                                            email-action-0.2

hive               hive-action-0.2          hive-action-0.2          hive-action-0.2
                   hive-action-0.3          hive-action-0.3          hive-action-0.3
                   hive-action-0.4          hive-action-0.4          hive-action-0.4
                   hive-action-0.5          hive-action-0.5          hive-action-0.5

HiveServer2        hive2-action-0.1

oozie-bundle       oozie-bundle-0.1         oozie-bundle-0.1         oozie-bundle-0.1
                   oozie-bundle-0.2         oozie-bundle-0.2         oozie-bundle-0.2

oozie-coordinator  oozie-coordinator-0.1    oozie-coordinator-0.1    oozie-coordinator-0.1
                   oozie-coordinator-0.2    oozie-coordinator-0.2    oozie-coordinator-0.2
                   oozie-coordinator-0.3    oozie-coordinator-0.3    oozie-coordinator-0.3
                   oozie-coordinator-0.4    oozie-coordinator-0.4    oozie-coordinator-0.4

oozie-sla          oozie-sla-0.1            oozie-sla-0.1            oozie-sla-0.1
                   oozie-sla-0.2            oozie-sla-0.2            oozie-sla-0.2

oozie-workflow     oozie-workflow-0.1       oozie-workflow-0.1       oozie-workflow-0.1
                   oozie-workflow-0.2       oozie-workflow-0.2       oozie-workflow-0.2
                   oozie-workflow-0.2.5     oozie-workflow-0.2.5     oozie-workflow-0.2.5
                   oozie-workflow-0.3       oozie-workflow-0.3       oozie-workflow-0.3
                   oozie-workflow-0.4       oozie-workflow-0.4       oozie-workflow-0.4
                   oozie-workflow-0.4.5     oozie-workflow-0.4.5     oozie-workflow-0.4.5
                   oozie-workflow-0.5       oozie-workflow-0.5       oozie-workflow-0.5

shell              shell-action-0.1         shell-action-0.1         shell-action-0.1
                   shell-action-0.2         shell-action-0.2         shell-action-0.2
                   shell-action-0.3         shell-action-0.3         shell-action-0.3

sqoop              sqoop-action-0.2         sqoop-action-0.2         sqoop-action-0.2
                   sqoop-action-0.3         sqoop-action-0.3         sqoop-action-0.3
                   sqoop-action-0.4         sqoop-action-0.4         sqoop-action-0.4

ssh                ssh-action-0.1           ssh-action-0.1           ssh-action-0.1
                   ssh-action-0.2           ssh-action-0.2           ssh-action-0.2

Table 2. Oozie Schema - CDH 4

                   CDH 4.6.0-4.3.0          CDH 4.2.0                CDH 4.1.0                CDH 4.0.0

distcp             distcp-action-0.1        distcp-action-0.1        distcp-action-0.1        distcp-action-0.1
                   distcp-action-0.2        distcp-action-0.2

email              email-action-0.1         email-action-0.1         email-action-0.1         email-action-0.1

hive               hive-action-0.2          hive-action-0.2          hive-action-0.2          hive-action-0.2
                   hive-action-0.3          hive-action-0.3          hive-action-0.3
                   hive-action-0.4          hive-action-0.4          hive-action-0.4
                   hive-action-0.5

oozie-bundle       oozie-bundle-0.1         oozie-bundle-0.1         oozie-bundle-0.1         oozie-bundle-0.1
                   oozie-bundle-0.2         oozie-bundle-0.2         oozie-bundle-0.2

oozie-coordinator  oozie-coordinator-0.1    oozie-coordinator-0.1    oozie-coordinator-0.1    oozie-coordinator-0.1
                   oozie-coordinator-0.2    oozie-coordinator-0.2    oozie-coordinator-0.2    oozie-coordinator-0.2
                   oozie-coordinator-0.3    oozie-coordinator-0.3    oozie-coordinator-0.3    oozie-coordinator-0.3
                   oozie-coordinator-0.4    oozie-coordinator-0.4    oozie-coordinator-0.4

oozie-sla          oozie-sla-0.1            oozie-sla-0.1            oozie-sla-0.1            oozie-sla-0.1

oozie-workflow     oozie-workflow-0.1       oozie-workflow-0.1       oozie-workflow-0.1       oozie-workflow-0.1
                   oozie-workflow-0.2       oozie-workflow-0.2       oozie-workflow-0.2       oozie-workflow-0.2
                   oozie-workflow-0.2.5     oozie-workflow-0.2.5     oozie-workflow-0.2.5     oozie-workflow-0.2.5
                   oozie-workflow-0.3       oozie-workflow-0.3       oozie-workflow-0.3       oozie-workflow-0.3
                   oozie-workflow-0.4       oozie-workflow-0.4       oozie-workflow-0.4
                   oozie-workflow-0.4.5

shell              shell-action-0.1         shell-action-0.1         shell-action-0.1         shell-action-0.1
                   shell-action-0.2         shell-action-0.2         shell-action-0.2
                   shell-action-0.3         shell-action-0.3         shell-action-0.3

sqoop              sqoop-action-0.2         sqoop-action-0.2         sqoop-action-0.2         sqoop-action-0.2
                   sqoop-action-0.3         sqoop-action-0.3         sqoop-action-0.3
                   sqoop-action-0.4         sqoop-action-0.4         sqoop-action-0.4

ssh                ssh-action-0.1           ssh-action-0.1           ssh-action-0.1           ssh-action-0.1
                   ssh-action-0.2

Enabling the Oozie Web Console

  1. Download ext-2.2. Extract the contents of the file to /var/lib/oozie/ on the same host as the Oozie Server.
  2. In the Cloudera Manager Admin Console, go to the Oozie service.
  3. Click the Configuration tab.
  4. Check Enable Oozie server web console.
  5. Click Save Changes to commit the changes.
  6. Restart the Oozie service.

Using an External Database for Oozie

The default database for Oozie is Derby. Cloudera recommends that you use a production database instead, for the following reasons:
  • Derby runs in embedded mode and it is not possible to monitor its health.
  • It is not clear how to implement a live backup strategy for the embedded Derby database, though it may be possible.
  • Under load, Cloudera has observed locks and rollbacks with the embedded Derby database which don't happen with server-based databases.
The databases that Oozie supports are listed at:

Setting up an External Database for Oozie

See PostgreSQL, MySQL, or Oracle for the procedure for setting up an Oozie database.

PostgreSQL

Use the procedure that follows to configure Oozie to use PostgreSQL instead of Apache Derby.

  1. Install PostgreSQL 8.4.x or 9.0.x.
  2. Create the Oozie user and Oozie database.
  3. Configure PostgreSQL to accept network connections for the oozie user.
  4. Reload the PostgreSQL configuration.

Install PostgreSQL 8.4.x or 9.0.x.

Create the Oozie user and Oozie database.

For example, using the PostgreSQL psql command-line tool:

$ psql -U postgres
Password for user postgres: *****

postgres=# CREATE ROLE oozie LOGIN ENCRYPTED PASSWORD 'oozie' 
 NOSUPERUSER INHERIT CREATEDB NOCREATEROLE;
CREATE ROLE

postgres=# CREATE DATABASE "oozie" WITH OWNER = oozie
 ENCODING = 'UTF8'
 TABLESPACE = pg_default
 LC_COLLATE = 'en_US.UTF8'
 LC_CTYPE = 'en_US.UTF8'
 CONNECTION LIMIT = -1;
CREATE DATABASE

postgres=# \q

Configure PostgreSQL to accept network connections for the oozie user.

  1. Edit the postgresql.conf file and set the listen_addresses property to *, to make sure that the PostgreSQL server starts listening on all your network interfaces. Also make sure that the standard_conforming_strings property is set to off.
  2. Edit the PostgreSQL data/pg_hba.conf file as follows:
    host    oozie         oozie         0.0.0.0/0             md5
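
Before reloading PostgreSQL, it can help to confirm that both settings from step 1 are actually present. The following is a minimal sketch, not part of any PostgreSQL or Cloudera tooling: check_pg_conf is a hypothetical helper name, and the demonstration runs against a scratch file instead of a live postgresql.conf.

```shell
# check_pg_conf FILE: succeed only if FILE sets listen_addresses to '*'
# and standard_conforming_strings to off (commented-out lines do not match).
check_pg_conf() {
  conf="$1"
  grep -Eq "^[[:space:]]*listen_addresses[[:space:]]*=[[:space:]]*'\*'" "$conf" &&
    grep -Eq "^[[:space:]]*standard_conforming_strings[[:space:]]*=[[:space:]]*off" "$conf"
}

# Demonstration against a scratch file with the two expected lines.
sample_conf="$(mktemp)"
printf '%s\n' "listen_addresses = '*'" "standard_conforming_strings = off" > "$sample_conf"

if check_pg_conf "$sample_conf"; then
  echo "settings present"
else
  echo "settings missing"
fi
```

To check a real installation, pass the path to your own postgresql.conf, whose location depends on how PostgreSQL was installed.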

Reload the PostgreSQL configuration.
$ sudo -u postgres pg_ctl reload -s -D /opt/PostgreSQL/8.4/data

MySQL

Use the procedure that follows to configure Oozie to use MySQL instead of Apache Derby.

  1. Install and start MySQL 5.x.
  2. Create the Oozie database and Oozie MySQL user.
  3. Add the MySQL JDBC driver JAR to Oozie.

Install and start MySQL 5.x.

Create the Oozie database and Oozie MySQL user.

For example, using the MySQL mysql command-line tool:

$ mysql -u root -p
Enter password: ******

mysql> create database oozie;
Query OK, 1 row affected (0.03 sec)

mysql>  grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
Query OK, 0 rows affected (0.03 sec)

mysql>  grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
Query OK, 0 rows affected (0.03 sec)

mysql> exit
Bye

Add the MySQL JDBC driver JAR to Oozie.

Copy or symbolically link the MySQL JDBC driver JAR into the /var/lib/oozie/ directory.
  Note: You must manually download the MySQL JDBC driver JAR file.
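
The linking step might look like the following sketch. link_jdbc_driver is a hypothetical helper, the JAR file name is only an example, and the demonstration uses scratch directories so that nothing under /var/lib/oozie/ is touched.

```shell
# link_jdbc_driver JAR [DIR]: symlink a JDBC driver JAR into DIR
# (DIR defaults to /var/lib/oozie, the directory named in this section).
link_jdbc_driver() {
  jar="$1"
  dir="${2:-/var/lib/oozie}"
  if [ ! -f "$jar" ]; then
    echo "driver JAR not found: $jar" >&2
    return 1
  fi
  ln -sf "$jar" "$dir/$(basename "$jar")"
}

# Dry run with an empty placeholder file standing in for the downloaded driver.
scratch="$(mktemp -d)"
mkdir "$scratch/src" "$scratch/lib"
touch "$scratch/src/mysql-connector-java.jar"
link_jdbc_driver "$scratch/src/mysql-connector-java.jar" "$scratch/lib"
ls -l "$scratch/lib"
```

On a real host you would pass the path of the driver JAR you downloaded and omit the second argument, running the command with sufficient privileges to write to /var/lib/oozie/.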

Oracle

Use the procedure that follows to configure Oozie to use Oracle 11g instead of Apache Derby.

  1. Install and start Oracle 11g.
  2. Create the Oozie Oracle user.
  3. Add the Oracle JDBC driver JAR to Oozie.

Install and start Oracle 11g.

Use Oracle's instructions.

Create the Oozie Oracle user.

For example, using the Oracle sqlplus command-line tool:

$ sqlplus system@localhost

Enter password: ******

SQL> create user oozie identified by oozie default tablespace users temporary tablespace temp;

User created.

SQL> grant all privileges to oozie;

Grant succeeded.

SQL> exit

$

Add the Oracle JDBC driver JAR to Oozie.

Copy or symbolically link the Oracle JDBC driver JAR into the /var/lib/oozie/ directory.

  Note: You must manually download the Oracle JDBC driver JAR file.

Creating the Oozie Database Schema

After configuring Oozie database information and creating the corresponding database, create the Oozie database schema. Oozie provides a database tool for this purpose.
  Note: The Oozie database tool uses Oozie configuration files to connect to the database to perform the schema creation; before you use the tool, make sure you have created a database and configured Oozie to work with it as described above.

The Oozie database tool works in two modes: it can create the database schema directly, or it can produce an SQL script that a database administrator can run to create the schema manually. If you use the tool to create the database schema, you must have the permissions needed to execute DDL operations.

To run the Oozie database tool against the database
  Important: This step must be done as the oozie Unix user, otherwise Oozie may fail to start or work properly because of incorrect file permissions.
$ sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh create -run

You should see output such as the following (the output of the script may differ slightly depending on the database vendor):

Validate DB Connection.
DONE
Check DB schema does not exist
DONE
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
DONE
DONE
Create OOZIE_SYS table
DONE

Oozie DB has been created for Oozie version '4.0.0-cdh5.0.0'

The SQL commands have been written to: /tmp/ooziedb-5737263881793872034.sql
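
Because the tool must run as the oozie Unix user, a script that wraps it could guard against being started under the wrong account. The following is only a sketch; require_user is a hypothetical helper and not part of Oozie.

```shell
# require_user NAME: fail unless the current Unix user is NAME.
require_user() {
  current="$(id -un)"
  if [ "$current" != "$1" ]; then
    echo "must run as $1 (currently $current)" >&2
    return 1
  fi
}

# Intended use inside a wrapper script (not executed here):
#   require_user oozie && /usr/lib/oozie/bin/ooziedb.sh create -run
```
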

To create an SQL script for the database schema
  Important: This step must be done as the oozie Unix user, otherwise Oozie may fail to start or work properly because of incorrect file permissions.

Run /usr/lib/oozie/bin/ooziedb.sh create -sqlfile SCRIPT. For example:

$ sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh create -sqlfile oozie-create.sql

You should see output such as the following (the output of the script may differ slightly depending on the database vendor):

Validate DB Connection.
DONE
Check DB schema does not exist
DONE
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
DONE
DONE
Create OOZIE_SYS table
DONE

Oozie DB has been created for Oozie version '4.0.0-cdh5.0.0'

The SQL commands have been written to: oozie-create.sql

WARN: The SQL commands have NOT been executed, you must use the '-run' option
  Important: If you used the -sqlfile option instead of -run, the Oozie database schema has not been created. You must run the oozie-create.sql script against your database.
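
How the script is applied depends on the database client. The sketch below prints, without executing, one plausible client command per database type. apply_cmd is a hypothetical helper; the host localhost, the user oozie, and the database oozie follow the examples earlier in this section, and the options your environment needs may differ.

```shell
# apply_cmd TYPE FILE: print (do not run) a client command that would load FILE.
apply_cmd() {
  case "$1" in
    postgresql) printf 'psql -h localhost -U oozie -d oozie -f %s\n' "$2" ;;
    mysql)      printf 'mysql -h localhost -u oozie -p oozie < %s\n' "$2" ;;
    oracle)     printf 'sqlplus oozie@localhost @%s\n' "$2" ;;
    *)          echo "unsupported database type: $1" >&2; return 1 ;;
  esac
}

# Show the command for each database covered in this section.
for db in postgresql mysql oracle; do
  apply_cmd "$db" oozie-create.sql
done
```

Printing the command first gives a database administrator the chance to review it before running it by hand.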

Configuring Oozie to Use an External Database

  1. In the Cloudera Manager Admin Console, go to the Oozie service.
  2. Click the Configuration tab.
  3. Expand Oozie Server Default Group and click Database.
  4. Specify the settings for Oozie Server database type, Oozie Server database name, Oozie Server database host, Oozie Server database user, and Oozie Server database password.
  5. Click Save Changes to commit the changes.
  6. Restart the Oozie service.
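
For orientation, the database type, host, and name fields in step 4 correspond to the parts of a JDBC connection URL, which a standalone Oozie installation reads from the oozie.service.JPAService.jdbc.url property. The sketch below shows the usual URL shapes for PostgreSQL and MySQL with their default ports. jdbc_url is a hypothetical helper, db.example.com is a placeholder host, and how Cloudera Manager assembles the URL internally is an assumption.

```shell
# jdbc_url TYPE HOST NAME: print a typical JDBC URL using the default port.
jdbc_url() {
  db_type="$1"
  db_host="$2"
  db_name="$3"
  case "$db_type" in
    postgresql) printf 'jdbc:postgresql://%s:5432/%s\n' "$db_host" "$db_name" ;;
    mysql)      printf 'jdbc:mysql://%s:3306/%s\n' "$db_host" "$db_name" ;;
    *)          echo "unsupported database type: $db_type" >&2; return 1 ;;
  esac
}

# Placeholder host; the database name matches the examples in this section.
jdbc_url postgresql db.example.com oozie
jdbc_url mysql db.example.com oozie
```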