Oozie Editor and Dashboard

The Oozie Editor/Dashboard application allows you to define Oozie workflow and coordinator applications, run workflow and coordinator jobs, and view the status of jobs. For information about Oozie, see Oozie Documentation.

A workflow application is a collection of actions arranged in a directed acyclic graph (DAG). It includes control flow nodes (start, end, fork, join, decision, and kill) and action nodes (MapReduce, Streaming, Java, Pig, Hive, Sqoop, Shell, Ssh, DistCp, Fs, Email, Sub-workflow, and Generic).
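As a rough illustration of such a graph, the following sketch uses Python's standard xml.etree.ElementTree module to assemble a minimal workflow definition with a start node, one Fs action, a kill node, and an end node. The workflow name, node names, directory path, and schema version are assumptions chosen for the example; they are not values generated by Oozie Editor/Dashboard.

```python
import xml.etree.ElementTree as ET

def build_minimal_workflow(name="sample-wf", path="/tmp/sample-dir"):
    """Return a minimal workflow definition as an XML string (hypothetical names)."""
    # The schema version is an assumption; use the one your Oozie server supports.
    root = ET.Element("workflow-app", {"name": name, "xmlns": "uri:oozie:workflow:0.4"})
    ET.SubElement(root, "start", {"to": "make-dir"})

    # One Fs action node with its ok and error transitions.
    action = ET.SubElement(root, "action", {"name": "make-dir"})
    fs = ET.SubElement(action, "fs")
    ET.SubElement(fs, "mkdir", {"path": path})
    ET.SubElement(action, "ok", {"to": "end"})
    ET.SubElement(action, "error", {"to": "kill"})

    # Control flow nodes that terminate the graph.
    kill = ET.SubElement(root, "kill", {"name": "kill"})
    ET.SubElement(kill, "message").text = "Action failed"
    ET.SubElement(root, "end", {"name": "end"})
    return ET.tostring(root, encoding="unicode")

workflow_xml = build_minimal_workflow()
print(workflow_xml)
```

Every transition points forward to a later node, so the resulting graph has no cycles.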

A coordinator application allows you to define and execute recurrent and interdependent workflow jobs. The coordinator application defines the conditions under which the execution of workflows can occur.

Oozie Editor/Dashboard Installation and Configuration

Oozie Editor/Dashboard is one of the applications installed as part of Hue. For information about installing and configuring Hue, see Hue Installation.

  Note: To run DistCp, Streaming, Pig, Sqoop, and Hive jobs as part of a workflow, Oozie must be configured to use the Oozie ShareLib. See Oozie Installation.

Starting Oozie Editor/Dashboard

To start Oozie Editor/Dashboard, click the Oozie Editor/Dashboard icon ( images/image14.png ) in the navigation bar at the top of the Hue browser page. Oozie Editor/Dashboard opens with the following screens:

  • Dashboard - shows the running and completed workflow and coordinator jobs. This screen is selected by default and opens to the Workflows page.
  • Workflows - shows available workflows.
  • Coordinators - shows available coordinators.
  • History - shows a list of submitted jobs.

Installing Oozie Editor/Dashboard Samples

The Oozie Editor/Dashboard sample workflows and coordinators can help you learn how to use Oozie Editor/Dashboard. To install the samples:

  1. Click the Workflows tab.
  2. Click the Setup App button. This action adds samples demonstrating all the types of actions to the Workflows Editor and samples to the Coordinator Editor. It also creates workspaces and deployment directories required by the samples in /user/hue/oozie.

Filtering Lists in Oozie Editor/Dashboard

The Dashboard, Workflows, Coordinators, and History screens contain lists of workflows, coordinators, and jobs. When you type in the Filter field on these screens, the lists are dynamically filtered to display only those rows containing text that matches the specified substring.
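The filtering behavior can be pictured with a small Python sketch. It is only a model of the idea, not the application's implementation: the row layout is hypothetical, and case-insensitive matching is an assumption.

```python
def filter_rows(rows, query):
    """Keep rows in which any cell contains the query as a substring."""
    needle = query.lower()  # case-insensitive matching is an assumption
    return [row for row in rows
            if any(needle in str(cell).lower() for cell in row)]

# Hypothetical rows: (name, description, owner)
workflows = [
    ("MapReduce", "Example MapReduce job", "sample"),
    ("Pig", "Example Pig script", "sample"),
    ("DistCp", "Copy files between clusters", "alice"),
]
print(filter_rows(workflows, "pig"))
```

An empty filter string matches every row, which corresponds to showing the full list.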

Permissions in Oozie Editor/Dashboard

In the Dashboard, workflows and coordinators can be viewed, submitted, and modified only by their owner or a superuser.

Editor permissions for performing actions on workflows and coordinators are summarized in the following table:

  Action    Superuser or Owner    All
  View      Y                     Only if "Is shared" is set
  Submit    Y                     Only if "Is shared" is set
  Modify    Y                     N
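The permissions table above can be restated as a small decision function. This Python sketch only mirrors the table; the function and parameter names are hypothetical and do not correspond to any Hue API.

```python
def is_allowed(action, is_owner=False, is_superuser=False, is_shared=False):
    """Return True if the Editor permissions table permits the action."""
    if action not in ("view", "submit", "modify"):
        raise ValueError(f"unknown action: {action}")
    if is_owner or is_superuser:
        return True  # "Superuser or Owner" column: Y for every action
    if action == "modify":
        return False  # "All" column: other users can never modify
    return is_shared  # view and submit are allowed only if "Is shared" is set

print(is_allowed("submit", is_shared=True))
print(is_allowed("modify", is_shared=True))
```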

Oozie Dashboard

Oozie Dashboard shows a summary of the running and completed workflow and coordinator jobs.

You can view jobs for a period up to the last 30 days.

You can filter the list by date (1, 7, 15, or 30 days) or status (Succeeded, Running, or Killed). The date and status buttons are toggles.
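The combined effect of the date and status toggles can be sketched in Python as follows. The job records, field names, and status spellings are assumptions made for the example; the Dashboard's internal representation may differ.

```python
from datetime import datetime, timedelta

def filter_jobs(jobs, now, days=None, statuses=None):
    """Filter job records by age in days and by status; None disables a filter."""
    result = []
    for job in jobs:
        # Drop jobs that started before the selected date window.
        if days is not None and job["started"] < now - timedelta(days=days):
            continue
        # Drop jobs whose status is not among the selected toggles.
        if statuses and job["status"] not in statuses:
            continue
        result.append(job)
    return result

now = datetime(2013, 3, 31, 12, 0)
jobs = [
    {"id": "job-1", "status": "SUCCEEDED", "started": datetime(2013, 3, 30, 9, 0)},
    {"id": "job-2", "status": "KILLED", "started": datetime(2013, 3, 20, 9, 0)},
    {"id": "job-3", "status": "RUNNING", "started": datetime(2013, 3, 31, 11, 0)},
]
recent = filter_jobs(jobs, now, days=7)
killed = filter_jobs(jobs, now, days=30, statuses={"KILLED"})
print([job["id"] for job in recent])
print([job["id"] for job in killed])
```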

Workflows

Click the Workflows tab to view the running and completed workflow jobs for the filters you have specified.

Click a workflow row in the Running or Completed table to view detailed information about that workflow job.

For the selected job, the following information is available.

  • The Graph tab shows the workflow DAG.
  • The Actions tab shows you details about the actions that make up the workflow.
    • Click the Id link to see additional details about the action.
    • Click the External Id link to view the job in the Job Browser.
  • The Details tab shows job statistics including start and end times, and provides a link to the workflow definition in the File Browser.
  • The Configuration tab shows selected job configuration settings.
  • The Logs tab shows log output generated by the workflow job.
  • The Definition tab shows the Oozie workflow definition, as it appears in the workflow.xml file (also linked under the application path properties in the Details tab and the Configuration tab).

Coordinators

Click the Coordinators tab to view the running and completed coordinator jobs for the filters you have specified.

For the selected job, the following information is available.

  • The Calendar tab shows the timestamp of the job. Click the timestamp to open the workflow DAG. 
  • The Actions tab shows you details about the actions that make up the coordinator.
    • Click the Id link to see additional details about the action.
    • Click the External Id link to view the job in the Job Browser.
  • The Configuration tab shows selected job configuration settings.
  • The Logs tab shows log output generated by the coordinator.
  • The Definition tab shows the Oozie coordinator definition, as it appears in the coordinator.xml file (also linked under the oozie.coord.application.path property in the Configuration tab).

Workflow Manager

In Workflow Manager you create Oozie workflows and submit them for execution.

Click the Workflows tab to open the Workflow Manager.

Each row shows a workflow: its name, description, and the timestamp of its last modification. It also shows:

  • Steps: the number of steps in the workflow execution path, that is, the number of execution steps between the start and end of the workflow. This number can differ from the number of actions in the workflow when the execution path contains control flow nodes.
  • Status: who can run the workflow. shared means users other than the owner can access the workflow. personal means only the owner can modify or submit the workflow. The default is personal.
  • Owner: the user that created the workflow.

In Workflow Editor you edit workflows that include MapReduce, Streaming, Java, Pig, Hive, Sqoop, Shell, Ssh, DistCp, Fs, Email, Sub-workflow, and Generic actions. You can configure these actions in the Workflow Editor, or you can import job designs from Job Designer to be used as actions in your workflow. For information about defining workflows, see the Workflow Specification.

Installing the Sample Workflows

  1. Click the Setup Examples button at the top right.

Opening a Workflow

To open a workflow, in Workflow Manager, click the workflow. Proceed with Editing a Workflow.

Creating a Workflow

  1. Click the Create button at the top right.
  2. In the Name field, type a name.
  3. Click advanced to specify whether the workflow is shared, the deployment directory, or a job.xml file.
  4. Click Save. The Workflow Editor opens. Proceed with Editing a Workflow.

Importing a Workflow

  1. Click the Import button at the top right.
  2. In the Name field, type a name.
  3. In the Local workflow.xml file field, click Choose File and select a workflow file.
  4. Click advanced to specify whether the workflow is shared, the deployment directory, or a job.xml file.
  5. Click Save. The Workflow Editor opens. Proceed with Editing a Workflow.

Submitting a Workflow

To submit a workflow for execution, do one of the following:

  • In the Workflow Manager, click the radio button next to the workflow, and click the Submit button.
  • In the Workflow Editor, click the Submit button.

The workflow job is submitted and the Dashboard displays the workflow job.

To view the output of the job, click images/image11.png View the logs.

Suspending a Running Job

  1. In the pane on the left, click the Suspend button.
  2. Verify that you want to suspend the job.

Resuming a Suspended Job

  1. In the pane on the left, click the Resume button.
  2. Verify that you want to resume the job.

Rerunning a Workflow

  1. In the pane on the left, click the Rerun button.
  2. Check the checkboxes next to the actions to rerun.
  3. Specify required variables.
  4. Click Submit.
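Oozie itself expresses a rerun through job properties such as oozie.wf.rerun.skip.nodes, a comma-separated list of nodes to skip. The Python sketch below shows one way a selection of actions to rerun could be turned into that property. How Oozie Editor/Dashboard actually maps the checkboxes is not stated here, so this mapping is an assumption, and the function and action names are hypothetical.

```python
def rerun_properties(all_actions, actions_to_rerun):
    """Build rerun properties that skip every action not selected for rerun."""
    selected = set(actions_to_rerun)
    unknown = selected - set(all_actions)
    if unknown:
        raise ValueError(f"not part of the workflow: {sorted(unknown)}")
    # Keep the workflow's own ordering for the skipped nodes.
    skip = [name for name in all_actions if name not in selected]
    return {"oozie.wf.rerun.skip.nodes": ",".join(skip)}

props = rerun_properties(["extract", "transform", "load"], ["transform", "load"])
print(props)
```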

Scheduling a Workflow

To schedule a workflow for recurring execution, do one of the following:

  • In the Workflow Manager, click the radio button next to the workflow and click the Schedule button.
  • In the Workflow Editor, click the Schedule button.

A coordinator is created and opened in the Coordinator Editor. Proceed with Editing a Coordinator.

Editing a Workflow

In the Workflow Editor you can easily perform operations on Oozie action and control nodes.

Action Nodes

The Workflow Editor supports dragging and dropping action nodes. As you move the action over other actions and forks, highlights indicate active areas. If there are actions in the workflow, the active areas are the actions themselves and the areas above and below the actions. If you drop an action on an existing action, a fork and a join are added to the workflow.

  • Add actions to the workflow by doing one of the following:
    • Click an action (images/image15.png) button and drop the action on the workflow. The Edit Node screen displays.
      1. Set the action properties and click Done. Each action in a workflow must have a unique name.
    • Click the Import action link to import an existing job design. The Import Action screen displays.
      1. Click a radio button next to a job design and click Import. The action is added to the end of the workflow.
  • Clone an action by clicking the images/image16.png button.
    1. The action is opened in the Edit Node screen.
    2. Edit the action properties and click Done. The action is added to the end of the workflow.
  • Delete an action by clicking the images/image17.png button.
  • Edit an action by clicking the images/image18.png button.
  • Change the position of an action by left-clicking and dragging an action to a new location.

Control Nodes

  • Create a fork and join by dropping an action on top of another action.
  • Remove a fork and join by dragging a forked action and dropping it above the fork.
  • Convert a fork to a decision by clicking the images/image19.png button.
  • To edit a decision:
    1. Click the images/image18.png button.
    2. Fill in the predicates that determine which action to perform and select the default action from the drop-down list.
    3. Click Done.
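In the workflow definition, a decision is a switch over predicates with a default transition. The Python sketch below builds such a node with xml.etree.ElementTree. The node names, the inputDir parameter, and the choice of the fs:exists EL function as the predicate are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

def build_decision(name, cases, default):
    """Build a decision node from (predicate, target) pairs and a default target."""
    decision = ET.Element("decision", {"name": name})
    switch = ET.SubElement(decision, "switch")
    for predicate, target in cases:
        case = ET.SubElement(switch, "case", {"to": target})
        case.text = predicate  # cases are listed in the order given
    # The default transition is taken when no predicate matches.
    ET.SubElement(switch, "default", {"to": default})
    return decision

node = build_decision(
    "check-input",
    [("${fs:exists(inputDir)}", "process-data")],
    default="skip-processing",
)
decision_xml = ET.tostring(node, encoding="unicode")
print(decision_xml)
```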

Uploading Workflow Files

In the Workflow Editor, click the Upload button.

Editing Workflow Properties

  1. In the Workflow Editor, click the link under the Name or Description fields in the left pane.
  2. To share the workflow with all users, check the Is shared checkbox.
  3. To set advanced execution options, click advanced and edit the deployment directory, add parameters and job properties, or specify a job.xml file.
  4. Click Save.

Displaying the History of a Workflow

  1. Do one of the following:
    • In the Workflow Editor, click Show history in the pane at the left. Click a job.
    • In the Oozie Editor/Dashboard, click the History tab. Click a submission Id.

Coordinator Manager

In Coordinator Manager you create Oozie coordinator applications and submit them for execution.

Click the Coordinators tab to open the Coordinator Manager.

Each row shows a coordinator: its name, description, and the timestamp of its last modification. It also shows:

  • Workflow: the workflow that will be run by the coordinator.
  • Frequency: how often the workflow referenced by the coordinator will be run.
  • Status: who can run the coordinator. shared means users other than the owner can access the coordinator. personal means only the owner can modify or submit the coordinator. The default is personal.
  • Owner: the user that created the coordinator.

In Coordinator Editor, you edit coordinators and the datasets required by the coordinators. For information about defining coordinators and datasets, see the Coordinator Specification.
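To make the relationship between a coordinator, its workflow, and its frequency more concrete, the Python sketch below assembles a minimal coordinator definition with xml.etree.ElementTree. The coordinator name, application path, dates, daily frequency, and schema version are assumptions chosen for the example, not output of the Coordinator Editor.

```python
import xml.etree.ElementTree as ET

def build_coordinator(name, app_path, start, end, timezone="UTC"):
    """Return a minimal daily coordinator definition as an XML string."""
    root = ET.Element("coordinator-app", {
        "name": name,
        "frequency": "${coord:days(1)}",  # run the workflow once a day
        "start": start,
        "end": end,
        "timezone": timezone,
        # The schema version is an assumption; match your Oozie server.
        "xmlns": "uri:oozie:coordinator:0.2",
    })
    # The action of a coordinator is the workflow it runs.
    action = ET.SubElement(root, "action")
    workflow = ET.SubElement(action, "workflow")
    ET.SubElement(workflow, "app-path").text = app_path
    return ET.tostring(root, encoding="unicode")

coordinator_xml = build_coordinator(
    "daily-sample",
    "${nameNode}/user/hue/oozie/workspaces/sample-wf",  # hypothetical path
    start="2013-01-16T06:00Z",
    end="2013-02-16T06:00Z",
)
print(coordinator_xml)
```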

Opening a Coordinator

To open a coordinator, in Coordinator Manager, click the coordinator. Proceed with Editing a Coordinator.

Creating a Coordinator

To create a coordinator, in Coordinator Manager:

  1. Click the Create button at the top right. The Coordinator wizard opens. Proceed with Editing a Coordinator.

Submitting a Coordinator

To submit a coordinator for execution, click the radio button next to the coordinator and click the Submit button.

Editing a Coordinator

In the Coordinator Editor you specify coordinator properties and the datasets on which the workflow scheduled by the coordinator will operate by stepping through screens in a wizard. You can also advance to particular steps and revisit steps by clicking the Step "tabs" above the screens. The following instructions walk you through the coordinator wizard.

  1. Type a name, select the workflow, check the Is shared checkbox to share the job, and click Next. If the Coordinator Editor was opened after scheduling a workflow, the workflow will be set.
  2. Select how many times the coordinator will run for each specified unit, the start and end times of the coordinator, the timezone of the start and end times, and click Next. The start and end times must be expressed as UTC times. For example, to run at 10 pm PST, specify a start time of 6 am UTC of the following day (+8 hours) and set the Timezone field to America/Los_Angeles.
  3. Click Add to select an input dataset and click Next. If no datasets exist, follow the procedure in Creating a Dataset.
  4. Click Add to select an output dataset. Click Save coordinator or click Next to specify advanced settings.
  5. To share the coordinator with all users, check the Is shared checkbox.
  6. Fill in parameters to pass to Oozie, properties that determine how long a coordinator will wait before timing out, how many coordinators can run and wait concurrently, and the coordinator execution policy.
  7. Click Save coordinator.
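The UTC conversion in step 2 can be checked with a few lines of Python. This sketch uses a fixed UTC-8 offset for PST, so it ignores daylight saving time; the date is arbitrary and the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8), "PST")  # fixed offset; ignores daylight saving time

def to_oozie_utc(local_time):
    """Convert an aware datetime to a UTC string such as 2013-01-16T06:00Z."""
    return local_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")

# 10 pm PST on January 15 is 6 am UTC on January 16.
start = to_oozie_utc(datetime(2013, 1, 15, 22, 0, tzinfo=PST))
print(start)  # 2013-01-16T06:00Z
```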

Creating a Dataset

  1. In the Coordinator Editor, do one of the following:
    • Click here in the Inputs or Outputs pane at the top of the editor.
    • In the pane at the left, click the Create new link. Proceed with Editing a Dataset.

Displaying Datasets

  1. In the Coordinator Editor, click Show existing in the pane at the left.
  2. To edit a dataset, click the dataset name in the Existing datasets table. Proceed with Editing a Dataset.

Editing a Dataset

  1. Type a name for the dataset.
  2. In the Start and Frequency fields, specify when and how often the dataset will be available.
  3. In the URI field, specify a URI template for the location of the dataset. To construct URIs and URI paths containing dates and timestamps, you can specify the variables ${YEAR}, ${MONTH}, ${DAY}, ${HOUR}, and ${MINUTE}. For example: hdfs://foo:9000/usr/app/stats/${YEAR}/${MONTH}/data.
  4. In the Instance field, click a button to choose a default, single, or range of data instances. For example, if frequency==DAY, a window of the last rolling 5 days (not including today) would be expressed as start: -5 and end: -1. Check the advanced checkbox to display a field where you can specify a coordinator EL function.
  5. Specify the timezone of the start date.
  6. In the Done flag field, specify the flag that identifies when a dataset instance is complete and ready to be used as input.
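The following Python sketch shows how a URI template and an instance range combine for a daily dataset. It resolves the template for the rolling window start: -5, end: -1 relative to a nominal date. The template adds ${DAY} to the example path above, and the nominal date and function name are assumptions; zero-padded two-digit month, day, hour, and minute values are assumed as well.

```python
from datetime import datetime, timedelta
from string import Template

def instance_uris(uri_template, nominal_time, start, end):
    """Resolve a daily dataset URI template for instance offsets start..end."""
    uris = []
    for offset in range(start, end + 1):
        day = nominal_time + timedelta(days=offset)
        # string.Template uses the same ${NAME} placeholder syntax.
        uris.append(Template(uri_template).substitute(
            YEAR=f"{day.year:04d}", MONTH=f"{day.month:02d}", DAY=f"{day.day:02d}",
            HOUR=f"{day.hour:02d}", MINUTE=f"{day.minute:02d}",
        ))
    return uris

template = "hdfs://foo:9000/usr/app/stats/${YEAR}/${MONTH}/${DAY}/data"
for uri in instance_uris(template, datetime(2013, 3, 3), start=-5, end=-1):
    print(uri)
```

The window covers the five days before the nominal date and excludes the nominal date itself.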

Displaying the History of a Coordinator

  1. Do one of the following:
    • In the Coordinator Editor, click Show history in the pane at the left. Click a job.
    • In the Oozie Editor/Dashboard, click the History tab. Click a coordinator.