Job Designer

The Job Designer application enables you to create and submit Hadoop MapReduce jobs to the Hadoop cluster. You can include variables in a job so that you and other users can supply values for them when the job runs. The Job Designer supports MapReduce, streaming, and Java jobs. For more information about Hadoop MapReduce, see the Hadoop Tutorial.

  Note:
  • Job Designer uses Oozie to submit MapReduce jobs. Therefore, Oozie must be installed and configured before you can use Job Designer. For information about installing Oozie, see Oozie Installation.
  • To run streaming jobs as part of a workflow, Oozie must be configured to use the Oozie ShareLib.
  • A job's input files must be uploaded to the cluster before you can submit the job.

Job Designer Installation and Configuration

Job Designer is one of the applications installed as part of Hue. For information about installing and configuring Hue, see Hue Installation.

Starting Job Designer

To start Job Designer, click the Job Designer icon in the navigation bar at the top of the Hue web page. The Job Designs page opens in the browser.

Installing the Sample Job Designs

The Job Designer sample job designs can help you learn how to use Job Designer. To install the sample job designs, click Install Samples in the Job Designs window and then click Yes. The sample job designs are displayed in the Job Designs window. After the samples are installed, Job Designer removes the Install Samples button, so the samples can be installed only once.

Job Designs

A job design specifies several meta-level properties of a MapReduce job, including the job design name, description, the MapReduce executable scripts or classes, and any parameters for those scripts or classes. You can create three types of job designs: MapReduce, streaming, and Java.

Filtering Job Designs

You can filter the job designs that appear in the list by owner, name, type, and description. 

To filter the Job Designs list:

  1. In the Job Designs window, click Designs.
  2. Enter text in the Filter text box at the top of the Job Designs window. As you type, the list is filtered dynamically to show only the rows that contain the text you entered.
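
The substring matching described above can be pictured with a small Python sketch. It is only an illustration of the idea, not Job Designer's own code: the function name, the sample designs, and the choice to ignore case are all assumptions made for the example.

```python
# Fields the filter looks at, as listed in this section.
FIELDS = ("owner", "name", "type", "description")


def filter_designs(designs, text):
    """Keep the designs in which any listed field contains `text`."""
    needle = text.lower()  # assumption: matching ignores case
    return [
        design
        for design in designs
        if any(needle in design[field].lower() for field in FIELDS)
    ]


# Hypothetical rows standing in for the Job Designs list.
designs = [
    {"owner": "alice", "name": "grep_example", "type": "mapreduce",
     "description": "Search log files"},
    {"owner": "bob", "name": "sleep_job", "type": "java",
     "description": "Sleeps for a while"},
]

print([design["name"] for design in filter_designs(designs, "grep")])
```

An empty filter string matches every row in this sketch, which mirrors an unfiltered list.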

Job Design Settings

All job design settings except Name and Description support variables of the form $variable_name. When you run the job, a dialog box prompts you for the value of each variable.
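
To make the $variable_name mechanism concrete, here is a minimal Python sketch that finds the variables in a set of settings and fills them in. It is not how Job Designer is implemented; the helper names, the sample property values, and the idea that a variable may appear inside a longer path are assumptions for illustration.

```python
import re
from string import Template

# Matches $name where name starts with a letter or underscore.
VARIABLE = re.compile(r"\$([A-Za-z_]\w*)")


def find_variables(settings):
    """Return the sorted names of all $variables used in the setting values."""
    names = set()
    for value in settings.values():
        names.update(VARIABLE.findall(value))
    return sorted(names)


def fill_variables(settings, values):
    """Return a copy of the settings with every $variable replaced."""
    return {
        key: Template(value).substitute(values)
        for key, value in settings.items()
    }


# Hypothetical settings for one job design.
design = {
    "mapred.input.dir": "/user/$user/input",
    "mapred.output.dir": "$output_dir",
}

print(find_variables(design))
print(fill_variables(design, {"user": "alice", "output_dir": "/tmp/out"}))
```

In this sketch `Template.substitute` raises `KeyError` when a value is missing, which plays the role of the dialog box insisting on a value for every variable.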

Common Settings

All job design types support the settings listed in the following table. 

Setting         Description
Name            Identifies the job and its collection of properties and parameters.
Description     A description of the job. The description is displayed in the dialog box that appears if you specify variables for the job.
Job Properties  Job properties. To set a property value, click Add Property.
                  • Property name - a configuration property name. This field provides auto-completion, so you can type the first few characters of a property name and then select the one you want from the drop-down list.
                  • Value - the property value.
Files           Files to pass to the job. Equivalent to the Hadoop -files option.
Archives        Archives to pass to the job. Equivalent to the Hadoop -archives option.

Creating a MapReduce Job Design

A MapReduce job design consists of MapReduce functions written in Java. You can create a MapReduce job design from existing mapper and reducer classes without having to write a main Java class. You must specify the mapper and reducer classes as well as other MapReduce properties in the Job Properties setting.

To create a MapReduce job design:

  1. In the Job Designs window, click Create MapReduce Design.
  2. In the Job Design (MapReduce type) window, specify the common and job-type-specific settings.

    Setting   Description
    Jar path  The fully qualified path to a JAR file containing the classes that implement the Mapper and Reducer functions.

  3. Click Save to save the job settings.
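
As a rough illustration of the Job Properties a MapReduce design needs, the Python sketch below models the properties as a dictionary and checks that a mapper and a reducer class are named. The property names `mapred.mapper.class` and `mapred.reducer.class` are the classic Hadoop (org.apache.hadoop.mapred API) names; which names your cluster expects depends on its Hadoop version. The class names, directories, and the helper function are hypothetical.

```python
# Properties that name the mapper and reducer in the classic mapred API.
REQUIRED = ("mapred.mapper.class", "mapred.reducer.class")


def missing_properties(job_properties):
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED if not job_properties.get(name)]


# Hypothetical Job Properties for a word count design.
job_properties = {
    "mapred.mapper.class": "org.example.WordCountMapper",    # hypothetical class
    "mapred.reducer.class": "org.example.WordCountReducer",  # hypothetical class
    "mapred.input.dir": "$input_dir",    # variable filled in at submission
    "mapred.output.dir": "$output_dir",  # variable filled in at submission
}

print(missing_properties(job_properties))
```

Leaving the input and output directories as variables lets the same design be reused with different data each time it is submitted.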

Creating a Streaming Job Design

Hadoop streaming jobs enable you to create MapReduce functions in any non-Java language that can read from standard input and write to standard output. For more information about Hadoop streaming jobs, see Hadoop Streaming.
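
For example, a streaming word count might be written in Python along the following lines. This is a sketch rather than one of the Job Designer samples: the file name `wordcount.py`, the `map` and `reduce` arguments, and the tab-separated record layout are choices made for the example.

```python
#!/usr/bin/env python
"""Word count for Hadoop streaming: run as `wordcount.py map` or `wordcount.py reduce`."""
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one tab-separated "word<TAB>1" record per word read."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Sum the counts per word; the input must already be sorted by word."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Only read standard input when called with an explicit step name.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("map", "reduce"):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Both steps only read standard input and write standard output, so the pair can be checked locally with a shell pipeline such as `cat input.txt | python wordcount.py map | sort | python wordcount.py reduce` before the script is used in a job design.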

To create a streaming job design:

  1. In the Job Designs window, click Create Streaming Design.
  2. In the Job Design (streaming type) window, specify the common and job-type-specific settings.

    Setting  Description
    Mapper   The path to the mapper script or class. If the mapper file is not already present on the cluster machines, use the Files setting to pass it as part of the job submission. Equivalent to the Hadoop -mapper option.
    Reducer  The path to the reducer script or class. If the reducer file is not already present on the cluster machines, use the Files setting to pass it as part of the job submission. Equivalent to the Hadoop -reducer option.

  3. Click Save to save the job settings.

Creating a Java Job Design

A Java job design consists of a main class written in Java.

To create a Java job design:

  1. In the Job Designs window, click Create Java Design.
  2. In the Job Design (java type) window, specify the common and job-type-specific settings.

    Setting     Description
    Jar path    The fully qualified path to a JAR file containing the main class.
    Main class  The class whose main method is invoked to run the program.
    Args        The arguments to pass to the main class.
    Java opts   The options to pass to the JVM.

  3. Click Save to save the job settings.
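
Because Job Designer submits jobs through Oozie, it can help to see how settings like these might line up with an Oozie java action. The Python sketch below builds only a fragment containing the `main-class`, `java-opts`, and `arg` elements of that action. It is an assumption-laden illustration, not the workflow Job Designer actually generates: the function name is hypothetical, the Args string is split with shell-style quoting rules as a guess, and the Jar path and the other elements a complete action needs (such as `job-tracker` and `name-node`) are left out.

```python
import shlex
import xml.etree.ElementTree as ET


def java_action(main_class, args="", java_opts=""):
    """Build a partial Oozie <java> action element as an XML string."""
    action = ET.Element("java")
    ET.SubElement(action, "main-class").text = main_class
    if java_opts:
        ET.SubElement(action, "java-opts").text = java_opts
    # Assumption: arguments are separated like shell words, honoring quotes.
    for arg in shlex.split(args):
        ET.SubElement(action, "arg").text = arg
    return ET.tostring(action, encoding="unicode")


# Hypothetical main class and argument values.
print(java_action("org.example.Main", args="/in /out", java_opts="-Xmx512m"))
```

Each argument becomes its own `arg` element, so an argument that contains spaces has to be quoted in the Args string for this sketch to keep it in one piece.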

Submitting a Job Design

To submit a job design:

  1. In the Job Designs window, click Designs in the upper left corner. Your jobs and other users' jobs are displayed in the Job Designs window.
  2. Check the checkbox next to the job you want to submit.
  3. Click the Submit button.
    1. If the job contains variables, enter the information requested in the dialog box that appears. For example, the sample grep MapReduce design displays a dialog where you specify the output directory.
    2. Click Submit to submit the job.

After the job is complete, the Job Designer displays the results of the job. For information about displaying job results, see Displaying Results of Submitting a Job.

Copying, Editing, and Deleting a Job Design

If you want to edit and use a job but you don't own it, you can make a copy of it and then edit and use the copied job.

To copy a job design:

  1. In the Job Designs window, click Designs. The jobs are displayed in the Job Designs window.
  2. Check the checkbox next to the job you want to clone.
  3. Click the Clone button.
  4. In the Job Design Editor window, change the settings and then click Save to save the job settings.

To edit a job design:

  1. In the Job Designs window, click Designs. The jobs are displayed in the Job Designs window.
  2. Check the checkbox next to the job you want to edit.
  3. Click the Edit button.
  4. In the Job Design window, change the settings and then click Save to save the job settings.

To delete a job design:

  1. In the Job Designs window, click Designs. The jobs are displayed in the Job Designs window.
  2. Check the checkbox next to the job you want to delete.
  3. Click the Delete button.
  4. Click Ok to confirm the deletion.

Displaying Results of Submitting a Job

To display the Job Submission History:

In the Job Designs window, click the History tab. The jobs are listed in the Job Submission History by Oozie job ID.

To display Job Details:

In the Job Submission History window, click an Oozie job ID. The results of the job are displayed:

  • Actions - a list of actions in the job.
    • Click the icon shown in images/image13.png to display the action configuration. In the action configuration for a MapReduce action, click the value of the mapred.output.dir property to display the job output.
    • In the root-node row, click the Id in the External Id column to view the job in the Job Browser.
  • Details - the job details. Click the icon shown in images/image13.png to display the Oozie application configuration.
  • Definition - the Oozie application definition.
  • Log - the output log.