Viewing and Filtering MapReduce Activities

This section describes the actions you can perform on the MapReduce Activities page.

Viewing MapReduce Activities

  1. Select Clusters > Cluster name > MapReduce service name Jobs. The MapReduce service name page displays a list of activities. The columns in the Activities list show statistics about the performance of each activity and the resources it uses. You can change the default display by adding or removing columns.
    • The leftmost column holds a shortcut menu button. Click this button to display a menu of commands relevant to the job shown in that row. The possible commands are:

      Children

      For a Pig, Hive or Oozie activity, takes you to the Children tab of the individual activity page. You can also go to this page by clicking the activity ID in the activity list. This command only appears for Pig, Hive or Oozie activities.

      Tasks

      For a MapReduce job, takes you to the Tasks tab of the individual job page. You can also go to this page by clicking the job ID in the activity or activity children list. This command only appears for a MapReduce job.

      Details

      Takes you to the Details tab where you can view the activity or job statistics in report form.

      Compare

      Takes you to the Compare tab where you can see how the selected activity compares to other similar activities in terms of a wide variety of metrics.

      Task Distribution

      Takes you to the Task Distribution tab where you can view the distribution of task attempts that made up this job, by amount of data and task duration. This command is available for MapReduce and Streaming jobs.

      Kill Job

      Kills the job after a pop-up asks you to confirm. This command is available only for MapReduce and Streaming jobs.

    • The second column shows a chart icon. Select this to chart statistics for the job. If there are charts showing similar statistics for the cluster or for other jobs, the statistics for the job are added to the chart. See Activity Charts for more details.
    • The third column shows the status of the job, if the activity is a MapReduce job:

      The job has been submitted.

      The job has been started.

      The job is assumed to have succeeded.

      The job has finished successfully.

      The job's final state is unknown.

      The job has been suspended.

      The job has failed.

      The job has been killed.

    • The fourth column shows the type of activity:

      MapReduce job

      Pig job

      Hive job

      Oozie job

      Streaming job

Selecting Columns to Show in the Activities List

In the Activities list, you can display or hide any of the statistics that Cloudera Manager collects. By default, only a subset of the possible statistics is displayed.

  1. Click the Select Columns to Display icon. A pop-up panel lets you turn on or off a variety of metrics that may be of interest.
  2. Check or uncheck the columns you want to include or remove from the display. As you check or uncheck an item, its column immediately appears or disappears from the display.
  3. Click the close icon in the upper right corner to close the panel.

Sorting the Activities List

You can sort the Activities list by the contents of any column:

  1. Click the column header to initiate a sort. The small arrow that appears next to the column header indicates the sort direction.
  2. Click the column header again to reverse the sort direction.

Filtering the Activities List

You can filter the list of activities based on values of any of the metrics that are available. You can also easily filter for certain common queries from the drop-down menu next to the Search button at the top of the Activities list. By default, it is set to show All Activities.

To use one of the predefined filters:
  • Click the drop-down arrow to the right of the Search button and select the filter you want to run. There are predefined filters to search by job type (for example Pig activities, MapReduce jobs, and so on) or for running, failed, or long-running activities.
To create a filter:
  1. Click the drop-down arrow to the right of the Search button and select Custom.
  2. Select a metric from the drop-down list in the first field; you can create a filter based on any of the available metrics.
  3. Once you select a metric, fill in the rest of the fields; your choices depend on the type of metric you have selected. Use the percent character % as a wildcard in a string; for example, the filter Id matches job%0001 finds any MapReduce job whose ID has the suffix 0001.
  4. To create a compound filter, click the plus icon at the end of the filter row to add another row. If you combine filter criteria, all criteria must be true for an activity to match.
  5. To remove a filter criterion from a compound filter, click the minus icon at the end of the filter row. Removing the last row removes the filter.
  6. To include any children of a Pig, Hive, or Oozie activity in your search results, check the Include Child Activities checkbox. Otherwise, only the top-level activity will be included, even if one or more child activities matched the filter criteria.
  7. Click the Search button (which appears when you start creating the filter) to run the filter.
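
The wildcard and compound-filter rules in the steps above can be illustrated with a short Python sketch. This is not how Cloudera Manager implements filtering; it only models the two rules stated in the text. It assumes that % matches zero or more characters and that matching is case-sensitive, neither of which the text confirms. The helper names, metric names, and sample activities are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a pattern that uses % as a wildcard into a compiled regex."""
    # Escape each literal piece so characters like '.' match themselves,
    # then join the pieces with '.*' where each % appeared.
    # Assumption: % stands for zero or more characters.
    return re.compile(".*".join(re.escape(piece) for piece in pattern.split("%")))

def string_matches(pattern):
    """Build a predicate that tests a whole string against the pattern."""
    regex = wildcard_to_regex(pattern)
    return lambda value: isinstance(value, str) and regex.fullmatch(value) is not None

def activity_matches(activity, criteria):
    """Compound filter: every (metric, predicate) row must be true."""
    return all(predicate(activity.get(metric)) for metric, predicate in criteria)

# Hypothetical activities; the metric names and values are made up.
activities = [
    {"Id": "job_201301011200_0001", "Status": "FAILED"},
    {"Id": "job_201301011200_0002", "Status": "FAILED"},
    {"Id": "job_201301020900_0001", "Status": "SUCCEEDED"},
]

# Two filter rows combined: both must hold for an activity to match.
criteria = [
    ("Id", string_matches("job%0001")),
    ("Status", lambda value: value == "FAILED"),
]

matched = [a["Id"] for a in activities if activity_matches(a, criteria)]
print(matched)  # prints ['job_201301011200_0001']
```

Only the first activity passes both rows: the second fails the Id pattern, and the third fails the Status check.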

Activity Charts

By default the charts show aggregated statistics about the performance of the cluster: Tasks Running, CPU Usage, and Memory Usage. There are additional charts you can enable from a pop-up panel. You can also superimpose individual job statistics on any of the displayed charts.

Most charts display multiple metrics within the same chart. For example, the Tasks Running chart shows two metrics: Cluster, Running Maps and Cluster, Running Reduces in the same chart. Each metric appears in a different color.

  • To see the exact values at a given point in time, move the cursor over the chart – a movable vertical line pinpoints a specific time, and a tooltip shows you the values at that point.
  • You can use the time range selector at the top of the page to zoom in; the chart display will follow. To zoom out, use the time range selector again or click the link below the chart.

To select additional charts:

  1. Click the icon at the top right of the chart panel to open the Customize dialog box.
  2. Check or uncheck the boxes next to the charts you want to show or hide.

To show or hide cluster-wide statistics:

  • Check or uncheck the Cluster checkbox at the top of the Charts panel.

To chart statistics for an individual job:

  • Click the chart icon in the row next to the job you want to show on the charts. The job ID will appear in the top bar next to the Cluster checkbox, and the statistics will appear on the appropriate chart.
  • To remove a job's statistics from the chart, click the remove icon next to the job ID in the top bar of the chart.

To expand, contract, or hide the charts:

  • Move the cursor over the divider between the Activities list and the charts, grab it and drag to expand or contract the chart area compared to the Activities list.
  • Drag the divider all the way to the right to hide the charts, or all the way to the left to hide the Activities list.