Disk Usage Reports

The following reports show HDFS disk usage statistics, either current or historical, by user, group, or directory.

The By Directory reports display information about the directories in the Watched list. If you are not watching any directories, these reports return no results.

Viewing Current Disk Usage by User, Group, or Directory

These reports show "current" disk usage in both chart and tabular form. The data for these reports comes from the fsimage kept on the NameNode, so a report is only as current as the last checkpoint. By default, a checkpoint is performed once per hour; if checkpoints are performed less frequently, the disk usage report may not be up to date.

To create a disk usage report:

  • Click the report name (link) to produce the resulting report.

Each of these reports shows:

Bytes

The logical number of bytes in the files, aggregated by user, group, or directory. This is based on the actual file sizes and does not take replication into account.

Raw Bytes

The physical number of bytes (total disk space in HDFS) used by the files, aggregated by user, group, or directory. This includes replication, so it is effectively Bytes multiplied by the number of replicas.

File and Directory Count

The number of files and directories, aggregated by user, group, or directory.

Bytes and Raw Bytes are shown in IEC binary prefix notation (1 GiB = 2^30 bytes).
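The relationship between Bytes, Raw Bytes, and the IEC notation can be illustrated with a short Python sketch. This is not how the product computes its reports; the function names `format_iec` and `usage_totals` and the sample file sizes and replication factors are made up for illustration.

```python
# Sketch only: file sizes and replication factors below are made-up sample data.
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]

def format_iec(num_bytes):
    """Format a byte count using IEC binary prefixes (1 KiB = 2**10 bytes)."""
    value = float(num_bytes)
    for unit in IEC_UNITS:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == IEC_UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024

def usage_totals(files):
    """Return (bytes, raw_bytes, file_count) for (size, replication) pairs."""
    logical = sum(size for size, _ in files)
    raw = sum(size * replication for size, replication in files)
    return logical, raw, len(files)

# Hypothetical files: 1 GiB and 512 MiB, each stored with 3 replicas.
files = [(2**30, 3), (512 * 2**20, 3)]
logical, raw, count = usage_totals(files)
print(format_iec(logical), format_iec(raw), count)
```

With a uniform replication factor of 3, the raw total is three times the logical total. If files carry different replication factors, the ratio between the two totals falls somewhere between the smallest and largest factor.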

The directories shown in the Current Disk Usage by Directory report are the HDFS directories you have set as watched directories. You can add directories to, or remove them from, the watch list from this report: click the Search Files and Manage Directories button at the top right of the set of reports for the cluster or nameservice (see Designating Directories to Include in Disk Usage Reports).

The report data is also shown in chart format:

  • Move the cursor over the graph to highlight a specific period on the graph and see the actual value (data size) for that period.
  • You can also move the cursor over the user, group, or directory name (in the graph legend) to highlight the portion of the graph for that name.
  • You can right-click within the chart area to save the whole chart display as a single image (a .PNG file) or as a PDF file. You can also print to the printer configured for your browser.

Viewing Historical Disk Usage by User, Group, or Directory

You can use these reports to view disk usage over a time range you define. You can have the usage statistics reported per hour, day, week, month, or year.

To create one of these reports:

  • Click the report name (link) to produce the initial report. This generates a report that shows Raw Bytes for the past month, aggregated daily.

To change the report parameters:

  • Select the Start Date and End Date to define the time range of the report.
  • Select the Graph Metric you want to graph: bytes, raw bytes, or files and directories count.
  • In the Report Period field, select the period over which you want the metrics aggregated. The default is Daily. This affects both the number of rows in the results table and the granularity of the data points on the graph.
  • Click Generate Report to produce a new report.
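To illustrate how the Report Period changes the number of rows and the granularity of the data points, here is a minimal Python sketch. How the product chooses the measurement shown for each period is not described here; keeping the last sample in each period is an assumption made for this example, and `period_key`, `downsample`, and the sample values are hypothetical. Hourly periods are left out for brevity.

```python
from datetime import date, timedelta

def period_key(day, period):
    """Map a date to the bucket it falls in for the given report period."""
    if period == "daily":
        return day
    if period == "weekly":
        iso = day.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if period == "monthly":
        return (day.year, day.month)
    if period == "yearly":
        return (day.year,)
    raise ValueError(f"unsupported period: {period}")

def downsample(samples, period):
    """Keep one (date, value) row per period.

    Assumption: the row kept is the last sample in each bucket.
    """
    rows = {}
    for day, value in sorted(samples):
        rows[period_key(day, period)] = (day, value)
    return list(rows.values())

# Made-up data: 14 daily measurements starting on Monday, 2024-01-01.
start = date(2024, 1, 1)
samples = [(start + timedelta(days=i), 100 + i) for i in range(14)]

for period in ("daily", "weekly", "monthly"):
    print(period, len(downsample(samples, period)))
```

For the same two-week range, the daily period yields 14 rows, the weekly period yields 2, and the monthly period yields 1, which is why a coarser period gives both a shorter table and fewer points on the graph.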

As with the current reports, the report data is also presented in chart format, and you can use the cursor to view the data shown on the charts, as well as save and print them.

For weekly or monthly reports, the Date indicates the date on which disk usage was measured.

The directories shown in the Historical Disk Usage by Directory report are the HDFS directories you have set as watched directories (see Designating Directories to Include in Disk Usage Reports).

Downloading Reports as CSV and XLS Files

Any report can be downloaded to your local system as an XLS file (Microsoft Excel 97-2003 worksheet) or CSV (comma-separated value) text file.

To download a report, do one of the following:

  • From the main page of the Report tab, click the CSV or XLS link in the column to the right of the report name.
  • From any report page, click the Download CSV or Download XLS button.

Either of these opens the Open file dialog box where you can open or save the file locally.
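Once a report is saved as CSV, it can be processed with standard tools. The following Python sketch parses a small CSV and derives the ratio of Raw Bytes to Bytes for each row. The column names (`User`, `Bytes`, `Raw Bytes`, `File and Directory Count`) and the sample values are assumptions for illustration; check the header row of an actual download and adjust the names accordingly.

```python
import csv
import io

# Hypothetical CSV content; a real export may use different column names.
SAMPLE_CSV = """User,Bytes,Raw Bytes,File and Directory Count
alice,1073741824,3221225472,120
bob,536870912,1073741824,45
"""

def load_usage(csv_text):
    """Parse report rows into dicts with integer metric values."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "user": row["User"],
            "bytes": int(row["Bytes"]),
            "raw_bytes": int(row["Raw Bytes"]),
            "count": int(row["File and Directory Count"]),
        })
    return rows

rows = load_usage(SAMPLE_CSV)
for r in rows:
    # Raw Bytes divided by Bytes approximates the average replication factor.
    ratio = r["raw_bytes"] / r["bytes"] if r["bytes"] else 0.0
    print(f'{r["user"]}: effective replication {ratio:.1f}')
```

To read a downloaded file instead of the inline sample, open it with `open(path, newline="")` and pass the file object to `csv.DictReader` directly.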