Hue Reference Architecture

This document provides a reference architecture for deploying Hue. It is a guide to assist with deployment and sizing options.

In practice, each Hue server can support approximately 25 concurrent users, depending on what tasks the users are performing. Most scaling issues result from users performing resource-intensive operations, not from the number of users. For example, a large download of query results can reduce the resources available to other users of the same Hue instance, who may experience slow performance until the download finishes. Another common cause of noticeable performance changes is slow RPC calls between Hue and another service. When this happens, queries may appear to "hang" after they are submitted.

As a guide, 2 Hue servers can support up to:

  • 100 unique users per week
  • 50 users per hour at peak times executing up to 100 queries

A typical setup is 2 Hue servers.
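As a rough illustration of the sizing guidance above, the following sketch estimates a server count from peak concurrency. It assumes the figure of approximately 25 concurrent users per server stated earlier and treats 2 servers as a floor. The function name and the `min_servers` parameter are hypothetical, and real capacity depends on what users are doing, so treat the result as a starting point only.

```python
import math

# Assumption from the text: roughly 25 concurrent users per Hue server.
USERS_PER_SERVER = 25


def estimate_hue_servers(peak_concurrent_users: int, min_servers: int = 2) -> int:
    """Return a rough Hue server count for a given peak concurrency.

    min_servers defaults to 2, mirroring the typical setup described above
    (a hypothetical default chosen for this sketch).
    """
    if peak_concurrent_users < 0:
        raise ValueError("peak_concurrent_users must be non-negative")
    needed = math.ceil(peak_concurrent_users / USERS_PER_SERVER)
    return max(min_servers, needed)


print(estimate_hue_servers(50))   # 2
print(estimate_hue_servers(120))  # 5
```

The estimate ignores workload mix. A few users running large result downloads can outweigh many users running small queries.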

General Guidelines

  • Deploy a load balancer in front of Hue.
  • Use a production-quality database. For more information, see Hue Custom Databases.
  • Ensure that other services, such as Impala, Hive, and Oozie, are healthy and not starved of resources. If these services hang, Hue performance suffers.
  • Consider moving workloads that are subject to SLAs (service-level agreements) or considered "noisy neighbors" to their own compute cluster. Noisy neighbors are workloads that use the majority of available resources and cause performance issues. For more information about separating compute and storage, see Virtual Private Clusters and Cloudera SDX.
  • Limit the number of rows that are returned for queries.

    One way to limit the number of rows returned is to specify a value for the download_row_limit configuration property of the Hue Beeswax application. Set this property in the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini in Cloudera Manager:
    1. In Cloudera Manager, click Hue > Configuration, and enter Hue Service Advanced Configuration Snippet in the search text box.
    2. In the text box for the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini, add the following configuration information:

      [beeswax]
      download_row_limit=number_of_rows
              
    3. Click Save Changes and click the restart icon at the top of the page to restart the Hue service.



  • Upgrade to CDH 5.15 or later, which includes Hue version 4.2. Hue 4.2 and later provide better query submission controls on the backend and the ability to visualize queued queries.
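Before pasting the download_row_limit snippet from the guidelines above into Cloudera Manager, it can help to check the text locally. The sketch below is one possible check using Python's standard configparser module. It assumes a flat [beeswax] section as shown earlier, and the value 100000 is a hypothetical placeholder, not a recommendation. The helper name is hypothetical, and the rule that the value must be a positive integer is a choice made for this sketch, not documented Hue behavior.

```python
import configparser


def read_download_row_limit(snippet: str) -> int:
    """Parse a safety-valve snippet and return download_row_limit as an int.

    Raises configparser.NoSectionError or NoOptionError if the [beeswax]
    section or the property is missing, and ValueError if the value is not
    a positive integer (a policy chosen for this sketch).
    """
    parser = configparser.ConfigParser()
    parser.read_string(snippet)
    value = parser.getint("beeswax", "download_row_limit")
    if value <= 0:
        raise ValueError("download_row_limit should be a positive integer")
    return value


# Hypothetical example value; choose a limit suited to your workload.
snippet = "[beeswax]\ndownload_row_limit=100000\n"
print(read_download_row_limit(snippet))  # 100000
```

This only validates the text of the snippet. It does not contact Cloudera Manager or Hue, and the service still needs a restart for the setting to take effect.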