Issues Fixed in Cloudera Director

Issues Fixed in Cloudera Director 2.5.0

AWS IAM permission for RDS required even when RDS not in use

When Cloudera Director validates an environment definition, it performs a call to AWS that requires the rds:DescribeDBSecurityGroups IAM permission. This is true whether or not RDS is to be used for any deployments or clusters in the environment.

Workaround: Include the rds:DescribeDBSecurityGroups permission in the IAM permissions for the user account defined in the environment; if no user is defined, then include the permission in the permission for the IAM role associated with the instance profile of the Director instance.
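As an illustration of the workaround above, the following sketch builds a minimal IAM policy document that grants the rds:DescribeDBSecurityGroups action. The helper name build_policy_document and the use of "*" as the resource are assumptions made for brevity, not something this note prescribes; merge the statement into whatever policy your user or role already uses.

```python
import json

# Action named in the release note above.
RDS_ACTION = "rds:DescribeDBSecurityGroups"


def build_policy_document(actions):
    """Return a minimal IAM policy document allowing the given actions.

    Resource "*" is an assumption made for brevity; narrow it as needed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": "*",
            }
        ],
    }


policy = build_policy_document({RDS_ACTION})
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console or attached with your usual tooling.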

Cloudera Director masks real reason for cluster bootstrap or update failure during host installation

When Cloudera Director tries to add a host to Cloudera Manager, and errors occur that trigger retry, Cloudera Director can produce an exception that includes the following message while retrying:
'JdbcSQLException: Value too long for column "CALL_STACK' 
This is an internal error that masks the root cause, which may be found earlier in the Cloudera Director log, or in the Cloudera Manager log.

Preemptive proxy authentication not working as expected

Preemptive proxy authentication does not work as intended and currently has no effect.

Workaround: Use the proxy without preemptive proxy authentication.

Restrictive umask prevents bootstrap

Cloudera Director inappropriately sets the file ownership when creating database scripts for use by Cloudera Manager. When using a more restrictive umask with Cloudera Director, this can prevent deployment bootstrap from succeeding properly.

Workaround: Use the default umask supplied with your filesystem, or use a more permissive umask.

First run fails for cluster with HA enabled for HDFS cloned through Cloudera Director

If you create a non-HA cluster through Cloudera Director with a Cloudera Manager version prior to 5.12, and then enable HA using the Cloudera Manager wizard, Cloudera Director cannot clone the cluster.

Workaround: In order to be able to clone such a cluster, update Cloudera Manager to 5.12 or later, and wait for Cloudera Director to refresh its internal cluster model, before enabling HA through the Cloudera Manager wizard.

Presence of external databases prevents deletion of environments

When you delete a database server, Cloudera Director checks whether it is an external database or a registered database. For an external database, the delete action is shown as "Terminate Database Server," but it should be shown as "Unregister Database Server." Once the external database is unregistered, the environment can be safely deleted.

Refreshing a deployment may overwrite the deployment template

User updates to the deployment template for a deployment may be inadvertently overwritten by refreshing the deployment if they are performed at the same time. This currently only impacts Cloudera Manager credentials.

Workaround: Re-update the Cloudera Manager credentials.

Cloudera Director may fail to repair master roles for certain services when repairing an instance

When repairing an instance with master roles, Cloudera Director may incorrectly attempt to automatically configure those roles instead of running the role migration. This may lead to update failure.

Use numberOfHealthCacheExecutorThreads when provided

Previous versions of Cloudera Director ignored numberOfHealthCacheExecutorThreads when specified in the server configuration file.

Use numberOfCacheExecutorThreads when provided

Previous versions of Cloudera Director ignored numberOfCacheExecutorThreads when specified in the server configuration file.

Terminating cluster creation at the wrong time can leak instances

EC2 instances can be leaked when a deployment or cluster is terminated while it is bootstrapping.

Workaround: Delete the leaked instances manually.

Unhealthy host(s) causes 'apply host template' failure when growing the Cluster

When growing an existing cluster, the update operation may fail to add the instances. If the server log shows "API call to Cloudera Manager failed. Method=HostTemplatesResource.applyHostTemplate", check the Cloudera Manager API debug logs. One possible reason for the failure is that the CDH parcel has not been activated by the time Cloudera Director attempts to apply the host template. This specific scenario is likely to happen if newly added instances show up as unhealthy in Cloudera Manager, which can cause parcel distribution errors.

Workaround: The best course of action is to determine why the newly added instances come up as unhealthy. This can sometimes be fixed by using a different AMI or instance type. If that doesn't work, increase Cloudera Director's lp.update.sleepTimeForAddInstanceSeconds server property (added in Cloudera Director 2.4.1) to give the host more time to come back as healthy, so that the parcel is distributed and activated before the API call to apply the host template.

Shrink operation fails to complete

Host decommissioning can hang in Cloudera Manager, causing a cluster shrink operation to fail to complete successfully.

Cluster termination fails during backup of Cloudera Manager configuration files

When a cluster is terminated while Cloudera Director is backing up Cloudera Manager configuration files, it is possible for Cloudera Director to hang attempting to clean up the associated pipelines.

Erroneous error message during cluster creation with Spark 2

When creating a cluster that includes the Spark 2 service with bootstrap-remote, the Cloudera Director client will display the following warning:
Found warnings in cluster configuration: Unknown role type: 
GATEWAY for service type: SPARK2_ON_YARN in instance group
The warning is a false positive, but it does not stop the cluster creation.

Failure to update IP addresses in Cloudera Manager for repaired instances

When repairing instances, Cloudera Director will try to correlate instances known to Cloudera Director with instances known to Cloudera Manager. The correlation is done via the instance's IP address. However if the instance is terminated, the IP address known to Cloudera Director will be a placeholder while Cloudera Manager keeps its original IP address, resulting in failure when attempting to establish the mapping.
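The failure mode described above can be sketched with a toy correlation step. All instance IDs, host IDs, and addresses below are hypothetical, and the placeholder value is an assumption chosen only for illustration; the point is that a lookup keyed on IP address finds no match once one side holds a placeholder.

```python
# Hypothetical data: ID -> IP address as known to each system.
PLACEHOLDER_IP = "192.0.2.1"  # assumed placeholder value, for illustration only

director_instances = {"i-aaa": "10.0.0.5", "i-bbb": PLACEHOLDER_IP}
manager_hosts = {"host-1": "10.0.0.5", "host-2": "10.0.0.6"}


def correlate_by_ip(director, manager):
    """Map Director instance IDs to Manager host IDs by IP; None if unmatched."""
    hosts_by_ip = {ip: host_id for host_id, ip in manager.items()}
    return {inst: hosts_by_ip.get(ip) for inst, ip in director.items()}


mapping = correlate_by_ip(director_instances, manager_hosts)
print(mapping)
```

In this sketch the terminated instance "i-bbb" maps to None, because the manager side still records the original address.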

Cloudera Director continues trying to update a cluster even if the cluster was terminated in the middle of the update

When updating a cluster, Cloudera Director first checks the cluster status. If the check passes, it computes the update steps and starts pipelines to apply them. However, the status check and the pipeline launch are not performed as a single transaction, so if a terminate cluster request is fulfilled between the two, the update still starts the pipeline.
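The general remedy for this kind of check-then-act race is to perform the status check and the launch under one lock. The sketch below is a toy model under that assumption, not Cloudera Director's implementation; the Cluster class, its status names, and its methods are all hypothetical.

```python
import threading


class Cluster:
    """Toy cluster model: a status flag guarded by a lock."""

    def __init__(self):
        self.status = "READY"
        self.lock = threading.Lock()
        self.pipelines = []

    def update_atomically(self, steps):
        """Check the status and start the pipeline under one lock."""
        with self.lock:
            if self.status != "READY":
                return False
            self.status = "UPDATING"
            self.pipelines.append(list(steps))
            return True

    def terminate(self):
        """Mark the cluster terminated; later updates fail the status check."""
        with self.lock:
            self.status = "TERMINATED"


cluster = Cluster()
cluster.terminate()
started = cluster.update_atomically(["add-instance"])
print(started)
```

Because both methods take the same lock, a terminate request cannot slip in between the check and the pipeline start.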

Update fails if 'Redeploy Client Configuration' is checked unnecessarily

Update fails when the Redeploy Client Configuration checkbox on the Modify Cluster page is checked and redeployment of client configurations is not needed.

Workaround: Do not check the Redeploy Client Configuration checkbox if redeployment of the client configuration is not needed.

Cluster bootstrap fails when Spot instance capacity is exceeded

If the number of Spot instances requested exceeds the user's Spot capacity, then the allocation of the Spot instances will fail and cause cluster bootstrap to fail.

Workaround: Ensure your Spot instance limit on your EC2 account is sufficient for the number of instances you request in the specified region.

Cloudera Director may leak instances in AWS

Cloudera Director may retry a failed instance allocation, resulting in two instances tagged with the same ID. Due to the tagging, Cloudera Director may terminate only one of the instances.
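One way to look for such leaks is to group an instance inventory by the Director-assigned tag and report any tag carried by more than one instance. The inventory below is hypothetical sample data, and find_duplicate_tags is an illustrative helper, not part of any Cloudera or AWS API.

```python
from collections import defaultdict

# Hypothetical inventory: (cloud instance ID, Director-assigned tag value).
instances = [
    ("i-001", "director-id-a"),
    ("i-002", "director-id-b"),
    ("i-003", "director-id-a"),  # second allocation after a retry
]


def find_duplicate_tags(inventory):
    """Return tag values carried by more than one instance, with their IDs."""
    by_tag = defaultdict(list)
    for instance_id, tag in inventory:
        by_tag[tag].append(instance_id)
    return {tag: ids for tag, ids in by_tag.items() if len(ids) > 1}


duplicates = find_duplicate_tags(instances)
print(duplicates)
```

Any tag reported here points at instances that should be reviewed and, if leaked, deleted manually.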

Cloudera Director client does not support unicode

HOCON substitution in Cloudera Director configurations is not supported.

Workaround: Write configurations without substitutions.

Block volume limit error reporting

If the EBS volume limit is reached when creating a cluster, the Cloudera Director log might not reflect this root cause, though it might mention creating the cluster failed because it cannot satisfy the minimum threshold limit for specific roles in the cluster.

Issues Fixed in Cloudera Director 2.4.1

Errors when using MySQL 5.7 for the Cloudera Director database

The defaults related to TIMESTAMP field handling changed drastically in MySQL 5.7.4 and later, as documented in SQL Mode Changes in MySQL 5.7 in the MySQL documentation. One of the tables created by Cloudera Director, SERVER_CONFIGS, has a definition that was valid in previous versions of MySQL but conflicts with these new defaults. This is further complicated by the fact that MySQL 5.7.x will allow upgrades from MySQL 5.6 with tables that violate these defaults.

The result is that any modifications attempted on the SERVER_CONFIGS table in MySQL 5.7.x will fail. Cloudera Director 2.4 introduced a change to this table, triggering this problem. Additionally, new installs have been observed failing on MySQL 5.7.x due to the SERVER_CONFIGS table violating the expected defaults.

This issue has been fixed in Cloudera Director 2.4.1 with database changes that:
  • Adjust the creation of the SERVER_CONFIGS table on new installations
  • Correct SERVER_CONFIGS for users upgrading to Cloudera Director 2.4
  • Correct SERVER_CONFIGS for users who have already upgraded to Cloudera Director 2.4
For fresh installations on MySQL 5.7.x, this may affect any version of Cloudera Director starting with version 2.0. For existing installations that are now running on MySQL 5.7.x, this may affect users attempting to upgrade to Cloudera Director 2.4 from Cloudera Director versions 2.0 to 2.3. Running on MySQL 5.5.x or 5.6.x will behave as expected without any database failures.
Workarounds:
  • Cloudera recommends contacting Cloudera Support in order to fix this issue. However, if that is not an option, the following steps can be used to address the issue.
  • For a fresh install of Cloudera Director, the simplest workaround is to disable strict mode on MySQL. For more information about strict mode and how to disable it, see SQL Server Modes in the MySQL documentation. Using Cloudera Director 2.4.1 will avoid this issue.
  • For existing installs, manually modify the MySQL database to avoid this issue:
    • Upgrading from versions lower than 2.0.0: In this case, Cloudera Director will fail when trying to create the SERVER_CONFIGS table. In the database housing the Cloudera Director tables, examine the core_schema_version table and remove the line with the script value V3_2.0.0_1__init_serverconfig.sql.
      delete from core_schema_version where script = 'V3_2.0.0_1__init_serverconfig.sql';
      You should see a response like the following:
      Query OK, 1 row affected (0.02 sec)
      After this, retry the upgrade using Cloudera Director 2.4.1, or disable strict mode.
    • Upgrading from versions 2.0.0 to 2.4.0: In this case, Cloudera Director will fail when trying to modify several tables to remove the VERSION column. You must complete the migration manually and fix the TIMESTAMP issue.
      ALTER TABLE SERVER_CONFIGS MODIFY UPDATED_AT TIMESTAMP NULL, MODIFY CREATED_AT TIMESTAMP NULL;
      
      ALTER TABLE AUTHORITIES DROP COLUMN VERSION;
      ALTER TABLE CLUSTERS DROP COLUMN VERSION;
      ALTER TABLE DEPLOYMENTS DROP COLUMN VERSION; 
      ALTER TABLE ENVIRONMENTS DROP COLUMN VERSION;
      ALTER TABLE EXTERNAL_DATABASE_SERVERS DROP COLUMN VERSION;
      ALTER TABLE INSTANCE_TEMPLATES DROP COLUMN VERSION;
      ALTER TABLE SERVER_CONFIGS DROP COLUMN VERSION;
      ALTER TABLE USERS DROP COLUMN VERSION;
      
      UPDATE core_schema_version set success = 1 where script = 'V3_2.4.0_1__remove_versions.sql';
      One or more of the ALTER TABLE statements may fail with an error that looks like the following:
      ERROR 1091 (42000): Can't DROP 'VERSION'; check that column/key exists
      This can be ignored because it was correctly deleted as part of the initial attempt to upgrade to Cloudera Director 2.4.

      After this, retry the migration. Cloudera recommends upgrading to Cloudera Director 2.4.1 as soon as possible, although these manual corrections should alleviate the issue.

Bootstrap fails because of empty parcel list

Cloudera Director fails in the middle of bootstrap with IllegalArgumentException: Parcel validation failed. This can happen when Cloudera Manager instances take longer than usual to refresh the list of parcels.

Unhealthy host causes "apply host template" to fail when growing the cluster

When growing an existing cluster the update operation may fail to add the instances. If the server log indicates "API call to Cloudera Manager failed. Method=HostTemplatesResource.applyHostTemplate," the user should enable Cloudera Manager API Debugging and check the server logs in Cloudera Manager to get more information on the failure. See Cloudera Manager API Call Fails in Troubleshooting Cloudera Director for information about checking Cloudera Manager logs.

One of the reasons for failure could be that the CDH parcel wasn't activated by the time Cloudera Director attempted to apply the host template. This specific scenario is likely to happen if newly added instances show up as unhealthy in Cloudera Manager.

Workaround: The best course of action is to determine why the newly added instances come up as unhealthy. This can sometimes be fixed by using a different AMI or instance type. If that doesn't work, increase Cloudera Director's lp.update.sleepTimeForAddInstanceSeconds server property to give the host more time to come back as healthy, so that the parcel is distributed and activated before the API call to apply the host template.

Azure VMs with manually attached Public IPs from different Resource Groups are marked as "not found"

An Azure VM with a manually attached Public IP from a different Resource Group will no longer be marked as "not found" and excluded from the list of active cluster nodes. As of 2.4.1, the VMs will not report Public IPs from different Resource Groups, but they will otherwise function as expected.

Workaround: Create the Public IP for manually attaching to the VM in the same Resource Group as the VM itself.

NullPointerException when Cloudera Director retrieves the private FQDN of a VM instance

In rare cases, the OS Profile metadata of an Azure VM can be empty. This can be confirmed by inspecting the VM metadata in Azure Resource Explorer (the "osProfile" JSON block will be missing from the VM properties block). The OS Profile contains information such as the VM's private FQDN. An empty OS Profile can be related to Azure VM agents not running correctly on the VM. Cloudera recommends contacting Microsoft Azure support to resolve an empty OS Profile on an Azure VM. As of Cloudera Director 2.4.1, a VM with a missing OS Profile no longer causes a NullPointerException.

Add D series instances to Azure instance type list

The following instance types are added to the Azure instance type list:

  • STANDARD_D15_v2
  • STANDARD_D14_v2
  • STANDARD_D13_v2
  • STANDARD_D12_v2

Expand Error Log for Unsupported Azure VMs

When deploying an unsupported Azure VM type the error message now contains actionable information for how to get and use the latest supported VM types.

Shorten Azure VMs Instance ID field in Cloudera Director UI

Azure VMs in Cloudera Director reported their instance IDs as a full Resource ID with Subscription ID and Resource Group name included. As of 2.4.1 the instance ID field is shortened to just the VM name.

Use static private IP address assignment option instead of dynamic

To guarantee the private IP address does not change after the VM is deallocated and restarted, the private IP allocation method must be Static. As of 2.4.1 the default private IP address allocation method is changed to Static.

Workaround: Manually change the private IP address assignment option to "Static" for each VM in the cluster via Azure portal.

Issues Fixed in Cloudera Director 2.4.0

Root password for external database server emitted in log

Cloudera Director logs the command line it runs to create new databases for Cloudera Manager and for cluster services. As of version 2.3, the password for the database being created was redacted in these log messages, but the password for the root account of the database server was not. This is fixed in 2.4, and the root password is now also redacted.

Cloudera Director may show the status of a cluster as TERMINATE_FAILED even when it has successfully terminated

If a cluster is terminated while in the process of bootstrapping, it is possible for the cluster to show TERMINATE_FAILED even though it has successfully terminated.

Cloudera Director does not sync with changes made in Cloudera Manager

Modifying a cluster in Cloudera Manager after it is bootstrapped does not cause the cluster state to be synchronized with Cloudera Director. Services that have been added or removed in Cloudera Manager do not show up in Cloudera Director when growing the cluster. For more information on keeping Cloudera Director and Cloudera Manager in sync, see CDH Cluster Management Tasks.

Old pipeline records not evicted from the Cloudera Director database

Cloudera Director records data about its internal workflow pipelines in its own database. Persisting this information allows Cloudera Director to track pipeline progress across restarts and to resume pipelines that were running or suspended. Pipeline data for old pipelines, such as those that have completed or failed, is automatically evicted from this database. However, under some circumstances, old pipeline data would fail to be evicted, resulting in logged errors. One cause is a Cloudera Director restart, which destroys in-memory pipeline data that was erroneously expected to remain. Cloudera Director 2.4 is more robust and eliminates this cause of pipeline eviction failure.

Workaround: In Cloudera Director 2.3 and below, the inability to evict old pipeline data does not harm Cloudera Director functioning in the short term, but over time the database could grow unacceptably large. Contact Cloudera Support for assistance deleting pipelines that cannot be evicted normally. To prevent build-up of old pipeline data, do not stop Cloudera Director until a round of database eviction is complete.

Delete deployment may orphan underlying clusters

When deleting deployments, it is possible that Cloudera Director deletes a deployment successfully, but leaves the cluster in an undeleted state. Retrying deployment deletion will not help, and the clusters will be orphaned. This is fixed in 2.4 such that a deployment deletion will also check for any orphaned clusters, even if the deployment itself is deleted.

Workaround: In Cloudera Director 2.3 and below, individually delete orphaned clusters if their IDs are known.

Bootstrap fails with non-default password-protected parcel repository

Bootstrap fails when using a password-protected CDH parcel repository with Cloudera Director 2.3 and below. This has been corrected in Cloudera Director 2.4.

Cloudera Director bootstrap hangs if EC2 spot instances terminate immediately after fulfillment

With Cloudera Director 2.3 and below, bootstrap can hang if spot instances terminate immediately after fulfillment, making it necessary to cancel the cluster bootstrap, terminate the cluster, and try again. This has been corrected in Cloudera Director 2.4 such that bootstrap fails immediately.

NullPointerException thrown when creating an invalid environment on Azure

In Cloudera Director 2.3.0 and below, a NullPointerException is thrown when invalid Microsoft Azure environment information (Subscription ID, Tenant ID, Client ID or Client Secret) is used in creating a new Azure Environment. For Cloudera Director 2.4.0 and higher, an error message is shown indicating that invalid Azure environment information was used to create the new Azure environment.

Terminated host not properly cleaned up during shrink or repair

When shrinking or repairing an instance that has been terminated outside of Cloudera Director, Cloudera Director may fail to decommission and delete the host from Cloudera Manager.

Terminated EC2 instances report 127.0.0.1 as private IP

AWS instances that were terminated outside of Cloudera Director may have reported an IP address of 127.0.0.1. This has been changed in Cloudera Director 2.4 so that the IP address 192.0.2.1 is reported (an IP address reserved for documentation).
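The difference between the two values can be checked with Python's standard ipaddress module. This is only a sketch of the address properties: 127.0.0.1 is a loopback address, while 192.0.2.1 falls in 192.0.2.0/24, the TEST-NET-1 block that RFC 5737 reserves for documentation.

```python
import ipaddress

# TEST-NET-1, reserved for documentation by RFC 5737.
TEST_NET_1 = ipaddress.ip_network("192.0.2.0/24")

old_value = ipaddress.ip_address("127.0.0.1")
new_value = ipaddress.ip_address("192.0.2.1")

print(old_value.is_loopback)    # loopback: would resolve to the local host
print(new_value.is_loopback)    # not loopback
print(new_value in TEST_NET_1)  # inside the documentation-only range
```

A loopback value could be mistaken for a reachable local service, whereas an address from the documentation range cannot belong to a real host.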

Cloudera Director client infinitely tries to create services if you specify duplicate services

If duplicate services are specified for a cluster (for example, two Hive services or two Impala services), Cloudera Director retries service creation indefinitely during cluster creation.

Workaround: Cancel the cluster bootstrap, terminate the cluster, and recreate without duplicate services.

Cloudera Director may not apply custom configuration to all instances

Cloudera Director requests that Cloudera Manager perform automatic configuration for a cluster prior to applying any custom configurations. Automatic configuration may sometimes create multiple groups of instances within Cloudera Manager for a single corresponding group requested by Cloudera Director. When this occurs, custom configurations for the instances will only be applied to instances in one of the Cloudera Manager groups.

Creation of a cluster where instance groups have no roles is not possible using the web UI

Cloudera Director's web UI does not allow creation of clusters with instance groups that should not have CDH roles deployed on them.

Modification of a cluster where instance groups have no roles is not possible using the web UI

Cloudera Director's web UI does not allow modification of clusters with instance groups that should not have CDH roles deployed on them, even if they were created using the API.

Cluster launch fails using the development version of Cloudera Manager 5.10 and CDH 5.10 with Kudu

Cloudera Director 2.3 does not support deployment and management of Kudu.

If a cluster is terminated while it is bootstrapping, the cluster must be terminated again to complete the termination process

Terminating a cluster that is bootstrapping stops ongoing processes but keeps the cluster in the bootstrapping phase.

Severity: Low

Issues Fixed in Cloudera Director 2.3.0

Deployment bootstrap process may fail to complete

The process of bootstrapping a deployment can hang indefinitely waiting for Cloudera Manager to start, even after Cloudera Manager is up and reachable.

Cloudera Director does not install the JDBC driver for an existing MySQL database

Cloudera Director automatically installs JDBC drivers on an instance for Cloudera Manager and the CDH clusters it provisions. However, when you use an existing MySQL database with Cloudera Manager, Cloudera Director does not install the JDBC driver, which can result in database connection failures.

External databases are not configured for Hue and Oozie

External databases are not configured for Hue and Oozie in clusters created through the Cloudera Director web UI.

Normalization process does not set swappiness correctly on RHEL 7.2

On CentOS/RHEL 7 operating systems, the tuned service overwrites the swappiness settings that Cloudera Director configures on instances.

Stale service configs

Cloudera Director sometimes fails to detect stale services properly when restarting a cluster.

The nscd tool is installed but not enabled during normalization

nscd, a tool which caches common name service requests, is installed on Cloudera Director-managed instances, but is not enabled on CentOS and RHEL. This can reduce the performance of the bootstrapping process.

Cluster update or termination during instance metadata refresh fails to complete

If a deployment or cluster is terminated or updated at the same time that a refresh of instance metadata is running, on rare occasions the refresh will prevent the terminate or update operation from completing properly.

Director detects SRIOV incorrectly

For AWS instances, Cloudera Director will always report Enhanced Networking (SR-IOV) as false (for example on the instance properties page), even when it's enabled. This is fixed in Cloudera Director 2.3 and requires IAM permissions for the EC2 method DescribeInstanceAttribute.

After Cloudera Manager bootstrap failure, termination leads to renewed bootstrap attempt

In Cloudera Director 2.2, if you attempt to terminate a cluster or deployment in the BOOTSTRAP_FAILED stage, it may go back into the BOOTSTRAPPING stage and return the following exception message: java.util.concurrent.TimeoutException: Pipeline did not complete in 10 SECONDS. In this situation, terminating the deployment or cluster a second time should terminate the cluster or deployment as expected. This can also happen in Cloudera Director 2.1, but the exception message will be the following more generic message: 500 internal server error.

Warning when adding Hue Load Balancer role

When you bootstrap or validate a cluster that has the HUE_LOAD_BALANCER role, Cloudera Director generates an unknown role type warning for the role.

Bootstrap failure with Kafka and Sentry on Cloudera Manager 5.9

Cluster bootstrap fails when using Cloudera Manager 5.9 with both Kafka 2.0 and Sentry.

If Kafka and Sentry are required on the same cluster, use one of the following combinations:
  • Kafka 2.1 with Cloudera Manager 5.9 or 5.10
  • Kafka 2.0 with Cloudera Manager 5.8 or lower

Lack of support for newer AWS regions

When selecting certain AWS regions, such as ap-northeast-2, an error message can appear stating Unable to find the region ap-northeast-2. In this case, manually set the KMS region endpoint (under Advanced Options) to the KMS region endpoint specified in the AWS Regions and Endpoints in the AWS documentation.

Cloudera Manager repository URL validation failure

The validation of the Cloudera Manager repository can fail during the bootstrap process if the URL uses a host like localhost, a single-word hostname, or one with an internal or non-standard domain name. Use an IP address for the host, or use a hostname with a common domain like .com.

Cloudera Director configures Hue to use SQLite

CDH 5.8 and higher installs Postgres drivers along with Hue. When configuring a cluster to use Cloudera Manager's embedded Postgres database, Director will configure Hue to use its own embedded SQLite database rather than Cloudera Manager's embedded Postgres database.

MySQL database creation fails with insufficiently strong password

When using MySQL 5.7 as an external database server for a Cloudera Director deployment or cluster, database creation may fail with an error: "Your password does not satisfy the current policy requirements." This is due to Cloudera Director generating random UUIDs for passwords, which do not satisfy the MEDIUM level of password validation in MySQL 5.7. Disable password validation in MySQL, or adjust the validation level to LOW.
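To see why a random UUID fails, the sketch below approximates the MEDIUM check with MySQL's default counts (length of at least eight, plus at least one digit, one lowercase letter, one uppercase letter, and one special character). The function satisfies_medium_policy is a simplified stand-in written for this note, not MySQL's own validation code.

```python
import string
import uuid


def satisfies_medium_policy(password, min_length=8):
    """Approximate MySQL's MEDIUM check: length, digit, both cases, special."""
    return (
        len(password) >= min_length
        and any(c.isdigit() for c in password)
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c in string.punctuation for c in password)
    )


generated = str(uuid.uuid4())  # lowercase hex digits and hyphens only
print(generated, satisfies_medium_policy(generated))
print(satisfies_medium_policy("Abcdef1!"))
```

The string form of a UUID never contains an uppercase letter, so it cannot pass the mixed-case requirement regardless of its length.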

RDS instance creation fails with password length violation

AWS RDS requires a master user password of at least eight characters. If a password is supplied that is too short, Cloudera Director fails to validate it, leading to a failure from RDS. Ensure that the master user password is at least eight characters long.

Cloudera Manager server logs in Diagnostic data may be empty

Cloudera Director automatically attempts to collect diagnostic data after cluster bootstrap failure. If cluster bootstrap failed before or just after the cluster is created in Cloudera Manager, then the scm-server-logs inside the diagnostic data may be empty. In this case, trigger diagnostic data collection on the deployment.

High Azure Standard Storage Disk Usage

Azure Standard Disks are billed for used space + transactions (see Azure Storage Standard Disk Pricing). In Cloudera Director 2.2, Standard Storage Virtual Hard Disks (VHDs) are mounted without the "discard" option. As a result, if a file is deleted on the VHD it does not release this space back to Azure Standard Storage and it will continue to be billed as used space. Note: this issue does not cause disk space leakage; space occupied by deleted files can still be used by new files.

To address this problem, edit the prepare_unmounted_volumes file to add the discard mount option. prepare_unmounted_volumes is located at /var/lib/cloudera-director-plugins/azure-provider-1.1.0/etc/.

Change line 78 from:

echo "UUID=${blockid} $mount $FS defaults,noatime 0 0" >> /etc/fstab

to

echo "UUID=${blockid} $mount $FS defaults,noatime,discard 0 0" >> /etc/fstab

Restart the Cloudera Director server service after making this change.

Java Clients return null for a 404 ("Not Found") response

The Java client currently can return null values for both 204 and 404 response codes from the collectDiagnosticData service endpoint. Therefore, it is difficult to tell if a collection call fails because a deployment or cluster is missing. In this case, poll for the status for a finite amount of time. If the poll times out, consider the collection attempt failed.
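The bounded polling suggested above can be sketched as follows. The sketch is written in Python rather than against the Java client, and the status values and the get_status callable are hypothetical stand-ins for whatever status call your client exposes.

```python
import time


def poll_until(get_status, is_done, timeout_seconds, interval_seconds=1.0):
    """Call get_status until is_done(status) is true or the timeout passes.

    Returns the final status, or None if the deadline is reached first.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if is_done(status):
            return status
        if time.monotonic() + interval_seconds > deadline:
            return None
        time.sleep(interval_seconds)


# Hypothetical status source standing in for a real status call.
responses = iter(["COLLECTING", "COLLECTING", "READY"])
result = poll_until(lambda: next(responses), lambda s: s == "READY",
                    timeout_seconds=5, interval_seconds=0.01)
print(result)
```

A None result means the poll timed out, which the caller should treat as a failed collection attempt.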

Incorrect choice of response code for cluster update failure

An API request to update a cluster fails if the cluster is in transition, for example, if it is already being updated. The response code for the failure, however, is 204, which indicates success.

Environments may not be able to be deleted temporarily

Even when an environment is empty, that is, all of its deployments and external databases have been deleted, it can take five to ten minutes before it is possible to delete the environment. This is due to remaining data structures that have not yet been automatically cleaned up.

SELinux remains enabled on instances allocated by Director

Depending on the operating system, Cloudera Director may misread the state of SELinux on instances it allocates and determine that it is disabled, when it is actually still enabled. This can lead to errors running Cloudera Manager or cluster services.

Security group validation should be configurable

This change provides a new capability for end users to enforce network requirements. It allows users to define the network rules configuration and validates AWS security groups against the pre-defined rules. When writing rules, users can not only define allowed networking traffic, but also deny traffic against specific ports from a list of IP ranges.

Time daemons do not run properly on RHEL and CentOS 7.x instances

The choice of standard time daemon for RHEL and CentOS 7.x releases has changed from ntpd to chronyd. However, Cloudera Director does not perform the correct commands when normalizing instances to properly set up chronyd. Instances may end up with ntpd running, or no time daemon running at all. To avoid this, rely on ntpd for time synchronization, and use an instance bootstrap script to disable chronyd and enable ntpd. For more information, see Configuring NTP Using NTPD in the Red Hat Linux 7 System Administrator's Guide.

Issues Fixed in Cloudera Director 2.2.0

Storage Encryption for AWS RDS Instances

Before Cloudera Director 2.2, storage encryption for AWS RDS instances was not supported, despite the presence of a KMS key ID field in the web UI form for describing RDS instances. The web UI field was ignored. In Cloudera Director 2.2, storage encryption is supported, using the default key ID associated with RDS for the AWS account. Use of a non-default KMS key is not supported, and the KMS key ID field has been removed from the web UI. See Defining External Database Servers for information on enabling storage encryption for a new RDS instance.

Cannot update environment credentials of environments deployed on Microsoft Azure

With Cloudera Director on Microsoft Azure, the Update Environment Credentials web UI displays only some properties, and does not display all the properties required for the update.

Azure operation timeout

Some Azure operations, such as VM creation and deletion, can take longer to complete than the default timeout value of 20 minutes. When this occurs, the Cloudera Director Azure plugin times out the Azure operation, and the operation fails. Adjusting the Cloudera Director server timeout does not help.

Workaround: Wait until the Azure operation time drops back to the normal range (less than 20 minutes).

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, the user can change the timeout value for Azure if the default value of 20 minutes is not long enough.

Deployment fails on Azure due to incompatible instance type existing in an Availability Set

VM creation fails if a VM of one series (for example, DS13) is deployed into an Azure Availability Set that already contains one or more VMs from a different series (for example, DS13_V2). This is an Azure platform restriction.

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, an error is reported when an instance template is created that will cause a VM to be deployed into an incompatible Availability Set.

Add check to make sure resources are in the same region

VM creation fails when resources from one region (for example, a VNET in EastUS) are used to deploy a VM in another region (for example, WestUS). This configuration is invalid, but the mismatch may not be obvious when configuring an instance template.

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, an error will be shown if the user tries to configure an instance template with resources from a different region than what is defined at the environment level.
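A region consistency check of the kind described above can be sketched in a few lines of Python. This is not the Azure plugin's implementation. The function names, the dictionary of resource regions, and the normalization (ignoring case and spaces) are assumptions made for illustration.

```python
def normalize_region(name):
    """Compare region names without regard to case or spaces."""
    return "".join(name.split()).lower()


def mismatched_resources(environment_region, resource_regions):
    """Return the names of resources whose region differs from the environment's."""
    expected = normalize_region(environment_region)
    return sorted(
        name
        for name, region in resource_regions.items()
        if normalize_region(region) != expected
    )
```

For an environment in WestUS, a template that references a VNET in EastUS would be reported, while a resource listed as "West US" would be accepted.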

Some valid host FQDN suffixes are not allowed in the Azure instance template

The regex check for the host FQDN suffix (DNS domain on the private cluster network) rejects valid suffixes that contain a component with fewer than three characters. For example, company.us is not allowed.

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, the check for host FQDN has been relaxed to allow names like company.us or company.1.us.
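The source does not give the regular expressions involved, so the following Python sketch uses hypothetical patterns to show how a minimum label length causes this kind of rejection. STRICT_LABEL requires at least three characters per DNS label, and RELAXED_LABEL accepts one to 63 characters. Neither is taken from the Azure plugin.

```python
import re

# Hypothetical patterns, not the ones used by the Azure plugin.
STRICT_LABEL = r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]"          # 3 to 63 characters
RELAXED_LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"    # 1 to 63 characters


def suffix_pattern(label):
    """Build a pattern for two or more dot-separated labels."""
    return re.compile(rf"(?:{label}\.)+{label}", re.IGNORECASE)


def is_valid_suffix(suffix, pattern):
    """True if the whole suffix matches and fits the 253-character DNS limit."""
    return len(suffix) <= 253 and pattern.fullmatch(suffix) is not None


strict = suffix_pattern(STRICT_LABEL)
relaxed = suffix_pattern(RELAXED_LABEL)
```

With these patterns, is_valid_suffix("company.us", strict) is False because the label "us" has only two characters, while the relaxed pattern accepts both company.us and company.1.us.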

Merge user-provided image configuration files with internal ones

Updating a Cloudera Director Azure plugin configuration file (images.conf) requires replacing the entire configuration file, even if only part of the configuration file needs to be updated.

Affected Versions: Cloudera Director 2.1.0, 2.1.1. Beginning in Cloudera Director 2.2.0, the user can provide partial Azure plugin configuration files containing only the portions to be updated.
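Merging a partial configuration over built-in defaults is commonly done with a recursive dictionary merge. The Python sketch below shows that general technique only. It is not the plugin's code, and the keys used in the example are hypothetical rather than real images.conf entries.

```python
def merge_config(defaults, overrides):
    """Return defaults with overrides applied, merging nested sections."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides have a section here: merge it key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and new keys simply replace or extend the defaults.
            merged[key] = value
    return merged


# Hypothetical keys, for illustration only.
internal = {"image-a": {"publisher": "p1", "sku": "s1"}, "image-b": {"publisher": "p2"}}
user = {"image-a": {"sku": "s2"}}
merged = merge_config(internal, user)
```

Here the user-provided fragment changes only one value of image-a; every other entry keeps its internal default, and the inputs themselves are not modified.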

Issues Fixed in Cloudera Director 2.1.1

Cloudera Director cannot connect to restarted VMs on Azure

Restarted VMs on Microsoft Azure are sometimes assigned a new IP address. This causes the cached IP address in Cloudera Director to become stale, so that Cloudera Director is unable to connect to the VMs.

Affected Version: Cloudera Director 2.1.0.

Public IP attached to a VM on Azure is deleted when the VM is deleted

Any public IP attached to a VM is deleted when the VM is deleted, even if that public IP was not created by the plugin.

Affected Version: Cloudera Director 2.1.0.

Cloudera Director web UI handles errors incorrectly with failed instance template validation on Azure

When the Microsoft Azure subscription permissions are not properly set up, an unexpected error can occur, causing instance template validation to exit. This error is not properly displayed in the Cloudera Director web UI.

Affected Version: Cloudera Director 2.1.0.

Resource name cannot contain special characters

A deployment may fail if the compute resource group used for Azure deployment contains special characters such as an underscore (_). Resource group names are sometimes used in the construction of resource names, causing deployments to fail if the resource group names contain special characters, because the naming restrictions are different for resource group names and resource names.

Affected Version: Cloudera Director 2.1.0.

Bootstrapping of clusters may fail if configured to not associate public IP addresses with EC2 instances

When using AWS, if the user deselects the Associate public IP addresses checkbox, instructing Cloudera Director to not assign public IP addresses to the EC2 instances it creates, Cloudera Director incorrectly interprets the missing public IP address of each instance as localhost (the Cloudera Director instance itself). Under certain conditions, this can lead to a variety of errors, including bootstrap failures and corruption of the Cloudera Director instance.

Affected Version: Cloudera Director 2.1.0.

Database server password fails if it contains special characters

Cloudera Director server does not handle special characters properly in database server admin/root passwords.

Update Cloudera Manager Credentials fails in certain scenarios

Cloudera Director erroneously rejects the credentials update as an unsupported modification if sensitive fields are configured on the deployment. The sensitive fields include license, billingId, and krbAdminPassword.

Cloudera Director server fails to start after upgrade under some circumstances

During an upgrade, Cloudera Director expects the Cloudera Manager instances it has deployed to match the instance template that was used to bootstrap them. If an instance was modified outside of Cloudera Director, the server fails to start. For example, a mismatch occurs if the instance type of the Cloudera Manager instance was changed from within the cloud provider console.

Cluster bootstrap fails with high task parallelism

For high values of lp.bootstrap.parallelBatchSize, Cloudera Director fails to bootstrap clusters and throws an exception indicating that it failed to write intermediate state to the database. The lp.bootstrap.parallelBatchSize property controls how many operations Cloudera Director performs in parallel while configuring a cluster. Its default value is 20.
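To make the role of a parallel batch size concrete, the following Python sketch runs a list of tasks in batches of a fixed size, with the tasks in each batch running concurrently. It illustrates the general idea only. It does not model how Cloudera Director schedules work or persists state, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_in_batches(tasks, batch_size=20):
    """Run zero-argument callables batch by batch and collect results in order."""
    results = []
    for batch in batches(tasks, batch_size):
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            results.extend(pool.map(lambda task: task(), batch))
    return results
```

With 45 tasks and a batch size of 20, the tasks run as batches of 20, 20, and 5. A larger batch size means more operations in flight at once.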

Modifying a cluster can leave some roles marked as stale in Cloudera Manager

When growing or shrinking a cluster, you are presented with the option of restarting the cluster. The restart operation should only restart roles that are marked stale by Cloudera Manager, that is, only roles that need to be restarted. This optimization serves to minimize cluster downtime. However, with Cloudera Director 2.1.x, some stale roles might not be restarted, even though the Restart Cluster option is selected.

Default memory autoconfiguration for monitoring services may be suboptimal

Depending on the size of your cluster and your instance types, you may need to manually increase the memory limits for the Host Monitor and Service Monitor. Cloudera Manager displays a configuration validation warning or error if the memory limits are insufficient.

Issues Fixed in Cloudera Director 2.1.0

Validation error after initial setup with high availability

When you set up HDFS high availability using Cloudera Director, the secondary NameNode is not configured, because it is not required for high availability. Because of a Cloudera Manager bug, the absence of a secondary NameNode causes an erroneous validation error to appear in Cloudera Manager in HDFS > Configuration > HDFS Checkpoint Directories.

Repository or parcel URLs with internal domain names fail validation

Repository or parcel URLs fail validation in Cloudera Director when they are specified with internal domain names.

Database-related error when running Cloudera Director CLI after upgrade

When run after upgrade, the Cloudera Director CLI performs steps to upgrade its local database from the previous version. It can report an error:
Referential integrity management for DEFAULT not implemented.

Cloudera Director Does Not Recognize Cloudera Manager Password Changes

Cloudera Director does not recognize changes in the admin password in Cloudera Manager unless the username associated with the new password is also changed.

Incorrect yum repo definitions for Google Compute Engine RHEL images

The default RHEL 6 image defined in director-google-plugin version 1.0.1 and lower has an incorrect yum repo definition. This causes yum commands to fail after yum caches are cleared. See the Google Compute Engine issue tracker for issue details.

Long version string required for Kafka

Kafka requires a nonintuitive version string to be specified in the configuration file or web UI.

Issues Fixed in Cloudera Director 2.0.0

Cloning and growing a Kerberos-enabled cluster fails

Cloning of a cluster that uses Kerberos authentication fails, whether it is cloned manually or by using the kerberize-cluster.py script. Growing a cluster that uses Kerberos authentication fails.

Kafka with Cloudera Manager 5.4 and lower causes failure

Kafka installed with Cloudera Manager 5.4 and lower causes the Cloudera Manager installation wizard, and therefore the bootstrap process, to fail, unless you override the configuration setting broker_max_heap_size.

Cloudera Director does not set up external databases for Oozie and Hue

Cloudera Director cannot set up external databases for Oozie and Hue.

Issues Fixed in Cloudera Director 1.5.2

Apache Commons Collections deserialization vulnerability

Cloudera has learned of a potential security vulnerability in a third-party library called the Apache Commons Collections. This library is used in products distributed and supported by Cloudera (“Cloudera Products”), including Cloudera Director. At this time, no specific attack vector for this vulnerability has been identified as present in Cloudera Products.

The Apache Commons Collections potential security vulnerability is titled “Arbitrary remote code execution with InvokerTransformer” and is tracked by COLLECTIONS-580. MITRE has not issued a CVE, but related CVE-2015-4852 has been filed for the vulnerability. CERT has issued Vulnerability Note #576313 for this issue.

Releases affected: Cloudera Director 1.5.1 and lower, CDH 5.5.0, CDH 5.4.8 and lower, Cloudera Manager 5.5.0, Cloudera Manager 5.4.8 and lower, Cloudera Navigator 2.4.0, and Cloudera Navigator 2.3.8 and lower

Users affected: All

Severity (Low/Medium/High): High

Impact: This potential vulnerability may enable an attacker to run arbitrary code from a remote machine without requiring authentication.

Immediate action required: Upgrade to Cloudera Director 1.5.2, Cloudera Manager 5.5.1, and CDH 5.5.1.

Serialization for complex nested types in Python API client

Serialization for complex nested types has been fixed in the Python API client.

Issues Fixed in Cloudera Director 1.5.1

Support for configuration keys containing special characters

Configuration file parsing has been updated to correctly support quoted configuration keys containing special characters such as colons and periods. This enables the usage of special characters in service and role type configurations, and in instance tag keys.
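The effect of quoting a key can be illustrated with a small path splitter. The Python sketch below assumes a syntax in which periods separate nested keys and double quotes protect special characters inside a single key. It is not Cloudera Director's parser, it does not handle escaped quotes, and the key names in the example are hypothetical.

```python
def split_key_path(path):
    """Split a dotted key path, treating double-quoted segments as one key."""
    parts, current, in_quotes = [], [], False
    for ch in path:
        if ch == '"':
            in_quotes = not in_quotes      # quotes delimit a key; they are not part of it
        elif ch == "." and not in_quotes:
            parts.append("".join(current)) # an unquoted period ends the current key
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated quote in key path")
    parts.append("".join(current))
    return parts
```

Under these assumptions, tags."owner:team.name" yields the two keys tags and owner:team.name, whereas the unquoted path a.b.c yields three keys.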

Issues Fixed in Cloudera Director 1.5.0

Growing clusters may fail when using a repository URL that only specifies major and minor versions

When using a Cloudera Manager package repository or CDH/parcel repository URL that specifies only the major or minor versions, Cloudera Director may incorrectly use the latest available version when trying to grow a cluster. The following URLs, by contrast, specify the full version, including the maintenance release:

For Cloudera Manager: http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.3.3/

For CDH: http://archive.cloudera.com/cdh5/parcels/5.3.3/
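Whether a repository URL specifies a full version can be checked by inspecting its last path segment. The Python sketch below is an illustration with hypothetical function names. It assumes that the version is always the final path segment, as in the two URLs above.

```python
import re
from urllib.parse import urlparse

VERSION_SEGMENT = re.compile(r"\d+(?:\.\d+)*")


def repo_version(url):
    """Return the version in the last path segment as a tuple of ints, or None."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if segments and VERSION_SEGMENT.fullmatch(segments[-1]):
        return tuple(int(part) for part in segments[-1].split("."))
    return None


def specifies_full_version(url):
    """True if the URL names major, minor, and maintenance versions."""
    version = repo_version(url)
    return version is not None and len(version) >= 3
```

Both URLs above pass this check. A URL that ends in a shorter version such as 5.3 would not, and a URL of that kind is the case in which the wrong version could be selected when growing a cluster.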

Flume does not start automatically after first run

Although you can deploy Flume through Cloudera Director, you must start it manually using Cloudera Manager after Cloudera Director bootstraps the cluster.

Impala daemons attempt to connect over IPv6

Impala daemons attempt to connect over IPv6.

DNS queries occasionally time out with AWS VPN

DNS queries occasionally time out with AWS VPN.

Issues Fixed in Cloudera Director 1.1.3

Ensure accurate time on startup

Instance normalization has been improved to ensure that time is synchronized by Network Time Protocol (NTP) before bootstrapping, which improves cluster reliability and consistency.

Speed up ephemeral drive preparation

Instance drive preparation during the bootstrapping process was slow, especially for instances with many large ephemeral drives. Time required for this process has been reduced.

Fix typographical error in the virtualizationmappings.properties file

The d2 instance type d2.4xlarge was incorrectly entered into Cloudera Director as d3.4xlarge in virtualizationmappings.properties. This has been corrected.

Avoid upgrading preinstalled Cloudera Manager packages

Cloudera Director no longer upgrades preinstalled Cloudera Manager packages.

Issues Fixed in Cloudera Director 1.1.2

Parcel validation fails when using HTTP proxy

Parcel validation now works when configuring an HTTP proxy for Cloudera Director server, allowing correctly configured parcel repository URLs to be used as expected.

Unable to grow a cluster after upgrading Cloudera Director 1.0 to 1.1.0 or 1.1.1

Cloudera Director now sets up parcel repository URLs correctly when a cluster is modified.

Add support for d2 and c4 AWS instance types

Cloudera Director now includes support for new AWS instance types d2 and c4. Cloudera Director can be configured to use additional instance types at any point as they become available in AWS.

Issues Fixed in Cloudera Director 1.1.1

Service-level custom configurations are ignored

Due to internal refactoring, it was no longer possible to override service-level configurations. The ability to set service-level custom configurations has been restored.

The property customBannerText is ignored and not handled as a deprecated property

Restored the customBannerText configuration file property, which was removed during the internal refactoring work.

Fixed progress bar issues when a job fails

The web UI showed a progress bar even when a job had failed.

Updated IAM Help text on Add Environment page

The help text for Role-based keys on the Add Environment page referred to AMI instead of AWS Identity and Access Management (IAM). The help text now refers to IAM.

Add eu-central-1 to the region dropdown

The eu-central-1 region has been added to the region dropdown on the Add Environment page.

Gateway roles should assign YARN, HDFS, and Spark gateway roles

All available gateway roles, including YARN, HDFS, and Spark, should be deployed by default on the instance.

Spark on YARN should be shown on the Modify Cluster page

Spark on YARN did not appear in the list of services on the Modify Cluster page.