This is the documentation for Cloudera Manager 4.8.4.
Documentation for other versions is available at Cloudera Documentation.

Troubleshooting Installation and Upgrade Problems

For information on known issues, see Known Issues and Work Arounds in Cloudera Manager 4. The table below lists common symptoms, the likely problem behind each, and what to do about it.

Symptom

Problem

What to Do

Symptom: "Failed to start server" reported by cloudera-manager-installer.bin. /var/log/cloudera-scm-server/cloudera-scm-server.log contains a message beginning Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver...
Problem: You may have SELinux enabled.
What to Do: You can turn off SELinux enforcement until the next reboot by running
 sudo setenforce 0
on the Cloudera Manager Server host. To disable it permanently, edit /etc/selinux/config.
Symptom: Installation interrupted and installer won't restart.
Problem: You need to do some manual cleanup.
What to Do: See Uninstalling Cloudera Manager.
Symptom: Cloudera Manager Server fails to start. The Server is configured to use a MySQL database to store information about service configuration.
Problem: Tables may be configured with the MyISAM engine.
What to Do: Make sure that the InnoDB engine is configured, not the MyISAM engine. To check which engine your tables are using, run the following command from the MySQL shell:
mysql> show table status;
Symptom: Agents fail to connect to the Server. Error 113 ('No route to host') appears in cloudera-scm-agent.log.
Problem: You may have SELinux or iptables enabled.
What to Do: Check /var/log/cloudera-scm-server/cloudera-scm-server.log on the Server system and /var/log/cloudera-scm-agent/cloudera-scm-agent.log on the Agent system(s). Disable SELinux and iptables.
Symptom: Some cluster hosts do not appear when you click Find Hosts in the install or update wizard.
Problem: You may have network connectivity problems.
What to Do:
  • Make sure all cluster hosts have SSH port 22 open.
  • Check other common causes of loss of connectivity such as firewalls and interference from SELinux.
Symptom: "Access denied" in the install or update wizard during database configuration for Activity Monitor, Report Manager, or Service Monitor.
Problem: Hostname mapping or permissions are incorrectly set up.
What to Do:
  • For hostname configuration, see Configuring Network Names in the CDH3 Deployment on a Cluster topic in the CDH3 Installation Guide.
  • For permissions, make sure the values you enter into the wizard match those you used when you configured the databases. For more information, see Checking Database Hostnames.
Symptom: Activity Monitor, Report Manager, or Service Monitor databases fail to start.
Problem: MySQL binlog format problem.
What to Do: Set binlog_format=mixed in /etc/my.cnf. For more information, see this MySQL bug report. See also Installing and Configuring Databases.
Symptom: You have upgraded the Cloudera Manager Server to 4.5, but now cannot start services.
Problem: You may have mismatched versions of the Cloudera Manager Server and Agents.
What to Do: Make sure you have upgraded the Cloudera Manager Agents on all host machines to 4.5. (The previous version of the Agents will heartbeat with the new version of the Server, but you can't start HDFS and MapReduce with this combination.)
Symptom: Cloudera services fail to start.
Problem: Java may not be installed or may be installed at a custom location.
What to Do: See Using Custom Java Home Locations for more information on resolving this issue.
Symptom: The Service Monitor, Activity Monitor, or Host Monitor displays a status of BAD in the Cloudera Manager Admin Console. The log file contains the following message: ERROR 1436 (HY000): Thread stack overrun: 7808 bytes used of a 131072 byte stack, and 128000 bytes needed. Use 'mysqld -O thread_stack=#' to specify a bigger stack.
Problem: The MySQL thread stack is too small.
What to Do:
  1. Update the thread_stack value in my.cnf to 256KB. The my.cnf file is normally located in /etc or /etc/mysql.
  2. Restart the mysql service:
     $ sudo service mysql restart
  3. Restart the failed service using the Cloudera Manager Admin Console.

Symptom: The Service Monitor or Activity Monitor agents fail to start. Logs contain the error read-committed isolation not safe for the statement binlog format.
Problem: The binlog_format is not set to mixed.
What to Do: Modify the my.cnf file to include the entry for binlog format as specified in Installing and Configuring a MySQL Database.
Symptom: Attempts to reinstall older versions of CDH or Cloudera Manager using Yum fail.
Problem: It is possible to install, uninstall, and reinstall CDH and Cloudera Manager, but in certain cases this does not complete as expected. If you install Cloudera Manager 4 and CDH 4, then uninstall Cloudera Manager and CDH, and then attempt to install CDH 3.7 and Cloudera Manager 3.7, incorrect cached information may result in the installation of an incompatible version of the Oracle JDK.
What to Do: Clear the information in the yum cache as follows:
  1. Connect to the CDH host.
  2. Execute either of the following commands:
     $ yum --enablerepo='*' clean all
     or
     $ rm -rf /var/cache/yum/cloudera*
  3. After clearing cache information, proceed with installing CDH 3.7 and Cloudera Manager 3.7.

Symptom: Hive, Impala, or Hue complains about a missing table in the Hive Metastore database.
Problem: The Hive Metastore database schema must be upgraded after a major Hive version change (Hive had a major version change in CDH 4.0, 4.1, 4.2, and 5.0).
What to Do: Follow the instructions in the CDH Installation Guide for upgrading the Hive Metastore database schema. Stop all Hive services before performing the upgrade.
Symptom: The "Create Hive Metastore Database Tables" command fails due to a problem with an escape string.
Problem: PostgreSQL versions 9 and later require special configuration for Hive because of a backward-incompatible change in the default value of the standard_conforming_strings property. Versions up to and including PostgreSQL 9.0 defaulted to off, but starting with version 9.1 the default is on.
What to Do: As the administrator user, use the following command to turn standard_conforming_strings off:
ALTER DATABASE <hive_db_name> SET standard_conforming_strings = off;
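
Several MySQL-related rows in the table above depend on settings in my.cnf (binlog_format and thread_stack). As a rough aid, the following shell sketch reads one setting from the [mysqld] section of an option file so that the value can be confirmed before MySQL is restarted. The function name mysqld_setting is hypothetical, and the sketch assumes the setting lives under [mysqld] in a single file rather than in an included directory.

```shell
# Print the value of KEY from the [mysqld] section of a my.cnf-style FILE.
# Prints nothing if the key is absent. '-' and '_' in option names are
# treated as equivalent, as MySQL option files allow.
# Usage: mysqld_setting /etc/my.cnf binlog_format
mysqld_setting() {
  awk -v want="$2" '
    BEGIN { gsub(/-/, "_", want) }
    {
      line = $0
      sub(/^[ \t]+/, "", line)
      # Track which section we are in.
      if (line ~ /^\[/) { in_mysqld = (line ~ /^\[mysqld\]/); next }
      # Skip other sections and comment lines.
      if (!in_mysqld || line ~ /^[#;]/) next
      n = index(line, "=")
      if (n == 0) next
      key = substr(line, 1, n - 1)
      val = substr(line, n + 1)
      gsub(/[ \t]/, "", key)
      gsub(/-/, "_", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == want) found = val   # the last occurrence wins
    }
    END { if (found != "") print found }
  ' "$1"
}
```

If the function prints nothing for binlog_format, or a value other than mixed, the corresponding row in the table probably applies.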

Checking Database Hostnames

The value you enter into the wizard as the database hostname must match the value you entered for the hostname (if any) when you configured the database.

For example, if you entered the following for the Activity Monitor database

grant all on activity_monitor.* TO 'amon_user'@'localhost' IDENTIFIED BY 'amon_password';

the value you enter here for the database hostname must be localhost. On the other hand, if you had entered the following when you created the database

grant all on activity_monitor.* TO 'amon_user'@'myhost1.myco.com' IDENTIFIED BY 'amon_password';

the value you enter here for the database hostname must be myhost1.myco.com. If you did not specify a host, or used a wildcard to allow access from any host, you can enter either the fully-qualified domain name (FQDN) here, or localhost. For example, if you entered

grant all on activity_monitor.* TO 'amon_user'@'%' IDENTIFIED BY 'amon_password';

the value you enter here for the database hostname can be either the FQDN or localhost. Similarly, if you entered

grant all on activity_monitor.* TO 'amon_user' IDENTIFIED BY 'amon_password';

the value you enter here for the database hostname can be either the FQDN or localhost.
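
To make the matching rule above concrete, here is a minimal shell sketch that pulls the host part out of a grant statement like the ones shown. The function name grant_host is hypothetical. When the statement names no host, the sketch prints %, because MySQL treats 'user' on its own as 'user'@'%'.

```shell
# Print the host part of the account named in a GRANT statement.
# Prints "%" when the statement gives no host.
grant_host() {
  host=$(printf '%s\n' "$1" | sed -n "s/.* TO '[^']*'@'\([^']*\)'.*/\1/p")
  printf '%s\n' "${host:-%}"
}

# Example:
#   grant_host "grant all on activity_monitor.* TO 'amon_user'@'localhost' IDENTIFIED BY 'amon_password';"
```

A result other than % is the value the wizard expects as the database hostname; a result of % means either the FQDN or localhost can be entered.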

Recovering from Cloudera Manager Host Failures

Cloudera Manager uses databases to store information about the Cloudera Manager system and jobs. If the machine hosting Cloudera Manager fails, it is possible to re-establish the installation if the database information is still available. Database information is typically available for either of the following reasons:

  • You backed up the database.
  • The database and Cloudera Manager are on separate servers and the database server is still available.

Before beginning this process, find the failed machine's IP address and hostname. It is not absolutely necessary to have the old Cloudera Manager server name and IP address, but it simplifies the process. You could use a new IP address and hostname, but this would require updating the configuration of every agent to use this new information. Because it is easier to use the old server name and address in most cases, using a new hostname and IP address is not described.

To restore Cloudera Manager when the database server is available

  1. Identify a new server on which to install Cloudera Manager. Assign the failed Cloudera Manager server's IP address and hostname to the new server.
     Note: If the agents were configured with the server's hostname, you do not need to assign the old machine's IP address to the new host. Simply assigning the hostname will suffice.

  2. Install Cloudera Manager on the new server, using the method described under Step 3: Install the Cloudera Manager Server. Do not install the other components, such as CDH and databases, as those should still exist in your environment.
  3. Update /etc/cloudera-scm-server/db.properties with the necessary information so Cloudera Manager server connects to the restored database. This information is typically the database name, database instance name, user name, and password.
  4. Start the Cloudera Manager server.
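
Step 3 above can be sketched in shell as follows. This is only an illustration: the function name set_db_property is hypothetical, the property keys are the ones that appear in the sample db.properties file under Changing Embedded PostgreSQL Database Passwords, and the host value in the usage comment is a placeholder.

```shell
# Set KEY to VALUE in a Java-style properties FILE.
# An existing "KEY=..." line is replaced; otherwise "KEY=VALUE" is appended.
# Comment lines and other keys are left untouched.
# Caveat: awk interprets backslash escapes in VALUE, so values containing
# backslashes need extra care.
set_db_property() {
  file=$1 key=$2 value=$3
  tmp=$(mktemp) || return 1
  awk -v k="$key" -v v="$value" '
    BEGIN { FS = OFS = "=" }
    $1 == k { print k, v; found = 1; next }
    { print }
    END { if (!found) print k, v }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the file's owner and mode
  rm -f "$tmp"
}

# Example (placeholder host):
#   sudo cp /etc/cloudera-scm-server/db.properties /tmp/db.properties.bak
#   set_db_property /etc/cloudera-scm-server/db.properties \
#       com.cloudera.cmf.db.host db01.example.com:7432
```

Keeping a copy of the original file before editing, as in the usage comment, makes it easy to back out a mistaken value.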

To restore a Cloudera Manager deployment from database backups when the database server is not available

  1. Identify a new server on which to install Cloudera Manager. Assign the failed Cloudera Manager server's IP address and hostname to the new server.
     Note: If the agents were configured with the server's hostname, you do not need to assign the old machine's IP address to the new host. Simply assigning the hostname will suffice.

  2. Install Cloudera Manager on the new server, using whatever method you used before, as described in Step 3: Install the Cloudera Manager Server.
  3. Install the database packages on the machines that will host the restored database. This could be the same server on which you have just installed Cloudera Manager or it could be a different server. Which package to install depends on which database was initially installed on your system. If you used an external MySQL, PostgreSQL, or Oracle database, reinstall that now. If you used the embedded PostgreSQL database, install the cloudera-manager-server-db package as described in Installing an Embedded PostgreSQL Database. After installing that package, you must initialize and start the database as described in Configuring Your Systems to Support PostgreSQL.
  4. Restore the backed up databases to the new database installations.
  5. Update /etc/cloudera-scm-server/db.properties with the necessary information so Cloudera Manager server connects to the restored database. This information is typically the database name, database instance name, user name, and password.
  6. Start the Cloudera Manager server.

At this point, Cloudera Manager should resume functioning as it did before the failure. Because you restored the database from the backup, the server should accept the running state of the agents, meaning it will not terminate any running Hadoop processes.

This process is similar with secure clusters, though additional files in /etc/cloudera-scm-server must be restored in addition to the database.

Changing Embedded PostgreSQL Database Passwords

When Cloudera Manager installs and configures embedded PostgreSQL databases, it creates user accounts and passwords. You may wish to change passwords associated with the embedded PostgreSQL database accounts. To change these passwords, you must know what the original password was, but since the accounts were automatically created, this information is often unknown.

The account names and their current passwords, along with other database information, can be retrieved from files on the Cloudera Manager Server host:

  • The Cloudera Manager service connects to the database using the scm account. Information about this account is stored in the db.properties file.
  • The root account for the database is the cloudera-scm account. Information about this account is stored in the generated_password.txt file.

To find information about the PostgreSQL database user account that the SCM service uses, read the /etc/cloudera-scm-server/db.properties file:

# cat /etc/cloudera-scm-server/db.properties

# Auto-generated by scm_prepare_database.sh
#
# Sat Oct 1 12:19:15 PDT 201
#
com.cloudera.cmf.db.type=postgresql
com.cloudera.cmf.db.host=localhost:7432
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=TXqEESuhj5
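
As an illustration of how these values fit together, the following shell sketch reads entries from db.properties and prints a psql command line built from them. The function names get_db_property and psql_command are hypothetical, and the sketch assumes that com.cloudera.cmf.db.host has the form host:port, as in the sample output above. The command is only printed, not executed.

```shell
# Print the value of KEY from a Java-style properties FILE (first match wins).
get_db_property() {
  awk -v k="$2" 'index($0, k "=") == 1 { print substr($0, length(k) + 2); exit }' "$1"
}

# Print (do not run) a psql command line built from a db.properties FILE.
# Assumes the db.host entry is written as host:port.
psql_command() {
  hostport=$(get_db_property "$1" com.cloudera.cmf.db.host)
  user=$(get_db_property "$1" com.cloudera.cmf.db.user)
  name=$(get_db_property "$1" com.cloudera.cmf.db.name)
  printf 'psql -h %s -p %s -U %s -d %s\n' \
    "${hostport%%:*}" "${hostport##*:}" "$user" "$name"
}

# Example:
#   psql_command /etc/cloudera-scm-server/db.properties
```

When the printed command is run, psql prompts for the password, which is the com.cloudera.cmf.db.password value from the same file.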

To find information about the root account for the database, read the /var/lib/cloudera-scm-server-db/data/generated_password.txt file:

# cat /var/lib/cloudera-scm-server-db/data/generated_password.txt

MnPwGeWaip

The password above was generated by /usr/share/cmf/bin/initialize_embedded_db.sh (part of the cloudera-scm-server-db package)
and is the password for the user 'cloudera-scm' for the database in the current directory.

Generated at Fri Jun 29 16:25:43 PDT 2012.

Once you have gathered passwords, you can change the passwords for users, if desired.
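
One plausible way to change a password, sketched here under stated assumptions rather than as the documented procedure, is to issue a PostgreSQL ALTER USER statement while connected as the cloudera-scm account. The helper below only builds the statement; its name alter_password_sql is hypothetical, and the postgres database name in the usage comment is an assumption about the embedded installation.

```shell
# Print an ALTER USER statement that sets NEW_PASSWORD for ROLE.
# Single quotes in the password are doubled, and the role name is
# double-quoted because names such as cloudera-scm contain a hyphen.
# Role names containing double quotes are not handled.
alter_password_sql() {
  escaped=$(printf '%s' "$2" | sed "s/'/''/g")
  printf "ALTER USER \"%s\" WITH PASSWORD '%s';\n" "$1" "$escaped"
}

# Example (port 7432 as in db.properties; database name is an assumption):
#   alter_password_sql scm 'NewPassword' |
#     psql -h localhost -p 7432 -U cloudera-scm -d postgres
```

If the scm password is changed this way, the com.cloudera.cmf.db.password entry in /etc/cloudera-scm-server/db.properties would presumably need the same new value so that the Cloudera Manager Server can still connect.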