Installing and Configuring a MySQL Database
You can use MySQL databases to store information for Cloudera Manager. Cloudera Manager monitoring services require databases to store information, so you will need to create databases for each of those entities.
You should also create a database for the Hive Metastore Server.
See your MySQL documentation for more information about installation and configuration.
To install MySQL on a Red Hat system:
$ sudo yum install mysql-server
To install MySQL on a SLES system:
$ sudo zypper install mysql $ sudo zypper install libmysqlclient_r15
Some SLES systems encounter errors when using the preceding zypper install command. For more information on resolving this issue, see Novell's Knowledgebase topic, error running chkconfig.
To install MySQL on an Debian/Ubuntu system:
$ sudo apt-get install mysql-server
After issuing the command to install MySQL, you may need to respond to prompts to confirm that you do want to complete the installation. After installation completes, start the mysql daemon.
On Red Hat systems
$ sudo service mysqld start
On SLES and Debian/Ubuntu systems
$ sudo service mysql start
Configuring MySQL for the Cloudera Manager Databases
The default settings in the MySQL installations in most distributions are very conservative with regards to buffer sizes and memory usage. In addition, Cloudera Manager requires InnoDB. Cloudera Management services need high write throughput as, based on cluster size, they may insert a lot of records in the database. Therefore Cloudera recommends that you set innodb_flush_method to O_DIRECT.
For the databases used by Cloudera Manager, Cloudera recommends that you update /etc/my.cnf or /etc/mysql/my.cnf to at least the values shown below. It is important that the datadir directory, which, by default, is /var/lib/mysql, is on a partition that has plentiful free space.
Determine the version of MySQL.Important
- For a fresh MySQL installation on Red Hat or SLES systems, apply the settings in the next step before you start MySQL for the first time.
- For an existing installation and a new installation on Ubuntu, you must perform the steps in Reconfiguring InnoDB Settings for an Existing MySQL Installationbefore changing InnoDB settings.
- Recommended MySQL configurations settings are as follows. You should incorporate these changes as appropriate into your configuration settings.
[mysqld] transaction-isolation=READ-COMMITTED # Disabling symbolic-links is recommended to prevent assorted security risks; # to do so, uncomment this line: # symbolic-links=0 key_buffer = 16M key_buffer_size = 32M max_allowed_packet = 16M thread_stack = 256K thread_cache_size = 64 query_cache_limit = 8M query_cache_size = 64M query_cache_type = 1 # Important: see Configuring the Databases and Setting max_connections max_connections = 550 # log-bin should be on a disk with enough free space log-bin=/x/home/mysql/logs/binary/mysql_binary_log # For MySQL version 5.1.8 or later. Comment out binlog_format for older versions. binlog_format = mixed read_buffer_size = 2M read_rnd_buffer_size = 16M sort_buffer_size = 8M join_buffer_size = 8M # InnoDB settings innodb_file_per_table = 1 innodb_flush_log_at_trx_commit = 2 innodb_log_buffer_size = 64M innodb_buffer_pool_size = 4G innodb_thread_concurrency = 8 innodb_flush_method = O_DIRECT innodb_log_file_size = 512M [mysqld_safe] log-error=/var/log/mysqld.log pid-file=/var/run/mysqld/mysqld.pid
Configuring the Databases and Setting max_connections
The definition of a small or large cluster is not absolute, so this information is intended as general guidance. For the purposes of this discussion, clusters with fewer than 50 hosts can be considered small clusters and clusters with more than 50 hosts can be considered large clusters.
Follow these guidelines:
- In a small cluster, you can store more than one database (for example, both the Activity Monitor and Service Monitor) on the same host. If you do this, you should:
- Put each database on its own storage volume.
- Allow 100 maximum connections for each database and then add 50 extra connections. For example, for two databases set the maximum connections to 250. If you store seven databases on one host (the databases for Activity Monitor, Service Monitor, Report Manager, Host Monitor, Cloudera Manager Server, Cloudera Navigator, and Hive Metastore), set the maximum connections to 750.
- In a large cluster, do not store more than one database on the same host. In such a case, use a separate host for each database/host pair. The hosts need not be reserved exclusively for databases, but each database should be on a separate host.
Reconfiguring InnoDB Settings for an Existing MySQL Installation
To update InnoDB settings on all hosts that are using an existing MySQL installation, proceed as follows.
- Stop MySQL.
- Red Hat
$ sudo service mysqld stop
- SLES and Debian/Ubuntu
$ sudo service mysql stop
- Edit the InnoDB entries in /etc/my.cnf as shown in the previous section.
- Move the old InnoDB log files to a backup location. The two files to move are /var/lib/mysql/ib_logfile0 and /var/lib/mysql/ib_logfile1. Make sure you move these files out of the /var/lib/mysql/ directory (don't copy them and leave the originals in place).
- Start MySQL.
- Red Hat
$ sudo service mysqld start
- SLES and Debian/Ubuntu
$ sudo service mysql start
- Red Hat
Installing the MySQL JDBC Connector
Install the JDBC connector on the Cloudera Manager Server host, as well as hosts to which you assign the Activity Monitor, Service Monitor, Report Manager, Host Monitor, Cloudera Navigator, and Hive Metastore Server roles.
- Red Hat 6
$ sudo yum install mysql-connector-java
- Red Hat 5
- Download the MySQL JDBC connector from http://www.mysql.com/downloads/connector/j/5.1.html.
- Extract the JDBC driver JAR file from the downloaded file; for example:
tar zxvf mysql-connector-java-5.1.18.tar.gz
- Add the JDBC driver, renamed, to the relevant server; for example:
$ sudo cp mysql-connector-java-5.1.18/mysql-connector-java-5.1.18-bin.jar /usr/share/java/mysql-connector-java.jar
If the target directory does not yet exist on this host, you can create it before copying the .jar file; for example:
$ sudo mkdir -p /usr/share/java/ $ sudo cp mysql-connector-java-5.1.18/mysql-connector-java-5.1.18-bin.jar /usr/share/java/mysql-connector-java.jar
$ sudo zypper install mysql-connector-java
$ sudo apt-get install libmysql-java
Configure MySQL to use a strong password and to start at boot.
- Set the MySQL root password. Note that in the following procedure, your current root password is blank. Press the Enter key when you're prompted for the root password.
$ sudo /usr/bin/mysql_secure_installation [...] Enter current password for root (enter for none): OK, successfully used password, moving on... [...] Set root password? [Y/n] y New password: Re-enter new password: Remove anonymous users? [Y/n] Y [...] Disallow root login remotely? [Y/n] N [...] Remove test database and access to it [Y/n] Y [...] Reload privilege tables now? [Y/n] Y All done!
- Ensure the MySQL server starts at boot.
- Red Hat
$ sudo /sbin/chkconfig mysqld on $ sudo /sbin/chkconfig --list mysqld mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off
$ sudo chkconfig --add mysql
$ sudo chkconfig mysql on
- Red Hat
Creating the MySQL Databases for Cloudera Manager
The next step involves creating databases and user accounts for all database-backed services in Cloudera Manager.
Create databases for each of the following features that are part of the Cloudera Management Services:
- Activity Monitor
- Service Monitor
- Report Manager
- Host Monitor
- Cloudera Navigator (optional -- Cloudera Navigator is a separately-purchased add-on to Cloudera Enterprise)
In addition add a database for each Hive Metastore Server.
You can create these databases on the host where the Cloudera Manager Server will run, or on any other nodes in the cluster. For performance reasons, you should typically install each database on the host on which the service runs, as determined by the roles you will assign during installation or upgrade. In larger deployments or in cases where database administrators (DBAs) are managing the databases the services will use, databases may be separated from services, but do not undertake such an implementation lightly.
Note the values you enter for database names, user names, and passwords. The Cloudera Manager installation wizard requires this information to correctly connect to these databases.
The database must be configured to support UTF-8 character set encoding. The sample commands below include the required options to enable UTF-8 support.
- Log into MySQL as the root user:
$ mysql -u root -p Enter password:
- Create a database for the Activity Monitor. Note
:The database name, user name, and password can be anything you want. The examples shown match the default names provided in the Cloudera Manager Hive configuration settings.
mysql> create database amon DEFAULT CHARACTER SET utf8; Query OK, 1 row affected (0.00 sec) mysql> grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon_password'; Query OK, 0 rows affected (0.00 sec)
- Create a database for the Service Monitor. For example:
mysql> create database smon DEFAULT CHARACTER SET utf8; Query OK, 1 row affected (0.00 sec) mysql> grant all on smon.* TO 'smon'@'%' IDENTIFIED BY 'smon_password'; Query OK, 0 rows affected (0.00 sec)
- Create a database for the Report Manager. For example:
mysql> create database rman DEFAULT CHARACTER SET utf8; Query OK, 1 row affected (0.00 sec) mysql> grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman_password'; Query OK, 0 rows affected (0.00 sec)
- Create a database for the Host Monitor. For example:
mysql> create database hmon DEFAULT CHARACTER SET utf8; Query OK, 1 row affected (0.00 sec) mysql> grant all on hmon.* TO 'hmon'@'%' IDENTIFIED BY 'hmon_password'; Query OK, 0 rows affected (0.00 sec)
- (Optional) Create a database for Cloudera Navigator. For example:
mysql> create database nav DEFAULT CHARACTER SET utf8; Query OK, 1 row affected (0.00 sec) mysql> grant all on nav.* TO 'nav'@'%' IDENTIFIED BY 'nav_password'; Query OK, 0 rows affected (0.00 sec)
Create the Database for the Hive Metastore and Impala Catalog Daemon
- Create a database for the Hive metastore. For example:
mysql> create database hive DEFAULT CHARACTER SET utf8; Query OK, 1 row affected (0.00 sec) mysql> grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive_password'; Query OK, 0 rows affected (0.00 sec)
Backing Up the MySQL Databases
- Cloudera Manager database: Contains all the information about what services you have configured, their role assignments, all configuration history, commands, users, and running processes. This is a relatively small database (<100MB), and is the most important to back up.
- Activity Monitor database: Contains information about past activities. In large clusters, this database can grow large.
- Service Monitor database: Contains monitoring information about daemons. In large clusters, this database can grow large.
- Report Manager database: Keeps track of disk utilization over time. Medium-sized.
- Host Monitor database: Contains information about host status. In large clusters, this database can grow large.
- Cloudera Navigator database: Contains auditing information. In large clusters, this database can grow large.
- Hive Metastore database: Contains Hive metadata. Relatively small.
$ mysqldump -h<hostname> -u<username> -p<password> <database> > /tmp/<database-backup>.sql
For example, to back up the sample database created for the Activity Monitor above, amon on the local host as the root user, with the password mypasswd:
$ mysqldump -pmypasswd amon > /tmp/amon-backup.sql
To back up the sample Activity Monitor database amon on remote host myhost.example.com as the root user, with the password mypasswd:
$ mysqldump -hmyhost.example.com -uroot -pcloudera amon > /tmp/amon-backup.sql