To use an Oracle database, follow these procedures.
Collecting Oracle Database Information
To configure Cloudera Manager to work with Oracle databases, get the following information from your Oracle DBA:
- Host name - The DNS name or the IP address of the host where the Oracle database is installed.
- SID - The name of the database that will store Cloudera Manager information. This database could contain schema that would store information for the Cloudera Manager Server, Activity Monitor, Reports Manager, and Cloudera Navigator.
- User name - A user name for each schema that is storing information. You could have four unique usernames for the four schema.
- Password - A password corresponding to each user name.
You use this information to configure the external database to work with the Cloudera Manager Server.
Configuring the Oracle Server
Adjust Oracle Settings to Accommodate Larger Clusters
- Enable direct and asynchronous I/O by setting the FILESYSTEMIO_OPTIONS parameter to SETALL.
- Increase the RAM available to Oracle by changing the MEMORY_TARGET parameter. The amount of memory to assign depends on the size of the Hadoop cluster.
- Create more redo log groups and spread the redo log members across separate disks or logical unit numbers.
- Increase the size of redo log members to be at least 1 GB.
Modify the Maximum Number of Oracle Connections
Work with your Oracle database administrator to ensure appropriate values are applied for your Oracle database settings. You must determine the number of connections, transactions, and sessions to be allowed.
Allow 100 maximum connections for each database and then add 50 extra connections. For example, for two databases, set the maximum connections to 250. If you store five databases on one host (the databases for Cloudera Manager Server, Activity Monitor, Reports Manager, Cloudera Navigator, and Hive Metastore), set the maximum connections to 550.
From the maximum number of connections, you can determine the number of anticipated sessions using the following formula:
sessions = (1.1 * maximum_connections) + 5
For example, if a host has two databases, you anticipate 250 maximum connections. If you anticipate a maximum of 250 connections, plan for 280 sessions.
Once you know the number of sessions, you can determine the number of anticipated transactions using the following formula:
transactions = 1.1 * sessions
Continuing with the previous example, if you anticipate 280 sessions, you can plan for 308 transactions.
Work with your Oracle database administrator to apply these derived values to your system.
Using the sample values above, Oracle attributes would be set as follows:
alter system set processes=250; alter system set transactions=308; alter system set sessions=280;
Ensure Your Oracle Database Supports UTF8
The database you use must support UTF8 character set encoding. You can implement UTF8 character set encoding in Oracle databases by using the dbca utility. In this case, you can use the characterSet AL32UTF8 option to specify proper encoding. Consult your DBA to ensure UTF8 encoding is properly configured.
Installing the Oracle JDBC Connector
You must install the JDBC connector on the Cloudera Manager Server host and on hosts to which you assign the Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server server roles.
Cloudera recommends that you assign all roles that require databases on the same host and install the connector on that host. Locating all such roles on the same host is recommended but not required. If you install a role, such as Activity Monitor, on one host and other roles on a separate host, you would install the JDBC connector on each host running roles that access the database.
- Download and install the ojdbc6.jar file, which contains the JDBC driver. Download the
version that is designed for:
- Java 6
- The Oracle database version used in your environment. For example, for an environment using Oracle 11g R2, download the JAR file from http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html.
- Copy the appropriate JDBC JAR file to /usr/share/java/oracle-connector-java.jar for use with the
Cloudera Manager databases (for example, for the Activity Monitor, and so on),
and for use with Hive.
$ mkdir /usr/share/java (if necessary) $ cp /tmp/ojdbc6.jar /usr/share/java/oracle-connector-java.jar
Creating Databases for Activity Monitor, Reports Manager, Hive Metastore, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server
- If you are not using the Cloudera Manager installer, the Cloudera Manager Server.
- Cloudera Management Service roles:
- Activity Monitor (if using the MapReduce service)
- Reports Manager
- Each Hive metastore
- Sentry Server
- Cloudera Navigator Audit Server
- Cloudera Navigator Metadata Server
You can create these databases on the host where the Cloudera Manager Server will run, or on any other hosts in the cluster. For performance reasons, you should install each database on the host on which the service runs, as determined by the roles you assign during installation or upgrade. In larger deployments or in cases where database administrators are managing the databases the services use, you can separate databases from services, but use caution.
The database must be configured to support UTF-8 character set encoding.
Record the values you enter for database names, user names, and passwords. The Cloudera Manager installation wizard requires this information to correctly connect to these databases.