Cloudera Enterprise 5.14.x | Other versions

Documentation
  • Cloudera Introduction
    • CDH Overview
      • Hive Overview
      • Apache Impala Overview
      • Cloudera Search Overview
        • Understanding Cloudera Search
        • Cloudera Search and Other Cloudera Components
        • Cloudera Search Architecture
        • Cloudera Search Tasks and Processes
      • Apache Kudu Overview
      • Apache Sentry Overview
      • Apache Spark Overview
      • External Documentation
    • Cloudera Manager 5 Overview
      • Cloudera Manager Admin Console
        • Cloudera Manager Admin Console Home Page
        • Displaying Cloudera Manager Documentation
        • Automatic Logout
      • Cloudera Manager API
        • Using the Cloudera Manager API for Cluster Automation
      • Extending Cloudera Manager
    • Cloudera Navigator Data Management
      • Getting Started with Cloudera Navigator
      • Cloudera Navigator Frequently Asked Questions
    • Cloudera Navigator Encryption
      • Cloudera Navigator Key Trustee Server Overview
      • Cloudera Navigator Key HSM Overview
      • Cloudera Navigator HSM KMS Overview
      • Cloudera Navigator Encrypt Overview
    • Navigator Optimizer
    • Frequently Asked Questions About Cloudera Software
    • Getting Support
  • Cloudera Release Notes
  • Requirements and Supported Versions
  • Cloudera QuickStart VM
    • QuickStart VM Software Versions and Documentation
    • QuickStart VM Administrative Information
    • Cloudera Docker Container
  • Cloudera Manager
    • Cloudera Manager 5 Frequently Asked Questions
  • Cloudera Installation
    • Configuration Requirements for Cloudera Manager, Cloudera Navigator, and CDH 5
      • Permission Requirements for Package-based Installations and Upgrades of CDH
      • Cluster Hosts and Role Assignments
      • Required Tomcat Directories
      • Ports
        • Ports Used by Cloudera Manager and Cloudera Navigator
        • Ports Used by Cloudera Navigator Encryption
        • Ports Used by Components of CDH 5
        • Ports Used by Impala
        • Ports Used by Cloudera Search
        • Ports Used by DistCp
        • Ports Used by Third-Party Components
        • Ports Used by Apache Flume and Apache Solr
    • Managing Software Installation Using Cloudera Manager
      • Parcels
      • Creating Virtual Images of Cluster Hosts
      • Migrating from Packages to Parcels
      • Migrating from Parcels to Packages
    • Installing Cloudera Manager and CDH
      • Java Development Kit Installation
      • Configuring Single User Mode
      • Cloudera Manager and Managed Service Datastores
        • Embedded PostgreSQL Database
        • External PostgreSQL Database
        • MariaDB Database
        • MySQL Database
        • Oracle Database
        • Configuring an External Database for Oozie
        • Configuring an External Database for Sqoop
        • Backing Up Databases
        • Data Storage for Monitoring Data
        • Storage Space Planning for Cloudera Manager
      • Installation Path A - Automated Installation by Cloudera Manager (Non-Production Mode)
      • Installation Path B - Installation Using Cloudera Manager Parcels or Packages
        • (Optional) Manually Install CDH and Managed Service Packages
      • Installation Path C - Manual Installation Using Cloudera Manager Tarballs
      • Installing Impala
      • Installing Kudu
      • Installing Cloudera Search
      • Installing Spark
      • Installing the GPL Extras Parcel
      • Understanding Custom Installation Solutions
        • Creating and Using a Parcel Repository for Cloudera Manager
        • Creating and Using a Package Repository for Cloudera Manager
        • Configuring a Custom Java Home Location
        • Installing Lower Versions of Cloudera Manager 5
        • Creating a CDH Cluster Using a Cloudera Manager Template
      • Deploying Clients
      • Testing the Installation
      • Uninstalling Cloudera Manager and Managed Software
      • Uninstalling a CDH Component From a Single Host
      • Installing the Cloudera Navigator Data Management Component
      • Installing Cloudera Navigator Key Trustee Server
      • Installing Cloudera Navigator Key HSM
      • Installing Key Trustee KMS
      • Installing Navigator HSM KMS Backed by Thales HSM
      • Installing Navigator HSM KMS Backed by Luna HSM
      • Installing Cloudera Navigator Encrypt
    • Installing and Deploying CDH Using the Command Line
      • Before You Install CDH 5 on a Cluster
      • Creating a Local Yum Repository
      • Installing the Latest CDH 5 Release
      • Installing an Earlier CDH 5 Release
      • CDH 5 and MapReduce
      • Migrating from MapReduce (MRv1) to MapReduce (MRv2)
      • Deploying CDH 5 on a Cluster
        • Configuring Dependencies Before Deploying CDH on a Cluster
          • Enabling NTP
          • Configuring Network Names
          • Setting SELinux mode
          • Disabling the Firewall
        • Deploying HDFS on a Cluster
        • Deploying MapReduce v2 (YARN) on a Cluster
        • Deploying MapReduce v1 (MRv1) on a Cluster
        • Configuring Hadoop Daemons to Run at Startup
      • Installing CDH 5 Components
        • Crunch Installation
          • Crunch Prerequisites
          • Crunch Packaging
          • Installing and Upgrading Crunch
          • Crunch Documentation
        • Flume Installation
          • Upgrading Flume
          • Flume Packaging
          • Installing the Flume Tarball
          • Installing the Flume RPM or Debian Packages
          • Verifying the Flume Installation
        • HBase Installation
          • Installing HBase
          • Upgrading HBase
        • HCatalog Installation
          • HCatalog Prerequisites
          • Installing and Upgrading the HCatalog RPM or Debian Packages
          • Configuration Change on Hosts Used with HCatalog
          • Starting and Stopping the WebHCat REST server
          • Accessing Table Information with the HCatalog Command-line API
          • Accessing Table Data with MapReduce
          • Accessing Table Data with Pig
          • Accessing Table Information with REST
          • Viewing the HCatalog Documentation
        • Impala Installation
          • Requirements
          • Installing Impala from the Command Line
          • Upgrading Impala
          • Starting Impala
            • Modifying Impala Startup Options
        • Hive Installation
          • Installing Hive
          • Upgrading Hive
        • HttpFS Installation
          • About HttpFS
          • HttpFS Packaging
          • HttpFS Prerequisites
          • Installing HttpFS
          • Configuring HttpFS
          • Starting the HttpFS Server
          • Stopping the HttpFS Server
          • Using the HttpFS Server with curl
        • Hue Installation
          • Configuring CDH Components for Hue
          • Hue Configuration
        • KMS Installation and Upgrade
        • Kudu Installation
          • Upgrading Kudu
        • Mahout Installation
          • Installing Mahout
          • Upgrading Mahout
          • The Mahout Executable
          • Getting Started with Mahout
          • Viewing the Mahout Documentation
        • Oozie Installation
          • Oozie Packaging
          • Oozie Prerequisites
          • Installing Oozie
        • Pig Installation
          • Upgrading Pig
          • Installing Pig
          • Using Pig with HBase
          • Installing DataFu
          • Viewing the Pig Documentation
        • Search Installation
          • Installing Cloudera Search without Cloudera Manager
          • Installing the Spark Indexer
          • Installing MapReduce Tools for use with Cloudera Search
          • Installing the Lily HBase Indexer Service
          • Upgrading Cloudera Search
          • Installing Hue Search
            • Updating Hue Search
        • Sentry Installation
        • Snappy Installation
        • Spark Installation
          • Spark Packages
          • Spark Prerequisites
          • Installing and Upgrading Spark
        • Sqoop 1 Installation
        • Sqoop 2 Installation
          • Upgrading Sqoop 2 from an Earlier CDH 5 Release
          • Installing Sqoop 2
          • Configuring Sqoop 2
          • Starting, Stopping, and Accessing the Sqoop 2 Server
          • Viewing the Sqoop 2 Documentation
          • Feature Differences - Sqoop 1 and Sqoop 2
        • Whirr Installation
          • Upgrading Whirr
          • Installing Whirr
          • Generating an SSH Key Pair for Whirr
          • Defining a Whirr Cluster
          • Managing a Cluster with Whirr
          • Viewing the Whirr Documentation
        • ZooKeeper Installation
          • Upgrading ZooKeeper from an Earlier CDH 5 Release
          • Installing the ZooKeeper Packages
          • Maintaining a ZooKeeper Server
          • Viewing the ZooKeeper Documentation
      • Building RPMs from CDH Source RPMs
        • Prerequisites
        • Setting Up an Environment for Building RPMs
        • Building an RPM
      • Apache and Third-Party Licenses
        • Apache License
        • Third-Party Licenses
      • Uninstalling CDH Components
      • Viewing the Apache Hadoop Documentation
    • Troubleshooting Installation and Upgrade Problems
  • Cloudera Upgrade
    • Upgrading Cloudera Manager
      • Upgrading Cloudera Manager 5 Using Packages
      • Upgrading Cloudera Manager 5 Using Tarballs
      • Package Dependencies
    • Upgrading CDH and Managed Services Using Cloudera Manager
      • Upgrading to CDH 5.x Using a Rolling Upgrade
      • Upgrading to CDH 5.x Using Parcels
      • Upgrading to CDH 5.x Using Packages
        • Upgrade Managed Components Using a Specific Set of Packages
      • Performing Upgrade Wizard Actions Manually
      • Upgrading to CDH 5.8.0 or CDH 5.8.1 When Using the Flume Kafka Client
    • Upgrading to Oracle JDK 1.8
    • Upgrading Cloudera Navigator Components
      • Upgrading the Cloudera Navigator Data Management Component
      • Upgrading Cloudera Navigator Key Trustee Server
        • Upgrading Cloudera Navigator Key Trustee Server 3.x to 5.4.x
        • Upgrading Cloudera Navigator Key Trustee Server 3.8 to 5.5 Using the ktupgrade Script
        • Upgrading Cloudera Navigator Key Trustee Server 5.4.x or Higher
      • Upgrading Cloudera Navigator Key HSM
      • Upgrading Key Trustee KMS
      • Upgrading Cloudera Navigator HSM KMS
      • Upgrading Cloudera Navigator Encrypt
    • Database Considerations for Cloudera Manager Upgrades
    • Re-Running the Cloudera Manager Upgrade Wizard
    • Reverting a Failed Cloudera Manager Upgrade
    • Upgrading Unmanaged CDH Using the Command Line
      • Upgrading from an Earlier CDH 5 Release to the Latest Release
        • Before Upgrading to the Latest Release of CDH
        • Upgrading from CDH 5.4.0 or Higher to the Latest Release
        • Upgrading from a Release Lower than CDH 5.4.0 to the Latest Release
    • Upgrading Host Operating Systems in a CDH Cluster
  • Cloudera Administration
    • Managing CDH and Managed Services
      • Managing CDH and Managed Services Using Cloudera Manager
        • Configuration Overview
          • Modifying Configuration Properties Using Cloudera Manager
          • Modifying Configuration Properties (Classic Layout)
          • Autoconfiguration
          • Custom Configuration
          • Stale Configurations
          • Client Configuration Files
          • Viewing and Reverting Configuration Changes
          • Exporting and Importing Cloudera Manager Configuration
        • Managing Clusters
          • Adding and Deleting Clusters
          • Starting, Stopping, Refreshing, and Restarting a Cluster
          • Pausing a Cluster in AWS
          • Renaming a Cluster
          • Cluster-Wide Configuration
          • Moving a Host Between Clusters
        • Managing Services
          • Adding a Service
          • Comparing Configurations for a Service Between Clusters
          • Add-on Services
          • Starting, Stopping, and Restarting Services
          • Rolling Restart
          • Aborting a Pending Command
          • Deleting Services
          • Renaming a Service
          • Configuring Maximum File Descriptors
          • Exposing Hadoop Metrics to Graphite
          • Exposing Hadoop Metrics to Ganglia
        • Managing Roles
          • Role Instances
          • Role Groups
        • Managing Hosts
          • Viewing Host Details
          • Using the Host Inspector
          • Adding a Host to the Cluster
          • Specifying Racks for Hosts
          • Host Templates
          • Performing Maintenance on a Cluster Host
            • Tuning and Troubleshooting Host Decommissioning
            • Maintenance Mode
          • Deleting Hosts
        • Cloudera Manager Configuration Properties
      • Managing CDH Using the Command Line
        • Starting CDH Services Using the Command Line
          • Configuring init to Start Hadoop System Services
        • Stopping CDH Services Using the Command Line
        • Migrating Data between Clusters Using distcp
          • Copying Cluster Data Using DistCp
          • Copying Data between a Secure and an Insecure Cluster using DistCp and WebHDFS
          • Post-migration Verification
        • Decommissioning DataNodes Using the Command Line
      • Managing Individual Services
        • Managing the HBase Service
        • Managing HDFS
          • NameNodes
            • Backing Up and Restoring HDFS Metadata
            • Moving NameNode Roles
            • Sizing NameNode Heap Memory
            • Backing Up and Restoring NameNode Metadata
          • DataNodes
            • Configuring Storage Directories for DataNodes
            • Configuring Storage Balancing for DataNodes
            • Performing Disk Hot Swap for DataNodes
          • JournalNodes
          • Configuring Short-Circuit Reads
          • Configuring HDFS Trash
          • HDFS Balancers
          • Enabling WebHDFS
          • Adding HttpFS
          • Adding and Configuring an NFS Gateway
          • Setting HDFS Quotas
          • Configuring Mountable HDFS
          • Configuring Centralized Cache Management in HDFS
          • Configuring Proxy Users to Access HDFS
          • Using CDH with Isilon Storage
          • Configuring Heterogeneous Storage in HDFS
        • Managing Hive
        • Managing Hue
          • Adding a Hue Service and Role Instance
          • Managing Hue Analytics Data Collection
          • Enabling Hue Applications Using Cloudera Manager
        • Managing Impala
          • The Impala Service
          • Post-Installation Configuration for Impala
          • Configuring Impala to Work with ODBC
          • Configuring Impala to Work with JDBC
        • Managing Key-Value Store Indexer
        • Managing Kudu
        • Managing Oozie
          • Oozie High Availability
          • Adding the Oozie Service Using Cloudera Manager
          • Redeploying the Oozie ShareLib
          • Configuring Oozie Data Purge Settings Using Cloudera Manager
          • Dumping and Loading an Oozie Database Using Cloudera Manager
          • Adding Schema to Oozie Using Cloudera Manager
          • Enabling the Oozie Web Console
          • Enabling Oozie SLA with Cloudera Manager
          • Setting the Oozie Database Timezone
          • Scheduling in Oozie Using Cron-like Syntax
          • Configuring Oozie to Enable MapReduce Jobs To Read/Write from Amazon S3
          • Configuring Oozie to Enable MapReduce Jobs To Read/Write from Microsoft Azure (ADLS)
        • Managing Solr
        • Managing Spark
          • Managing Spark Using Cloudera Manager
          • Managing Spark Standalone Using the Command Line
          • Managing the Spark History Server
        • Managing the Sqoop 1 Client
        • Managing Sqoop 2
        • Managing YARN (MRv2) and MapReduce (MRv1)
          • Managing YARN
          • Managing MapReduce
        • Managing ZooKeeper
        • Configuring Services to Use the GPL Extras Parcel
    • Performance Management
      • Optimizing Performance in CDH
      • Choosing and Configuring Data Compression
      • Tuning the Solr Server
      • Tuning Spark Applications
      • Tuning YARN
    • Resource Management
      • Static Service Pools
        • Linux Control Groups (cgroups)
      • Dynamic Resource Pools
      • YARN (MRv2) and MapReduce (MRv1) Schedulers
        • Configuring the Fair Scheduler
        • Enabling and Disabling Fair Scheduler Preemption
      • Resource Management for Impala
        • Admission Control and Query Queuing
        • Managing Impala Admission Control
      • Cluster Utilization Reports
        • Creating a Custom Cluster Utilization Report
    • High Availability
      • HDFS High Availability
        • Introduction to HDFS High Availability
        • Configuring Hardware for HDFS HA
        • Enabling HDFS HA
        • Disabling and Redeploying HDFS HA
        • Configuring Other CDH Components to Use HDFS HA
        • Administering an HDFS High Availability Cluster
        • Changing a Nameservice Name for Highly Available HDFS Using Cloudera Manager
      • MapReduce (MRv1) and YARN (MRv2) High Availability
        • YARN (MRv2) ResourceManager High Availability
        • Work Preserving Recovery for YARN Components
        • MapReduce (MRv1) JobTracker High Availability
      • Cloudera Navigator Key Trustee Server High Availability
      • Enabling Key Trustee KMS High Availability
      • Enabling Navigator HSM KMS High Availability
      • High Availability for Other CDH Components
        • HBase High Availability
          • HBase Read Replicas
        • Oozie High Availability
        • Search High Availability
      • Configuring Cloudera Manager for High Availability With a Load Balancer
        • Introduction to Cloudera Manager Deployment Architecture
        • Prerequisites for Setting up Cloudera Manager High Availability
        • Cloudera Manager Failover Protection
        • High-Level Steps to Configure Cloudera Manager High Availability
          • Step 1: Setting Up Hosts and the Load Balancer
          • Step 2: Installing and Configuring Cloudera Manager Server for High Availability
          • Step 3: Installing and Configuring Cloudera Management Service for High Availability
          • Step 4: Automating Failover with Corosync and Pacemaker
        • Database High Availability Configuration
        • TLS and Kerberos Configuration for Cloudera Manager High Availability
    • Backup and Disaster Recovery
      • Port Requirements for Backup and Disaster Recovery
      • Data Replication
        • Designating a Replication Source
        • HDFS Replication
          • HDFS Replication Tuning
          • Monitoring the Performance of HDFS Replications
        • Hive/Impala Replication
          • Monitoring the Performance of Hive/Impala Replications
        • Replicating Data to Impala Clusters
        • Using Snapshots with Replication
        • Enabling Replication Between Clusters with Kerberos Authentication
        • Replication of Encrypted Data
        • HBase Replication
      • Snapshots
        • Cloudera Manager Snapshot Policies
        • Managing HBase Snapshots
        • Managing HDFS Snapshots
      • BDR Tutorials
        • How To Back Up and Restore Apache Hive Data Using Cloudera Enterprise BDR
        • How To Back Up and Restore HDFS Data Using Cloudera Enterprise BDR
        • BDR Automation Examples
    • Cloudera Manager Administration
      • Starting, Stopping, and Restarting the Cloudera Manager Server
      • Configuring Cloudera Manager Server Ports
      • Moving the Cloudera Manager Server to a New Host
      • Managing the Cloudera Manager Server Log
      • Cloudera Manager Agents
        • Starting, Stopping, and Restarting Cloudera Manager Agents
        • Configuring Cloudera Manager Agents
        • Managing Cloudera Manager Agent Logs
      • Changing Hostnames
      • Configuring Network Settings
      • Alerts
        • Managing Alerts
          • Configuring Alert Email Delivery
          • Configuring Alert SNMP Delivery
          • Configuring Custom Alert Scripts
      • Managing Licenses
      • Sending Usage and Diagnostic Data to Cloudera
      • Exporting and Importing Cloudera Manager Configuration
      • Backing up Cloudera Manager
      • Other Cloudera Manager Tasks and Settings
      • Cloudera Management Service
    • Cloudera Navigator Administration
    • Accessing Storage Using Amazon S3
      • Configuring the Amazon S3 Connector
        • Using S3 Credentials with YARN, MapReduce, or Spark
      • Using Fast Upload with Amazon S3
      • Configuring and Managing S3Guard
      • How to Configure a MapReduce Job to Access S3 with an HDFS Credstore
    • Accessing Storage Using Microsoft ADLS
      • Configuring ADLS Access Using Cloudera Manager
      • Configuring ADLS Connectivity for CDH
    • How To Create a Multitenant Enterprise Data Hub
  • Cloudera Navigator Data Management
    • Overview
      • Cloudera Navigator Console
      • Data Stewardship Dashboard
    • Auditing
      • Using Audit Events to Understand Cluster Activity
      • Exploring Audit Data
      • Cloudera Navigator Audit Event Reports
      • Downloading HDFS Directory Access Permission Reports
    • Metadata
      • Defining Properties for Managed Metadata
      • Adding and Editing Metadata
      • Finding Specific Entities by Searching Metadata
      • Performing Actions on Entities
      • Using Policies to Automate Metadata Tagging
        • Metadata Policy Expressions
    • Lineage Diagrams
      • Using Lineage to Display Table Schema
    • Cloudera Navigator Administration
      • Navigator Audit Server Management
        • Setting Up Navigator Audit Server
        • Enabling Audit and Log Collection for Services
        • Configuring Audit and Log Properties
        • Monitoring Navigator Audit Service Health
        • Publishing Audit Events
      • Navigator Metadata Server Management
        • Setting Up Navigator Metadata Server
        • Navigator Metadata Server Tuning
        • Managing Metadata Storage with Purge
        • Hive and Impala Lineage Configuration
        • Configuring and Managing Extraction
        • Configuring the Server for Policy Messages
      • Navigator Security Management
        • Authentication and Authorization
        • Encryption (TLS/SSL) and Cloudera Navigator
        • Sensitive Data
      • Backing Up Cloudera Navigator Data
      • Administering Navigator User Roles
      • Configuring Cloudera Navigator to work with Hue HA
    • Cloudera Navigator and the Cloud
      • Using Cloudera Navigator with Altus Clusters
        • Configuring Extraction for Altus Clusters on AWS
      • Using Cloudera Navigator with Amazon S3
        • Configuring Extraction for Amazon S3
    • Cloudera Navigator APIs
      • Navigator APIs Overview
      • Applying Metadata to HDFS and Hive Entities using the API
      • Using the Purge APIs for Metadata Maintenance Tasks
    • Cloudera Navigator Reference
      • Lineage Diagram Icons
      • Search Syntax and Properties
      • Service Audit Events
      • User Roles and Privileges Reference
    • Troubleshooting Navigator Data Management
  • Cloudera Operation
    • Monitoring and Diagnostics
      • Introduction to Cloudera Manager Monitoring
        • Time Line
        • Health Tests
        • Cloudera Manager Admin Console Home Page
        • Viewing Charts for Cluster, Service, Role, and Host Instances
        • Configuring Monitoring Settings
      • Monitoring Clusters
      • Monitoring Multiple CDH Deployments Using the Multi Cloudera Manager Dashboard
        • Installing and Managing the Multi Cloudera Manager Dashboard
        • Using the Multi Cloudera Manager Status Dashboard
      • Monitoring Services
        • Monitoring Service Status
        • Viewing Service Status
        • Viewing Service Instance Details
        • Viewing Role Instance Status
          • The Processes Tab
        • Running Diagnostic Commands for Roles
        • Periodic Stacks Collection
        • Managing and Monitoring Federated HDFS
        • Viewing Running and Recent Commands
        • Monitoring Resource Management
      • Monitoring Hosts
        • Host Details
        • Host Inspector
      • Monitoring Activities
        • Monitoring MapReduce Jobs
          • Viewing and Filtering MapReduce Activities
          • Viewing the Jobs in a Pig, Oozie, or Hive Activity
          • Task Attempts
          • Viewing Activity Details in a Report Format
          • Comparing Similar Activities
          • Viewing the Distribution of Task Attempts
        • Monitoring Impala Queries
          • Query Details
        • Monitoring YARN Applications
        • Monitoring Spark Applications
      • Events
      • Triggers
        • Cloudera Manager Trigger Use Cases
      • Lifecycle and Security Auditing
      • Charting Time-Series Data
        • Dashboards
        • tsquery Language
        • Metric Aggregation
      • Logs
        • Viewing the Cloudera Manager Server Log
        • Viewing the Cloudera Manager Agent Logs
        • Managing Disk Space for Log Files
      • Reports
        • Directory Usage Report
        • Disk Usage Reports
        • Activity, Application, and Query Reports
        • The File Browser
        • Downloading HDFS Directory Access Permission Reports
      • Troubleshooting Cluster Configuration and Operation
    • Cloudera Manager Entity Types
    • Cloudera Manager Entity Type Attributes
    • Cloudera Manager Events
      • LOG_MESSAGE Category
      • SYSTEM Category
      • ACTIVITY_EVENT Category
      • AUDIT_EVENT Category
      • HBASE Category
      • HEALTH_CHECK Category
    • Cloudera Manager Health Tests
      • Active Database Health Tests
      • Active Key Trustee Server Health Tests
      • Activity Monitor Health Tests
      • Alert Publisher Health Tests
      • Beeswax Server Health Tests
      • Cloudera Management Service Health Tests
      • DataNode Health Tests
      • Event Server Health Tests
      • Failover Controller Health Tests
      • Flume Health Tests
      • Flume Agent Health Tests
      • Garbage Collector Health Tests
      • HBase Health Tests
      • HBase REST Server Health Tests
      • HBase Thrift Server Health Tests
      • HDFS Health Tests
      • History Server Health Tests
      • Hive Health Tests
      • Hive Metastore Server Health Tests
      • HiveServer2 Health Tests
      • Host Health Tests
      • Host Monitor Health Tests
      • HttpFS Health Tests
      • Hue Health Tests
      • Hue Server Health Tests
      • Impala Health Tests
      • Impala Catalog Server Health Tests
      • Impala Daemon Health Tests
      • Impala Llama ApplicationMaster Health Tests
      • Impala StateStore Health Tests
      • JobHistory Server Health Tests
      • JobTracker Health Tests
      • JournalNode Health Tests
      • Kafka Broker Health Tests
      • Kafka MirrorMaker Health Tests
      • Kerberos Ticket Renewer Health Tests
      • Key Management Server Health Tests
      • Key Management Server Proxy Health Tests
      • Key-Value Store Indexer Health Tests
      • Lily HBase Indexer Health Tests
      • Load Balancer Health Tests
      • Logger Health Tests
      • MapReduce Health Tests
      • Master Health Tests
      • Monitor Health Tests
      • NFS Gateway Health Tests
      • NameNode Health Tests
      • Navigator Audit Server Health Tests
      • Navigator HSM KMS Metastore Health Tests
      • Navigator HSM KMS Proxy Health Tests
      • Navigator Metadata Server Health Tests
      • NodeManager Health Tests
      • Oozie Health Tests
      • Oozie Server Health Tests
      • Passive Database Health Tests
      • Passive Key Trustee Server Health Tests
      • RegionServer Health Tests
      • Reports Manager Health Tests
      • ResourceManager Health Tests
      • SecondaryNameNode Health Tests
      • Sentry Health Tests
      • Sentry Server Health Tests
      • Service Monitor Health Tests
      • Solr Health Tests
      • Solr Server Health Tests
      • Spark Health Tests
      • Spark (Standalone) Health Tests
      • Spark 2 Health Tests
      • Sqoop 2 Health Tests
      • Sqoop 2 Server Health Tests
      • Tablet Server Health Tests
      • TaskTracker Health Tests
      • Tracer Health Tests
      • WebHCat Server Health Tests
      • Worker Health Tests
      • YARN (MR2 Included) Health Tests
      • ZooKeeper Health Tests
      • ZooKeeper Server Health Tests
    • Cloudera Manager Metrics
      • Accumulo Metrics
      • Accumulo 1.4 Metrics
      • Active Database Metrics
      • Active Key Trustee Server Metrics
      • Activity Metrics
      • Activity Monitor Metrics
      • Agent Metrics
      • Alert Publisher Metrics
      • Attempt Metrics
      • Beeswax Server Metrics
      • Cloudera Management Service Metrics
      • Cloudera Manager Server Metrics
      • Cluster Metrics
      • DataNode Metrics
      • Directory Metrics
      • Disk Metrics
      • Event Server Metrics
      • Failover Controller Metrics
      • Filesystem Metrics
      • Flume Metrics
      • Flume Channel Metrics
      • Flume Sink Metrics
      • Flume Source Metrics
      • Garbage Collector Metrics
      • HBase Metrics
      • HBase REST Server Metrics
      • HBase RegionServer Replication Peer Metrics
      • HBase Thrift Server Metrics
      • HDFS Metrics
      • HDFS Cache Directive Metrics
      • HDFS Cache Pool Metrics
      • HRegion Metrics
      • HTable Metrics
      • History Server Metrics
      • Hive Metrics
Cloudera Installation

This guide provides instructions for installing Cloudera software.

To upgrade Cloudera software, see Cloudera Upgrade Overview.

Continue reading:

  • Configuration Requirements for Cloudera Manager, Cloudera Navigator, and CDH 5
  • Managing Software Installation Using Cloudera Manager
  • Installing Cloudera Manager and CDH
  • Installing and Deploying CDH Using the Command Line
  • Troubleshooting Installation and Upgrade Problems

© 2018 Cloudera, Inc. All rights reserved. Apache Hadoop and associated open source project names are trademarks of the Apache Software Foundation. For a complete list of trademarks, click here.

Page generated April 18, 2018.