CDAP is integrated with Cloudera Enterprise.
Using the CDAP Custom Service Descriptor (CSD) for Cloudera Manager, CDAP can be launched, upgraded, and monitored directly from within Cloudera Manager. CDAP is also integrated with Impala, which simplifies ingesting, streaming, and batch loading data into Impala.
The Cask Data Application Platform (CDAP) provides an abstraction layer and runtime services enabling developers, even those without extensive Hadoop experience, to innovate and deliver new big data applications.
- System Requirements
- Quickstart Guide
CDAP is available in three versions. The standalone SDK enables developers to develop and debug applications. The standalone VM provides the same functionality, and is recommended for Windows users. The distributed version of CDAP is available in RPM and DEB bundles. Applications developed on the standalone versions of CDAP can be deployed without modification on the CDAP distributed package running on a CDH cluster.
The CDAP CSD (Custom Service Descriptor) for Cloudera Manager supports the installation and management of CDAP.
Supported Operating Systems
CDAP supports these operating systems:
- Red Hat Enterprise Linux and CentOS 5.7, 64-bit
- Red Hat Enterprise Linux and CentOS 5.10, 64-bit
- Red Hat Enterprise Linux and CentOS 6.4, 64-bit
- Red Hat Enterprise Linux and CentOS 6.4 in SELinux mode
- Red Hat Enterprise Linux and CentOS 6.5, 64-bit
Supported JDK versions
CDAP supports JDK 1.7.x or JDK 1.8.x. Please refer to the Cloudera Manager requirements for installing and upgrading Java.
CDAP Console supports these browsers:
- Firefox 11 or later
- Google Chrome
- Safari 5 or later
CDAP Console requires Node.js version 0.10.* or higher.
Additional Configuration Instructions
Certain YARN containers launched by CDAP connect to ZooKeeper. It is recommended that the ZooKeeper setting "maxClientCnxns" be set to zero (unlimited).
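As a rough way to confirm the setting, the sketch below reads "maxClientCnxns" from the text of a ZooKeeper configuration file and reports whether it is zero. The function names and the sample configuration text are illustrative assumptions; on a cluster managed by Cloudera Manager the value is normally changed through the ZooKeeper service configuration rather than by editing a file by hand.

```python
def max_client_cnxns(zoo_cfg_text):
    """Return maxClientCnxns from zoo.cfg-style text, or None if it is not set."""
    value = None
    for raw in zoo_cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "maxClientCnxns":
            value = int(val.strip())  # the last occurrence in the text wins
    return value


def is_unlimited(zoo_cfg_text):
    """True only when maxClientCnxns is explicitly set to zero (unlimited)."""
    return max_client_cnxns(zoo_cfg_text) == 0


# Hypothetical configuration text, for illustration only.
sample = "tickTime=2000\n# client connection limit\nmaxClientCnxns=0\n"
print(is_unlimited(sample))
```

A result of None from max_client_cnxns means the property is absent and ZooKeeper's own default applies, which is not the recommended unlimited setting.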
Kerberos-enabled clusters require additional settings and setup which are not currently managed by Cloudera Manager:
The 'cdap' user needs to be granted HBase permissions to create tables. Run "grant 'cdap', 'CRW'" in an HBase shell.
The 'cdap' user must be able to launch YARN containers, often by adding it to the YARN "allowed.system.users".
Confirm that YARN is configured properly to run MapReduce programs. Often, this includes ensuring that the HDFS "/user/yarn" directory exists with proper permissions.
Lower the default minimum YARN container size by adjusting the configuration "yarn.scheduler.minimum-allocation-mb" appropriately.
Some versions of Hive attempt to create a temporary staging directory at the table location when executing queries. If you see permission errors when running a query, set "hive.exec.stagingdir" in your Hive configuration to a temporary directory such as "/tmp/hive-staging". This can be set through Cloudera Manager in the "Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml" configuration field.
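The property settings above use the standard Hadoop XML property format. The following sketch, with hypothetical helper names, builds the "hive.exec.stagingdir" entry that could be pasted into the safety valve field and reads a property back from configuration text. The 512 MB container size in the sample is only an illustrative assumption, not a recommended value.

```python
import xml.etree.ElementTree as ET


def property_snippet(name, value):
    """Build a Hadoop-style <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


def read_property(xml_text, name):
    """Return the value of a named property from Hadoop-style XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Entry for the Hive staging directory named in the text above.
snippet = property_snippet("hive.exec.stagingdir", "/tmp/hive-staging")
print(snippet)

# Hypothetical yarn-site.xml content; the 512 value is an assumption.
sample_yarn_site = (
    "<configuration>"
    "<property><name>yarn.scheduler.minimum-allocation-mb</name>"
    "<value>512</value></property>"
    "</configuration>"
)
print(read_property(sample_yarn_site, "yarn.scheduler.minimum-allocation-mb"))
```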
Quickstart guide for installing and using CDAP within Cloudera Manager
- Install CDAP Custom Service Descriptor (CSD)
- Download and Distribute the CDAP parcel
- Run the "Add Service" Wizard and select "CDAP":
- Wizard Page 2: The optional Hive dependency supports the optional CDAP "Explore" component, which can be enabled.
- Wizard Page 3: The CDAP "Security Auth" Service is an optional service for CDAP perimeter security; it can be configured and enabled.
- Wizard Page 5: "Kerberos Auth Enabled" is needed if running against a secure Hadoop cluster.
- Wizard Page 5: "Router Server Port" should match the "Router Bind Port"; it is used by the UI to connect to the Router service.
After the Setup Wizard completes, the "Quick Link" from the "Cask DAP" service should load the UI. (By default, this is port 9999 of the host where the Web-App role instance is running.) The UI may initially show errors while the CDAP YARN containers are starting up; allow up to a few minutes for this. The "System Health" section on the Overview page shows the status of the CDAP services. They should all turn green, indicating that startup is complete.
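Because startup can take a few minutes, it may help to wait until the UI port accepts connections before opening the page. The sketch below is one way to do that with a plain TCP check. The function name, timeout values, and host name are assumptions, and a reachable port only shows that the Web-App role is listening, not that every CDAP service is healthy.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=180.0, interval_s=5.0):
    """Retry a TCP connection until it succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


# Example call (hypothetical host name; 9999 is the default UI port):
#   wait_for_port("webapp-host.example.com", 9999)
```

The "System Health" section of the UI remains the place to confirm that the individual services have started.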