The Cask Data Application Platform (CDAP) provides an abstraction layer and runtime services enabling developers, even those without extensive Hadoop experience, to innovate and deliver new big data applications.

 

CDAP is available in three versions. The standalone SDK enables developers to develop and debug applications. The standalone VM provides the same functionality, and is recommended for Windows users. The distributed version of CDAP is available in RPM and DEB bundles. Applications developed on the standalone versions of CDAP can be deployed without modification on the CDAP distributed package running on a CDH cluster.

 

Supporting the installation and management of CDAP is the CDAP CSD (Custom Services Descriptor) for Cloudera Manager.

 

CDAP supports these operating systems:

  • Red Hat Enterprise Linux and CentOS 5.7, 64-bit
  • Red Hat Enterprise Linux and CentOS 5.10, 64-bit
  • Red Hat Enterprise Linux and CentOS 6.4, 64-bit
  • Red Hat Enterprise Linux and CentOS 6.4 in SELinux mode
  • Red Hat Enterprise Linux and CentOS 6.5, 64-bit

CDAP supports JDK 1.7.x or JDK 1.8.x. Please refer to the Cloudera Manager requirements for installing and upgrading Java.
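
As a rough way to check a host against the JDK requirement above, the sketch below parses a version line of the kind printed by `java -version` and accepts only 1.7.x or 1.8.x. The function name `is_supported_jdk` and the sample strings are illustrative assumptions, not part of CDAP or Cloudera Manager.

```python
import re


def is_supported_jdk(version_output: str) -> bool:
    """Return True if the text reports a 1.7.x or 1.8.x Java version.

    Hypothetical helper: expects a line such as
    java version "1.7.0_67" (the style printed by `java -version`).
    """
    match = re.search(r'version "(\d+)\.(\d+)', version_output)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    # JDK 7 and JDK 8 report themselves as 1.7 and 1.8.
    return major == 1 and minor in (7, 8)
```

Note that `java -version` usually writes to standard error, so a caller capturing its output would need to read that stream.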


CDAP Console requires Node.js version 0.10.* or higher.
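
The Node.js requirement can be checked in a similar way. This is a minimal sketch that compares the string printed by `node --version` (for example `v0.10.36`) against the 0.10 minimum stated above; the helper name `node_version_ok` is an assumption made for illustration.

```python
def node_version_ok(version: str, minimum=(0, 10)) -> bool:
    """Return True if a Node.js version string is at least `minimum`.

    Hypothetical helper: accepts strings such as "v0.10.36".
    """
    parts = version.strip().lstrip("v").split(".")
    try:
        major_minor = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        # Anything that does not look like "vMAJOR.MINOR..." is rejected.
        return False
    # Tuples compare element by element, so (4, 2) >= (0, 10) holds.
    return major_minor >= minimum
```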


Certain YARN containers launched by CDAP connect to ZooKeeper. It is recommended that 'maxClientCnxns' be set to zero (unlimited).
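
To confirm the setting, one option is to inspect the ZooKeeper configuration text directly. The sketch below assumes a `zoo.cfg`-style file of `key=value` lines; the function name `max_client_cnxns` and the sample content are illustrative. On a cluster managed by Cloudera Manager, the value would normally be changed through the ZooKeeper service configuration rather than by editing the file by hand.

```python
def max_client_cnxns(zoo_cfg_text: str):
    """Return maxClientCnxns from zoo.cfg-style text, or None if it is unset."""
    for raw_line in zoo_cfg_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "maxClientCnxns":
            return int(value.strip())
    return None


# Assumed sample content; 0 means unlimited connections per client.
sample_cfg = "tickTime=2000\n# connection limit\nmaxClientCnxns=0\n"
recommended = max_client_cnxns(sample_cfg) == 0
```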

 

Kerberos-enabled clusters require additional settings and setup which are not currently managed by Cloudera Manager:

  • The 'cdap' user needs to be granted HBase permissions to create tables. Run "grant 'cdap', 'CRW'" in an HBase shell.
  • The 'cdap' user must be able to launch YARN containers, often by adding it to the YARN "allowed.system.users".

Confirm that YARN is configured properly to run MapReduce programs. Often, this includes ensuring that the HDFS "/user/yarn" directory exists with proper permissions.

 

Lower the default minimum YARN container size by adjusting the configuration "yarn.scheduler.minimum-allocation-mb" appropriately. 

 

Some versions of Hive can attempt to create a temporary staging directory at the table location when executing queries. If permission issues are observed when running a query, set "hive.exec.stagingdir" in your Hive configuration to a temporary directory such as "/tmp/hive-staging". This can be set through Cloudera Manager under the "Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml" configuration field.
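
Entries in hive-site.xml use the standard Hadoop `<property>` format with `<name>` and `<value>` children. As an illustration, the sketch below builds that snippet for the staging directory setting so it can be pasted into the configuration field named above; the helper `hive_property_snippet` is hypothetical, and the directory is the example value from the text.

```python
import xml.etree.ElementTree as ET


def hive_property_snippet(name: str, value: str) -> str:
    """Build a Hadoop-style <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


snippet = hive_property_snippet("hive.exec.stagingdir", "/tmp/hive-staging")
print(snippet)
# prints: <property><name>hive.exec.stagingdir</name><value>/tmp/hive-staging</value></property>
```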


Quickstart guide for installing and using CDAP within Cloudera Manager

 

Setup:

1) Install CDAP Custom Service Descriptor (CSD)

2) Download and Distribute the CDAP parcel

3) Run the "Add Service" wizard and select "CDAP":

Wizard Page 2: The optional Hive dependency is for the optional CDAP "Explore" component, which can be enabled.

Wizard Page 3: The CDAP "Security Auth" Service is an optional service for CDAP perimeter security; it can be configured and enabled.

Wizard Page 5: "Kerberos Auth Enabled" is needed if running against a secure Hadoop cluster.

Wizard Page 5: "Router Server Port" should match the "Router Bind Port"; it is used by the UI to connect to the Router service.

 

Startup:

After the Setup Wizard completes, the "Quick Link" from the "Cask DAP" service should load the UI (by default, on port 9999 of the host where the Web-App role instance is running). The UI may initially show errors while the CDAP YARN containers are starting up; allow a few minutes for this. The "System Health" section on the Overview page shows the status of the CDAP services. They should all turn green once startup is complete.
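
Because startup can take a few minutes, a small script can wait for the UI port before you open the page. The sketch below only tests whether a TCP connection to the port succeeds; it does not check the "System Health" status, which still has to be read from the Overview page. The function name `wait_for_port` and the default timings are assumptions chosen for illustration.

```python
import socket
import time


def wait_for_port(host: str, port: int,
                  timeout_s: float = 180.0, interval_s: float = 5.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For example, `wait_for_port("webapp-host.example.com", 9999)` would wait up to three minutes for the default UI port, where the host name is a placeholder for the machine running the Web-App role instance.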

 

How-to guides for using CDAP to build applications are available.


