Developer Resources: Application Development
The Kite SDK (formerly Cloudera Development Kit) is a set of libraries, tools, and documentation focused on making it easier to build systems on top of the Hadoop ecosystem.
- Kite SDK home
- Presentation: Building Apps on Hadoop with the CDK (from InfoQ NYC 2013)
There are a number of connectors available for accessing Hadoop data via ODBC or JDBC.
- Cloudera Connectors for Teradata, Netezza, MicroStrategy, Tableau, and Oracle
- Apache ODBC driver for Apache Hive
- Apache JDBC driver for Apache Hive
- Progress DataDirect Hadoop Apache Hive ODBC Driver
- Simba's Apache Hive ODBC Driver
- Configuring Impala to Work with ODBC
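As a rough illustration of how an application might use one of these ODBC drivers, the sketch below queries Hive or Impala through a preconfigured ODBC data source. It assumes the third-party `pyodbc` package is installed and that a DSN has already been set up for the driver; the DSN name, user, and query shown in the usage note are hypothetical.

```python
def odbc_connection_string(dsn, **options):
    """Build an ODBC connection string such as 'DSN=HiveDSN;UID=alice'."""
    parts = [f"DSN={dsn}"] + [f"{key}={value}" for key, value in options.items()]
    return ";".join(parts)


def run_query(conn_str, sql):
    """Run one query over ODBC and return all rows."""
    import pyodbc  # third-party; imported lazily so the helper above has no dependencies

    # Assumption: autocommit is enabled because Hive and Impala ODBC drivers
    # generally do not support transactions.
    conn = pyodbc.connect(conn_str, autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()
```

With a working DSN, a call might look like `run_query(odbc_connection_string("HiveDSN", UID="alice"), "SHOW TABLES")`. Any further connection options are driver-specific, so check the documentation for the driver you install.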
YARN stands for “Yet Another Resource Negotiator”. It provides the daemons and APIs needed to develop generic distributed applications of any kind (MRv2 is one such application). It also handles and schedules resource requests (such as memory and CPU) from those applications and supervises their execution. (Note: You only need to develop a new YARN application if MapReduce, Pig, Hive, Impala, Crunch, and similar tools do not meet your needs.)
- MR2 and YARN Briefly Explained
- Writing YARN Applications
- Migrating to MapReduce 2 on YARN: For users | For operators
- Managing Multiple Resources on Hadoop 2 with YARN
- YARN REST APIs
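To make the resource-tracking role of YARN more concrete, here is a minimal sketch that lists running applications and their allocated memory and CPU through the ResourceManager REST API. It assumes the `/ws/v1/cluster/apps` endpoint and the `allocatedMB` / `allocatedVCores` fields exposed by recent Hadoop 2 releases; the ResourceManager address is a placeholder, and the helper names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def cluster_apps_url(rm_address, states=("RUNNING",)):
    """Build the ResourceManager URL that lists applications in the given states."""
    query = urlencode({"states": ",".join(states)})
    return f"http://{rm_address}/ws/v1/cluster/apps?{query}"


def summarize_apps(payload):
    """Reduce a cluster-apps response to (id, name, allocated MB, allocated vcores)."""
    # The ResourceManager returns {"apps": null} when no application matches.
    apps = (payload.get("apps") or {}).get("app", [])
    return [
        (app["id"], app.get("name"), app.get("allocatedMB"), app.get("allocatedVCores"))
        for app in apps
    ]


def fetch_running_apps(rm_address):
    """Query a live ResourceManager; rm_address is a hypothetical 'host:port'."""
    with urlopen(cluster_apps_url(rm_address)) as response:
        return summarize_apps(json.load(response))
```

For example, `fetch_running_apps("rm.example.com:8088")` would return one tuple per running application, assuming the ResourceManager web service listens on that address.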
Use these REST APIs for integrating Hadoop components with external tools and apps.
|MapReduce|MapReduce Application Master REST APIs|
|Apache Oozie|Oozie Web Services API|
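As an example of calling one of these APIs from an external tool, the sketch below builds a request URL for the Oozie Web Services API that lists workflow jobs, and extracts job statuses from the JSON response. It assumes the `v1` jobs endpoint with its `filter`, `jobtype`, `offset`, and `len` parameters; the Oozie base URL and the function names are hypothetical.

```python
from urllib.parse import urlencode


def oozie_jobs_url(base_url, status=None, user=None, offset=1, length=50):
    """Build a URL that lists workflow jobs, optionally filtered by status and user."""
    filters = []
    if status:
        filters.append(f"status={status}")
    if user:
        filters.append(f"user={user}")
    params = {"jobtype": "wf", "offset": offset, "len": length}
    if filters:
        # Oozie expects name=value pairs joined by ';', URL-encoded as a single value.
        params["filter"] = ";".join(filters)
    return f"{base_url.rstrip('/')}/v1/jobs?{urlencode(params)}"


def workflow_statuses(payload):
    """Map workflow job id to status from a parsed jobs response."""
    return {wf["id"]: wf["status"] for wf in payload.get("workflows", [])}
```

Fetching the resulting URL with any HTTP client and passing the decoded JSON to `workflow_statuses` gives a quick view of what is running, which is often enough for a monitoring dashboard.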
Apache Thrift APIs
The Thrift APIs in HDFS and HBase expose Hadoop data as Apache Thrift services, so any non-JVM language that has Thrift bindings can interact with it.
- Using Perl and Thrift to access HDFS (via Gino Ledesma)
- Using the HBase Thrift Gateway from Python (Sample Chapter from HBase in Action)
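For a sense of what Thrift access looks like from a non-JVM language, here is a minimal Python sketch that connects to an HBase Thrift gateway and lists table names. It assumes the third-party `thrift` package is installed, that the `hbase` bindings have been generated from HBase's `Hbase.thrift` file (for example with `thrift --gen py Hbase.thrift`), and that the gateway listens on port 9090; the host name in the usage note is a placeholder.

```python
def open_hbase_client(host, port=9090):
    """Open a connection to an HBase Thrift gateway; returns (client, transport)."""
    # Imported lazily: both packages are external to the standard library.
    from thrift.protocol import TBinaryProtocol
    from thrift.transport import TSocket, TTransport
    from hbase import Hbase  # assumption: bindings generated from Hbase.thrift

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()
    return client, transport


def table_names(client):
    """Return the gateway's table names as a sorted list of str."""
    # Depending on the Thrift and Python versions, names may arrive as bytes.
    return sorted(
        name.decode("utf-8") if isinstance(name, bytes) else name
        for name in client.getTableNames()
    )
```

A typical session would call `client, transport = open_hbase_client("hbase-thrift.example.com")`, then `table_names(client)`, and finally `transport.close()` once done.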