Course Prerequisites

Administrator: This course is best suited to systems administrators and IT managers who have basic Linux experience. Prior knowledge of Apache Hadoop is not required.

Data Analyst: This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Knowledge of SQL is assumed, as is basic Linux command-line familiarity. Knowledge of at least one scripting language (e.g., Bash scripting, Perl, Python, Ruby) would be helpful but is not essential. Prior knowledge of Apache Hadoop is not required.

Developer for Spark & Hadoop: This course is designed for developers and engineers who have programming experience. Apache Spark examples and hands-on exercises are presented in Scala and Python, so the ability to program in one of those languages is required. Basic knowledge of SQL is helpful; prior knowledge of Hadoop is not required.

Search: This course is intended for developers and data engineers with at least basic familiarity with Hadoop and experience programming in a general-purpose language such as Java, C, C++, Perl, or Python. Participants should be comfortable with the Linux command line and should be able to perform basic tasks such as creating and removing directories, viewing and changing file permissions, executing scripts, and examining file output. No prior experience with Apache Solr or Cloudera Search is required, nor is any experience with HBase or SQL.

HBase: This course is best suited to developers and administrators who have experience with databases and data modeling, although such experience is not required. Prior knowledge of Apache Hadoop is not required.

Data Scientist Training: This course is designed for data scientists who currently use Python or R to work with smaller datasets on a single machine and who need to scale up their analyses and machine learning models to large datasets on distributed clusters. Data engineers and developers with some knowledge of data science and machine learning may also find this workshop useful. Workshop participants should have a basic understanding of Python or R and some experience exploring and analyzing data and developing statistical or machine learning models. Knowledge of Hadoop or Spark is not required.

Intro to Machine Learning: This course has no formal prerequisites, but students must know Python or Scala to understand the material covered. Please note that this course does not teach big data concepts, nor does it cover how to use Cloudera software. Instead, it is meant as a follow-up to our Developer Training for Spark and Hadoop course.

To view course setup requirements please click here.

If you have any questions, please contact