Introduction to Data Science
Cloudera University’s three-day introduction to data science develops the skills required to build information platforms and analytical tools that reduce costs, increase profits, improve products, retain customers, and identify new opportunities. This course is part of both the developer learning path and the data analyst learning path.
The Cloudera instructor was excellent, offering clear and concise instruction that was easy to understand. His wide-ranging peripheral knowledge helped apply the course material to real-world situations. I look forward to attending another class!
Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:
- The role of data scientists, vertical use cases, and business applications of data science
- Where and how to acquire data, methods for evaluating source data, and data transformation and preparation
- Types of statistics and analytical methods, and how they relate to one another
- Machine learning fundamentals and breakthroughs, the importance of algorithms, and data as a platform
- How to implement and manage recommenders using Apache Mahout, and how to set up and evaluate data experiments
- Steps for deploying new analytics projects to production and tips for working at scale
Audience & Prerequisites
This course is best suited to developers, data analysts, and statisticians with basic knowledge of Apache Hadoop, including HDFS, MapReduce, Hadoop Streaming, and Apache Hive. Students should be proficient in a scripting language; Python is strongly preferred, but familiarity with Perl or Ruby is sufficient.