Cloudera

Thanks for your interest

All webinars in the four-part series are available to watch on-demand below.

Part 1: Introducing Cloudera Data Science Workbench

Cloudera Data Science Workbench is a new tool that will enable collaborative, customizable, self-service access by data scientists to secure Hadoop environments via Python, R, and Scala. It can be installed on any existing cluster, whether on-premises or in the cloud.

Matt Brandwein, director of product management at Cloudera, and Tristan Zajonc, senior engineering manager at Cloudera, discuss:

  • The emergence of open source tools for data science
  • Common gaps in the ecosystem
  • A new tool from Cloudera
  • Demonstration
  • Q&A

Speakers

 

Director of Product Management, Cloudera

Matt Brandwein

Matt Brandwein is director of product management at Cloudera, driving the platform’s experience for data scientists and data engineers. Before that, Matt led Cloudera’s product marketing team, with roles spanning product, solution, and partner marketing. Previously, he built enterprise search and data discovery products at Endeca/Oracle. Matt holds degrees in computer science and mathematics from the University of Massachusetts Amherst.

Sr. Engineering Manager, Cloudera

Tristan Zajonc

Tristan Zajonc is a senior engineering manager at Cloudera. Previously, he was cofounder and CEO of Sense, a visiting fellow at Harvard’s Institute for Quantitative Social Science, and a consultant at the World Bank. Tristan holds a PhD in public policy and an MPA in international development from Harvard and a BA in economics from Pomona College.


Part 2: A Visual Dive into Machine Learning and Deep Learning

Cloudera Data Science Workbench helps data scientists get ready access to Hadoop data, use the newest machine learning and deep learning frameworks, and deliver value more quickly, all in a secure environment.

Sean Anderson, senior manager of data science marketing at Cloudera, and Vartika Singh, solutions architect for data science at Cloudera, discuss:

  • An introduction to machine learning and deep learning
  • Common practices and tools
  • A new tool from Cloudera
  • Demonstration
  • Q&A

Speakers

 

Solutions Architect, Cloudera

Vartika Singh

Vartika Singh is a solutions architect at Cloudera with over 12 years of experience in applying machine learning technologies to industry problems ranging from advertising to imaging.

Sr. Product Marketing Manager, Cloudera

Sean Anderson

Sean Anderson is a marketing manager for IT Solutions at Cloudera, the pioneers of Apache Hadoop. He is a seasoned infrastructure scaling and cloud strategy consultant with a strong focus on strategic partnerships and innovative hybrid technology. Sean has become a go-to resource and speaker for data-specific workloads, focusing on technologies like Hadoop, MongoDB, Redis, Elasticsearch, SQL, and data warehousing.


Part 3: Models in Production: A Look From Beginning to End

Apache Hadoop can support all stages of the data science lifecycle, but how this is done is still more art than science because it requires coordinating different teams and technologies. This webinar demonstrates a simple reference architecture for connecting the output of exploratory data science in Cloudera Data Science Workbench with production deployment on Hadoop. This includes data engineering with Spark, modeling with Spark MLlib, and production build and deployment via git, Maven, and Spark Streaming.

Speakers

 

Director, Data Science, Cloudera

Sean Owen

Sean is Director of Data Science at Cloudera in London. Before Cloudera, he founded Myrrix Ltd. (now the Oryx project) to commercialize large-scale real-time recommender systems on Apache Hadoop. He is an Apache Spark committer and a co-author of O’Reilly Media’s Advanced Analytics with Spark. He was a committer and VP for Apache Mahout, and co-author of Mahout in Action. Previously, Sean was a senior engineer at Google. He holds an MBA from London Business School and a BA from Harvard University.

Sr. Product Marketing Manager, Cloudera

Sean Anderson



Part 4: Cloudera Data Science Workbench: sparklyr, implyr, and More: dplyr Interfaces to Large-Scale Data

When working with various data sources, dplyr can function differently and present a few challenges. In this webinar, Ian Cook, R contributor and data scientist at Cloudera, will discuss sparklyr (from RStudio) and the package implyr (from Cloudera). He’ll show you how to write dplyr code that works across these different interfaces.

Speaker

 

Sr. Curriculum Developer, Cloudera

Ian Cook

Ian Cook is a data scientist at Cloudera and the author of several R packages including implyr. Previously, Ian was a data scientist at TIBCO and a statistical software developer at Advanced Micro Devices. Ian is cofounder of Research Triangle Analysts, the largest data science meetup group in the Raleigh, North Carolina, area, where he lives with his wife and two young children. He holds an MS in statistics from Lehigh University and a BS in applied mathematics from Stony Brook University.
