San Jose State University Creates ‘Data Wranglers’ in Partnership with Cloudera

Overview

San Jose State University (SJSU) is the oldest public institution of higher education on the United States' West Coast. The school was founded in 1857 to train teachers for the “developing frontier.” [1] A lot has changed since then, but SJSU’s present-day tagline, “powering Silicon Valley,” demonstrates a consistent goal: to arm students with the knowledge and experience that will help them thrive in today’s science- and technology-driven market.



One of SJSU’s adjunct professors, Peter Zadrozny, educates students on Big Data analytics. Zadrozny brings a wealth of real-world experience to the courses he teaches, with a background as a software architect and developer at companies ranging from start-ups to the Fortune 500. His goal is to offer students hands-on experience with Big Data technologies that hiring managers are looking for.

Use Case

In developing the curriculum for SJSU’s Big Data Analytics course, Zadrozny decided the logical approach would be to teach Apache Hadoop and Splunk. Within the Hadoop curriculum, students learn Hive, which lets them apply existing analytical skills, including SQL, to the Big Data sets at the core of the emerging data economy.

A key part of the course is having students deliver a Big Data project that demonstrates they know how to work with the tools. In addition to partnering with Cloudera and Splunk, SJSU has established a partnership with GoGrid to give students a cloud-based environment on which to build their Big Data projects.

Zadrozny led SJSU’s participation in the Cloudera Academic Partnership (CAP) program to streamline and accelerate development of the Hadoop curriculum. He noted, “When people think Hadoop, they think Cloudera. I have to give students something that makes them marketable. If I don’t teach Hadoop on Cloudera, their chances of getting a job are slimmer.”

As part of the CAP program, Cloudera provides SJSU with:

For their projects, students gain experience working with live data from sources such as the Federal Aviation Administration, Foursquare, IMDb, Twitter, and Yelp. They learn how to set up a Hadoop cluster, load data, query it using Hive, verify that their queries are running properly, and then visualize and communicate the results of their analyses.
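The load, query, and verify steps described above can be sketched in a few lines. This is only an illustration, not the course's actual material: it uses Python's built-in sqlite3 module as a stand-in for Hive (the aggregate query is ordinary SQL of a kind Hive also accepts), and the table name, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (carrier, origin airport, arrival delay in minutes).
# These values are invented for illustration; they are not real FAA data.
FLIGHTS = [
    ("AA", "SJC", 12),
    ("AA", "SFO", 30),
    ("UA", "SJC", 0),
    ("UA", "SFO", 45),
    ("UA", "SFO", 15),
]


def load(conn, rows):
    """Load step: create a table and insert the raw records."""
    conn.execute(
        "CREATE TABLE flights (carrier TEXT, origin TEXT, arr_delay INTEGER)"
    )
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?)", rows)


def avg_delay_by_origin(conn):
    """Query step: average delay and row count per origin airport."""
    cur = conn.execute(
        "SELECT origin, AVG(arr_delay) AS avg_delay, COUNT(*) AS n "
        "FROM flights GROUP BY origin ORDER BY origin"
    )
    return cur.fetchall()


def verify(results, expected_rows):
    """Verification step: per-group counts must add up to the rows loaded."""
    return sum(n for _, _, n in results) == expected_rows


# sqlite3 in-memory database stands in for a Hive table on a Hadoop cluster.
conn = sqlite3.connect(":memory:")
load(conn, FLIGHTS)
results = avg_delay_by_origin(conn)
assert verify(results, len(FLIGHTS))

for origin, avg_delay, n in results:
    print(f"{origin}: {avg_delay:.1f} min average over {n} flights")
```

The check in `verify` is deliberately simple: comparing the grouped row counts against the number of records loaded is one inexpensive way to confirm that a query saw all of the data before its results are visualized.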

“We encourage students to tell a story with the data,” explained Zadrozny. “As you start digging into it, you find interesting things, unusual facts, things that you wouldn’t have anticipated or that are historically relevant.”

Impact: Improved Marketability Through Practical Education

“Whenever I go to job fairs, if I have something on my resume about Hive, Hadoop, or Big Data, that’s what hiring managers ask about,” said Tanuvir Singh, a student of the course pursuing his master’s degree in computer science.

The Big Data Analytics course at SJSU is very popular, largely due to its integration of hands-on exercises. “The support of Cloudera with the Cloudera Academic Partnership has been incredible,” commented Zadrozny. “It provides slides and courseware that we can use for teaching the theory, and also allows students to get the experience they need. It’s not only theoretical; it is also very practical.”

[1] http://www.sjsu.edu/about_sjsu/facts_and_figures/index.html