
50+ Leading Data Experts from Top Companies Cover Topics from the Edge to AI

Cloudera Now is the world’s premier online big data event designed for architects, analysts, data scientists, engineers, IT specialists, developers and DBAs. Don’t miss this chance to hear about the latest developments in AI, Machine Learning, IoT, Cloud, and more in over 50 sessions, technical deep dives, demonstrations and customer use cases.

Get answers to your top questions while learning how to apply open source technology to accelerate your digital transformation in any cloud from the Edge to AI.


Doug Cutting, Chief Architect, Cloudera

Journey to the Enterprise Data Cloud

Doug Cutting, Hadoop Founder, Cloudera

Introducing Cloudera Data Science Workbench 1.6 and New ML Services

Hilary Mason, GM Machine Learning, Cloudera

Vortex of Change, Business Metamorphosis (In a Digital Age)

Rachel Mushahwar, Vice President and General Manager of US Sales, Intel

Building Cross-Cloud Pipelines with Kubeflow

Holden Karau, Developer Advocate, Google

Why Attend?

Big data, data science, and big innovation are in your DNA. Join your peers at Cloudera Now where you’ll hear industry experts and practitioners as they share real-world use-cases, success stories, and best practices.

Built for a technical audience

  • Architects
  • Analysts
  • Data Scientists
  • Engineers 
  • IT Specialists 
  • Developers 
  • DBAs 
  • Data Practitioners 
  • IoT Experts

Action oriented topics

  • Migration to the cloud, hybrid cloud, and multi-cloud

  • How to do Machine Learning at industrial scale

  • The latest open source advances such as Kubernetes, Apache NiFi, etc.

  • and much more

Access to 50+ sessions

  • Free event 
  • On-demand access after the event
  • 30+ technologies


Sessions by Category

Featured Speakers

Welcome keynote

Doug Cutting, Hadoop Founder, Cloudera

Building Cross-Cloud Pipelines with Kubeflow

Holden Karau, Developer Advocate, Google

Cloudera's Internal Big Data Projects - Keys To Success

Amy O'Connor, Chief Data & Information Officer, Cloudera | Alan Jackoway, Senior Corporate EDH Manager, Cloudera | Rob Johnson, Director Data Engineering & Data Science, Cloudera

Data Architectures for the Hybrid and Multi Cloud-Era

Arun C. Murthy, Chief Product Officer, Cloudera

Finding Business Value in Big Data

Bill Inmon, Founder, Chairman, CEO, Best-selling Author, Forest Rim Technology

Big Data covers quite a bit of territory. Some forms of Big Data have a small amount of business value. Other forms of Big Data have an enormous amount of business value. This presentation discusses where that business value is and how to start turning Big Data into a cash machine.

Sponsor Sessions

Lessons Learned from Five Real AI Deployments

David Tareen, Global Product Marketing for AI, SAS

What happens when you deploy AI for machine learning, computer vision and natural language processing projects? Come learn from our experience of deploying AI projects around the world. The business challenges are as diverse as the stories and you will walk away with five solid lessons learned that can be applied to AI projects anywhere.

Redesigning Databases for Persistent Memory Technology

Jason Lamb, Cloud / Software Defined Infrastructure Technology Solutions Specialist, Intel

Vortex of Change, Business Metamorphosis (In a Digital Age)

Rachel Mushahwar, Vice President and General Manager of US Sales, Intel

OpenShift 4 Release Update

Siamak Sadeghianfar, Principal Product Manager, Red Hat | Daniel Messer, Senior Product Manager, Red Hat

Evolution of Apache HBase in Azure HDInsight

Ron Abellera, Cloud Architect, Microsoft | Gaurav Kanade, Software Engineer, Microsoft

Climbing the AI Ladder

Rob Thomas, General Manager, IBM Data and Watson AI

Every company is on a journey toward AI, and IBM wants to help you get there. Find out how to collect, organize and analyze data and infuse AI into your business processes.

Modernize your Data Warehouse with Informatica and Cloudera

Sam Tawfit, Principal Marketing Manager, Informatica | Premchand Bellamkonda, Senior Product Manager, Informatica

In this webinar you will learn how Informatica and Cloudera together help you deliver a proven solution to rapidly transform data into trusted information with comprehensive data lake management. We will discuss key data warehouse modernization trends and disruptors and the need for a modern data warehouse solution to address new requirements. The solution helps you operationalize your data management and deliver self-service data preparation with governance.

Avoiding Log Data Overload in a CI/CD System While Streaming 190 Billion Events Daily

Joshua Robinson, Founding Engineer, FlashBlade, Pure Storage

Learn how Pure Storage engineering manages streaming 190B log events per day and makes use of that deluge of data in our continuous integration (CI) pipeline. Our test infrastructure runs over 70,000 tests per day, creating a large triage problem that would otherwise require at least 20 triage engineers. Instead, Spark's flexible computing platform allows us to write a single application for both streaming and batch jobs, so that our team of 3 triage engineers can understand the state of our CI pipeline.
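To put those figures in perspective, here is a rough back-of-the-envelope calculation. It uses only the numbers quoted in the abstract and assumes, purely for illustration, that events and tests are spread evenly over a 24-hour day. The variable names are hypothetical and not taken from the session.

```python
# Back-of-the-envelope rates derived from the figures quoted in the abstract.
# Assumption: load is spread evenly across the day (real traffic is bursty).
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 190_000_000_000   # 190B log events per day
tests_per_day = 70_000             # "over 70,000", so treat as a lower bound
manual_triage_engineers = 20       # "at least 20" without automation
actual_triage_engineers = 3        # team size quoted in the abstract

# Average sustained ingest rate the streaming pipeline has to absorb.
events_per_second = events_per_day / SECONDS_PER_DAY

# Test results each engineer would be responsible for per day.
tests_per_engineer_manual = tests_per_day / manual_triage_engineers
tests_per_engineer_actual = tests_per_day / actual_triage_engineers

print(f"{events_per_second:,.0f} events/second on average")
print(f"{tests_per_engineer_manual:,.0f} tests/engineer/day with 20 engineers")
print(f"{tests_per_engineer_actual:,.0f} tests/engineer/day with 3 engineers")
```

On these assumptions the pipeline averages roughly 2.2 million events per second, and each of the 3 engineers covers about 23,000 test runs per day, which is only workable if most of the triage is automated.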

Bullet-Proof Your Hybrid Cloud Strategy

Dave Mariani, Founder and Chief Strategy Officer, AtScale

Human-Centered Data Science

Ian McCulloh, Chief Data Scientist, PhD, Accenture Federal Services

The modern information environment has changed the paradigm for decision making and the way we deliver government services to citizens. We used to live in an age of information scarcity, where data needed to be carefully collected and curated. Today, we live in an age of information superabundance, where we must decide what to discard. Throughout this process, it is critically important to keep humans in the center and to identify the problems and decisions they must address. This talk will review some successes and challenges with government adoption of data science. Successful delivery requires that humans remain at the center of any big data solution.

Cloudera's New Cloud - Cloudera Data Platform (CDP)

Cloudera Data Platform: Cloudera's New Cloud - Product Demos

Fred Koopmans, VP Product Management, Cloudera | Ram Venkatesh, VP Engineering, Cloudera | Naren Koneru, Sr. Director, Engineering, Cloudera | Vidya Raman, Director of Product Management, Cloudera | Santosh Kumar, Senior Product Manager, Cloudera

See live demos of Cloudera's new cloud offering, the Cloudera Data Platform (CDP): the only Enterprise Data Cloud to encompass hybrid and multi-cloud, multi-function analytics, and security and governance, all built on open source.

Experience the First Hybrid Multi-Cloud Data Warehouse on CDP

David Dichmann, Director of Product Marketing, Cloudera | Sid Shaik, Director of Product Management, Cloudera

Learn how Cloudera's Data Warehouse Experience on the Cloudera Data Platform can evaluate current workloads, identify candidates to take to the cloud, seamlessly transition to the cloud with auto-scale and auto-suspend, and more.

Architecture, Operations and Cloud

Five Steps to Aligning Your Data and Cloud Strategies

Charles Boicey, Chief Innovation Officer, ClearSense | Rohit Balasubramanian, Managing Director, Deloitte | Wim Stoop, Product Marketing Manager, Cloudera

Panel discussion on hybrid cloud strategy, deploying applications, data, and infrastructure on a combination of on-premises and cloud resources.

Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environment to run Mixed Workloads

Sunil Govindan, Staff Engineer, Cloudera | Weiwei Yang, Software Engineer, Cloudera

Use a common scheduler that builds on the existing capabilities of YARN and K8s, with improvements for cloud use cases: better bin-packing and gang scheduling, policy management for autoscaling up and down, and effective execution of batch workloads and services with clear SLAs.

Cloudera Data Platform: Cloudera's new Cloud - Product Demos

Fred Koopmans, VP Product Management, Cloudera | Ram Venkatesh, VP Engineering, Cloudera | Naren Koneru, Sr. Director, Engineering, Cloudera | Vidya Raman, Director of Product Management, Cloudera | Santosh Kumar, Senior Product Manager, Cloudera | David Dichmann, Director of Product Marketing, Cloudera | Sid Shaik, Director of Product Management, Cloudera

See live demos of Cloudera's new cloud offering, the Cloudera Data Platform (CDP): the only Enterprise Data Cloud to encompass hybrid and multi-cloud, multi-function analytics, and security and governance, all built on open source.

Audi's Hadoop Journey into the Hybrid Cloud

Carsten Herbe, Big Data Architect, Audi Business Innovation GmbH

Audi explains the motivation, requirements, tools and steps involved in building a Hadoop platform in AWS that extends its on-premises Hadoop cluster into a hybrid platform.

Fast Access to your Complex Data: Avro, JSON, ORC, and Parquet

Owen O'Malley, Co-founder & Technical Fellow, Cloudera

Hear about the benchmarks from Spark including the new work that radically improved the performance of Spark on ORC. Learn tips and suggestions to optimize the performance of your application while reading and writing the data.

How Walmart's Big Data Solution Saved Finance Business Users Time and Money

Sridhar Leekkala, Sr Manager II Software Engineering, Walmart | Paritosh Sundriyal, Software Engineer II, Walmart

Walmart discusses how its Data Lake initiative was developed to reduce or eliminate the effort spent on data mining and cleansing, enabling teams to focus on data analytics and on making insightful, well-informed, and collaborative decisions.

Cloud Automation Journey at Walgreens

Gupta Narayanam, Technical Architect, Walgreens | Sridhar Ramalingam, Technical Architect, Walgreens

Walgreens explains how it uses automation to build and maintain Hadoop clusters, significantly reducing the build and delivery times for new Hadoop environments in Azure.

Growing a Better Data Lake Together

Edwin Scheepstra, Business Analyst, Rabobank | Jeroen Wolffensperger, Solution Architect Data, Rabobank

Rabobank shares insights into how it optimizes value for customers, along with its success story of transforming Rabobank into a digital convenience bank using big data technologies.

Real-time Analytics at PayPal

Sanjeev Koranga, Engineering Manager, PayPal | Shobana Neelakantan, Engineering Manager, PayPal

PayPal shares its journey into real-time analytics: how it processes and handles real-time data at scale using Apache Kafka, Spark Streaming and Akka Streams, what limitations and challenges real-time data pipelines involve, and how signals from real-time analytics are used for quick feedback and decision making.

Data Warehousing and Operational Data Stores

An Independent Perspective on the Modern Data Warehouse

Richard Winter, CEO, WinterCorp

The data warehouse concept emerged around 1990, was fully formed by around 2000 and led to a boom in the implementation of data warehouses and data marts over the years that followed. A data warehouse is much more than the relational database platform on which it runs. It also has a data model, an ETL process that feeds it data, and a substantial set of processes around its governance, its evolution and its implementation. 

While data warehouses have met many important business needs for data and decision making, the world has changed substantially over the last 10-15 years. Many data warehouse owners are now confronting requirements that these earlier programs were never designed to address. This session will describe the requirements and opportunities that data warehouse program leaders face today, relevant technology trends, and approaches to consider moving forward, from the perspective of an independent consultant.

Modern Data Warehouse Fundamentals: Introduction, Performance Optimization with Impala & Hive LLAP, Workload Management, and Hybrid & Multi-Cloud

David Dichmann, Director Product Marketing, Cloudera | Greg Rahn, Director Product Management, Cloudera | Raman Rajasekhar, Sr. Product Manager, Cloudera | Santosh Kumar, Sr. Product Manager, Cloudera | Sid Shaik, Director Product Management, Cloudera

Explore new trends and use cases in data warehousing including data warehouse optimization, operations data warehousing, and discovery data warehousing. In this session experts will share techniques, technologies and architectures that support modern data warehouse use cases including support for self-service ad-hoc analysis, predictive analytics and intelligent workload management. Modern Data Warehousing Fundamentals will show how to modernize your data warehouse for benefits to both traditional analytics practitioners and all new use cases.

Time Series Analytics with the Cloudera Data Warehouse

Eva Nahari, Director Product, Cloudera | Dave Shuman, Industry and Solution Leader, Cloudera | Jeremy Beard, Data Warehouse Field Engineer, Cloudera

What do predictive maintenance, increasing availability of manufacturing equipment, optimizing for peak behavior to support grid or network performance, and preventing threats all have in common? They all require analysis of time series data at scale, in real time. Does your infrastructure have the power, storage capacity and real time user access necessary to deliver high quality time series analytics? In this session, we will explore the use cases that are driving manufacturing, energy and telecommunications companies to invest in time series analytics, and an architecture from Cloudera designed to enable 10x increase in sample frequency with over 80% decrease in operational costs for time series use cases.

Data Warehouse Optimization: See How Others Have Optimized Exadata, Teradata & Netezza Data Warehouses with Cloudera

David Dichmann, Director Product Marketing, Cloudera | Raman Rajasekhar, Senior Product Manager, Cloudera

With Data Warehousing as the backbone of every data-driven organization, we need to prepare for the data tsunami driven by a new variety of high-value BI and analytics use cases that we need to capitalize on. Traditional data warehouses can no longer cost-effectively meet all these new demands. This session will show how, using a modern data warehouse as an optimizer, we can manage the data explosion while shortening the data warehouse lifecycle, support hundreds of new use cases that arise from making this new data available, and scale cost-effectively and dynamically across a choice of environments spanning the data center and the cloud.

Apache Phoenix: Best New Feature for DBMS

Krishna Maheshwari, Director Product Management, Cloudera | Henry Sowell, Director Solutions Engineering, Cloudera | Timothy Spann, Field Engineer, Data in Motion, Cloudera

Cloudera announces support for Apache Phoenix with Cloudera Enterprise Data Hub (CDH). Apache Phoenix provides an RDBMS-like experience with scale-out architecture for custom operational & transactional applications. It is an extension to Apache HBase that provides a programmatic ANSI SQL interface, secondary indexes, foreign key support, and more.

Tips, Tricks and How-Tos to Become a Cloudera Data Warehouse Guru

Joydeep Das, VP Product Management, Cloudera

Have you ever wanted to run a large-scale data warehouse in hybrid and multi-cloud environments? Do you wish you could explore your data on your own, the moment it is ingested, with Cloudera Data Warehouse? Did you know you could get a complete view of all your workloads and proactively manage them yourself? Joydeep Das, head of Product Management for Cloudera Data Warehouse, will share these tips, and many more, to make you more successful as you grow your Cloudera Data Warehouse to handle more users and more use cases than ever before.

Things We Bet You Didn't Know Your Cloudera Data Warehouse Can Do!

Joydeep Das, VP Product Management, Cloudera | Santosh Kumar, Senior Product Manager, Cloudera

Both before and after the merger, Cloudera has offered many great capabilities to help you get the most from your Data Warehouse use cases. This session will demonstrate many of them, including some that were previously available in only one distribution and are now available to everyone.

The Modern Data Warehouse

David Loshin, President, Knowledge Integrity/TDWI Affiliate Analyst | David Dichmann, Director Product Marketing, Cloudera

There are numerous factors that motivate data warehouse modernization initiatives. Business data consumers have broadened their demands for faster and more reliable insight, and increasing volumes of streaming data must be properly handled in real time to inform faster decision making. Growing consumer communities with different demands require greater computational flexibility, and these data consumers have increasingly complex computational workloads. Data scientists want to leverage a wide swath of available data assets. These users want to access enterprise data assets in both their original and their “warehoused” formats. At the same time, development engineers want to simplify the ways that these communities are satisfied.

Existing on-premises data warehouses (and their data services teams) are unable to serve these demands. Increasingly, data warehouse modernization means migrating data environments and applications to the cloud. In this webinar, we explore the benefits of the cloud as a platform for data warehouse modernization.

Speed, Flexibility & Intelligence: Beyond The Traditional Data Warehouse

Carl Olofson, Research Vice President, IDC | David Dichmann, Director Product Marketing, Cloudera

In this Cloudera webinar featuring guest speaker Carl W. Olofson, Research Vice President, Data Management Software, IDC, learn what has changed since the original data warehouse idea, what is needed now, and what the business drivers are for building a fully digital enterprise, including considerations for security and privacy, definitional metadata, and platform requirements.

AI, ML and Data Science

Accelerating the Journey to Industrialized AI

Justin Norman, Director of Research and Data Science Services, Cloudera Fast Forward Labs

Enterprises have now been investing in machine learning (ML) and artificial intelligence (AI) for years. The dream has always been (and still is) for business to leverage these new technologies in order to drive efficiencies, differentiation and growth, and the next opportunity is about consistently building and operationalizing new capabilities at scale. However, many companies are finding that their ML and AI initiatives aren’t quite living up to the dream, and aren’t realizing their full potential. Learn how Cloudera's portfolio of platform-integrated software and services has evolved to meet you where you are and accelerate your journey to running an AI factory to fuel the future of your business.

Introducing Cloudera Data Science Workbench 1.6 and new ML services

Hilary Mason, GM Machine Learning, Cloudera | Bethan Noble, Director Product Marketing, Cloudera | Michael Gregory, ML Field Engineering Lead, Cloudera

Please join Hilary Mason, general manager, Machine Learning, at Cloudera, together with Cloudera Machine Learning product experts for a tour of newly available products and services, including Cloudera Data Science Workbench (CDSW) 1.6, and a preview of upcoming innovations that reflect Cloudera’s ongoing commitment to help customers industrialize AI.

Production ML, Today and Tomorrow

Alex Breshears, Senior Product Manager, Cloudera

Machine Learning isn’t a single activity - it’s an end-to-end workflow with many phases. It’s critical that customers look at the end-to-end capability for developing data pipelines for ML. In this session, Alex Breshears will explore the model development life cycle, from model training to the software development discipline of moving models to production, keeping them running, and monitoring overall model health, including mathematical boundaries as well as technical performance.

A Framework for Developing a Winning Data Project Portfolio

Alice Albrecht, Manager, Data Science Strategy and Advising, Cloudera

Within many organizations, data has begun to fuel many aspects of daily life, from business decisions to product functionality. In the rush to become data driven, many organizations have correctly begun by investing in their data platforms and in strategies for hiring and upskilling hard-to-find data talent. Even with the right people and infrastructure in place, many organizations are still struggling to glean value from their hefty investment in data. This talk will address a key and often missing component of a successful data strategy: the ability to build a winning data project portfolio.

Machine Learning Model Deployment: Strategy to Implementation

Justin Norman, Director of Research and Data Science Services, Cloudera Fast Forward Labs

This talk will introduce participants to the theory and practice of machine learning in production. The talk will begin with an intro on machine learning models and data science systems and then discuss data pipelines, containerization, real-time vs. batch processing, change management and versioning.

As part of this talk, the audience will learn more about:

  • How data scientists can have the complete self-service capability to rapidly build, train, and deploy machine learning models.
  • How organizations can accelerate machine learning from research to production while preserving the flexibility and agility that data scientists and modern business use cases demand.

Applying deep learning for image analysis and transfer learning for NLP

Justin Norman, Director of Research and Data Science Services, Cloudera Fast Forward Labs | Seth Hendrickson, Research Engineer, Cloudera Fast Forward Labs | Victor Dibia, Research Engineer, Cloudera Fast Forward Labs | Grant Custer, Designer-Developer, Cloudera Fast Forward Labs | Ryan Micallef, Research Engineer, Cloudera Fast Forward Labs

Image analysis and natural language processing are among the fastest growing commercial applications of machine learning, driving efficiencies and transformation in every industry with the adoption of widely available and rapidly evolving libraries, tools and off-the-shelf solutions.

For enterprise practitioners implementing business-specific capabilities, there is more to gain with a deeper understanding of how these technologies work and the newest, most efficient methods of application. Join the Cloudera Fast Forward Labs team to hear about the latest advancements, including tips you can use to get a head start as a hands-on applied ML practitioner.

Use-case Analysis to Mitigate Value at Risk in Telecom

Dr. Wasif Masood, Data Scientist, T-Mobile Austria

In this talk Dr. Wasif Masood, Data Scientist, T-Mobile Austria, bridges the gap between the business case owners and the data scientists who eventually deliver those business cases. In his experience, there exists a noticeable gap between the two, mainly due to a different set of priorities. He addresses both audiences by sharing two years of data science work at T-Mobile Austria (TMA), with the aim of helping fellow data scientists formulate their use-cases so that the underlying business value is emphasized. He presents various data science use-cases and how they come together to give a bigger picture of mitigating value at risk. Learning from experience, he emphasizes the use of applied research by the data scientists to help speed up the attribution of their use-cases. This shows the value of an effective data-driven culture and the practices it brings along.

Generating Network Insights Using Named Entity Recognition

Akshitha Ramachandran, Director at Harvard College Consulting Group, Harvard

The speed with which information is posted online makes it difficult for analysts to identify and distill important trends. This is true for companies managing potentially harmful stories and for government analysts monitoring emergent regional developments. To help analysts on the Novetta Mission Analytics (NMA) team address this challenge, we conducted a novel analysis of open source and cloud-based Named Entity Recognition (NER) tools. NMA collects and enriches open source content and provides the analyst a customizable UI used to analyze trends across topics, regions, and news sources. To support their users, we generated a network visualization tool that allowed analysts to interpret entities and relationships in a corpus of news articles or other text datasets.

The Entity Network Visualization Tool has proved to be a useful, functional part of the NMA analyst toolkit. From an analytical standpoint it facilitates insights through visual representation of text. Key entities stand out based on their importance and prominence in the article. Important relationships come into focus based on node size and edge thickness. Another dimension reflected in our visualization is the nature of communities that entities form. This speaks to the subgroups and interrelations within a larger network.
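As a rough illustration of how node size and edge thickness could be derived, the sketch below counts entity mentions and per-article co-occurrences. This is not the NMA tool's actual method. It assumes the NER step has already produced a list of entity strings per article, and it assumes (for illustration only) that prominence is the mention count and relationship strength is the number of articles in which two entities appear together. The function name and sample entities are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_entity_network(articles):
    """Return (node_weight, edge_weight) for a list of per-article entity lists.

    node_weight[entity]  -> total mentions across the corpus (drives node size)
    edge_weight[(a, b)]  -> number of articles mentioning both a and b
                            (drives edge thickness); pairs are sorted tuples.
    """
    node_weight = Counter()
    edge_weight = Counter()
    for entities in articles:
        node_weight.update(entities)       # every mention counts toward size
        unique = sorted(set(entities))     # one co-occurrence per article
        for a, b in combinations(unique, 2):
            edge_weight[(a, b)] += 1
    return node_weight, edge_weight


# Hypothetical output of an NER pass over three short articles.
articles = [
    ["Acme Corp", "Jane Doe", "Acme Corp", "Springfield"],
    ["Acme Corp", "Jane Doe"],
    ["Springfield", "John Roe"],
]

nodes, edges = build_entity_network(articles)
for entity, weight in nodes.most_common():
    print(entity, weight)
for pair, weight in edges.most_common():
    print(pair, weight)
```

The two counters can be handed to any graph library for layout and community detection; those steps are left out here because the abstract does not say which tools were used.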

Model Factory at ING Bank

Dor Kedem, Lead Data Scientist, ING Bank

At ING Bank, machine learning models are a key factor in making relevant engagements with our customers, empowering them to stay a step ahead in life and in business. In our efforts to make the model building process more rapid, compliant, validated and accessible to roles other than data scientists (such as data analysts or customer journey experts), we have structured it for an easy creation of propensity models.

In this talk, I will present this structure, focusing on pipelining data science models in Apache Spark. In particular, I will show how we use Apache Sqoop & Ranger to comply with GDPR, build a data science workflow on top of Python and Jupyter, extend the SparkML libraries on PySpark to create custom standardizers and cross-validators, and show an in-house developed monitoring tool built on top of Elasticsearch for model evaluation.

Finally, I will describe the type of engagement analysts and customer journey experts have with the result set of the models created, and how we refine our dashboards (in IBM Cognos) accordingly.

IoT and Streaming Analytics

Data-in-Motion Technology Showcase

Vikram Makhija, GM Data in Motion, Cloudera

Introducing Cloudera Flow Management and Cloudera Edge Management - the latest additions to the Cloudera DataFlow toolset.

Kafka Power Chat Pt. 1: Challenges in Streaming Architectures

Dinesh Chandrasekhar, Director Product Marketing, Cloudera | Renu Tewari, Director, Engineering, Cloudera

Hear Renu Tewari, one of the original Kafka team members and a leader in this space, explain the key challenges an enterprise is bound to face with streaming architectures and how to overcome them with an effective Kafka implementation, and review practical use cases with their successes and failures.

Edge to AI: Analytics from Edge to Cloud with Efficient Movement of Machine Data

Timothy Spann, Field Engineer, Data in Motion, Cloudera | John Kuchmek, Solutions Engineer, Cloudera

Build and deploy ML for sentiment analysis and YOLO object detection as part of an IoT application that starts from devices collecting sensor data and camera images with MiNiFi. This data is streamed to Apache NiFi which integrates with Cloudera Data Science Workbench for classification with models in real-time as part of the real-time event stream. We parse, filter, fork, sort, query with SQL, dissect, enrich, transform, join and aggregate data as it is ingested.

The data is landed in a Cloudera data store in the cloud for batch and interactive analytics with Spark, Hive, Phoenix, HBase, Druid, Kudu and Impala.

How to Ingest 16 Billion Records Per Day into your Hadoop Environment

Uwe Weber, Senior Big Data Engineer, Telefonica Germany

Telefónica is using Cloudera DataFlow to ingest 16 billion records a day to help plan and create a better mobile network - attend to hear about their technical implementation, made scalable by Cloudera DataFlow.

Tracking Crime as it occurs with Apache Phoenix and Apache NiFi

Timothy Spann, Field Engineer, Data In Motion, Cloudera | Henry Sowell, Director, Solutions Engineering, Cloudera

Using Apache NiFi, we can ingest various sources of crime-related data in real-time while simultaneously monitoring live traffic cameras for infractions and correlation to other crime related activities - and that's only the beginning - watch this session to learn more.

Industry, Public Sector, Governance and Security

Over 520 Financial Services customers globally are using Cloudera's Enterprise Data Cloud

Steve Totman, Industry Lead for Financial Services, Cloudera

See cool global Financial Services use cases being deployed for Data Warehousing, Data Engineering, Machine Learning, and more.

Cloudera’s customers include:

  • 25/29 Global Systemically Important Banks
  • 12/15 Top Insurance Firms
  • 12/15 Top Credit Card Issuers
  • 4/5 Top Stock Exchanges
  • 8/10 Top Wealth Management Firms
  • 4/4 Top Payment Processing Companies

Apache Ranger Deep-Dive

Purnima Reddy Kuchikulla, Senior Solutions Engineer, Cloudera

This talk briefly touches on overall security and governance, then takes a closer look at attribute-based policies and schedulable policies in Ranger.

Fighting Financial Crime using Data and Analytics

Dr. Richard Harmon, Managing Director, Financial Services, Cloudera | Justin Lyon, CEO & Founder, Simudyne

How T-Mobile Tamed Metron

Carolyn Duby, Senior Solutions Engineer, Cloudera | John Charlton, Cyber Security Manager, T-Mobile

Smart Cities: Driving Innovation with Open Source and Open Data

Dave Shuman, Industry Leader, IoT & Manufacturing, Cloudera | Kevin Martin,

From Smart Mobility and Smart Energy to improved Public Health, Safety, & Governance – this session will discuss how cities are delivering better citizen services leveraging open source technology with a consistent governance and security framework that spans the data center and the public clouds.

Reproducing US Census Reports Using Open Source Tools

Ian Brooks, Solutions Engineer, Cloudera

The United States Census is mandated by the US Constitution, with a focus on systematically acquiring and recording demographic, economic, and household information on the US population. The agency has many different survey teams that are responsible for acquiring the survey results, tabulating them, and writing the reports. These teams currently complete their tasks using a collection of custom legacy software routines.

In 2018, the US Census introduced a Data Lake to its enterprise for Big Data management, based on open source Apache Hadoop. In addition to data management software, the Apache Software Foundation's projects include tools designed for statistical data analysis. This talk illustrates the workflow required to migrate the survey teams' processes, using publicly available data sets acquired from the Census website.

The two surveys used are the 2012 Commodity Flow Survey (CFS) and the 2008 Survey of Income and Program Participation (SIPP). According to the CFS website, the CFS produces data on the movement of goods in the United States. It provides information on commodities shipped, their value, weight, and mode of transportation, as well as the origin and destination of shipments of commodities from manufacturing, mining, wholesale, and selected retail and services establishments. SIPP's website describes the survey as the premier source of information for income and program participation. SIPP collects data and measures change for many topics including economic well-being, family dynamics, education, assets, health insurance, childcare, and food security.

The data sets are downloaded and ingested into the Data Lake, which makes them available to Apache Spark. Using data engineering techniques, the data sets are transformed and prepared for reporting. The reports that are generated follow the steps provided in the survey methodology. In this talk, the work has been completed and the results are displayed in an Apache Zeppelin notebook.

Digital Shift in Insurance: How is the Industry Responding with the Influx of Big Data, Analytics and AI?

Cindy Maike, Vice President, Industry Solutions, Cloudera

Data Protection in Hybrid Enterprise Data Lake Environment

Murali Ramasami, Senior Software Engineer, Cloudera | Niru Anisetti, Director of Product Management, American Express

In the current digital world, enterprises are drowning under the weight of the data they are required to store for customers, for corporate analysis, and for business forecasting. With the convergence of cloud, IoT, and big data technologies, data lakes are becoming the critical fuel for enterprise-wide digital transformations, proving to be cost-effective, self-service, and elastic in nature. This enterprise data is spread widely across numerous clusters and repositories residing in both the company's data centers and multiple cloud locations, posing a new “data protection” problem in hybrid environments. Protecting data is a critical part of every business continuity plan because data loss or corruption may have a huge impact on enterprise survival.

In this talk, we will address the challenges faced by enterprises using Apache Hadoop, Apache Hive, Apache Ranger and Apache Atlas.

We will outline how, using a unified open source orchestration platform:

  • You can protect mission-critical data, along with its security and governance policies, across multiple data lakes, including how change data capture works using Apache Hadoop, Apache Hive, Apache Ranger and Apache Atlas.
  • You can monitor replication jobs and metric collections associated with the replicated data across hybrid enterprise data lake environments.

We will also showcase:

  • How to seamlessly and securely replicate HDFS data and Hive databases between Hortonworks clusters, along with Apache Ranger policies and Apache Atlas metadata.
  • How to securely move data between on-premises clusters and cloud storage.

Security Framework for Multitenant Architecture

Suresh Yadagotti Jayaram, Senior Technical Architect, Florida Blue

In the healthcare sector, data security, governance, and quality are crucial for maintaining patient privacy and ensuring the highest standards of care. At Florida Blue, the leading health insurer of Florida serving over five million members, there is a multifaceted network of care providers, business users, sales agents, and other divisions relying on the same datasets to derive critical information for multiple applications across the enterprise. However, maintaining consistent data governance and security for protected health information and other extended data attributes has always been a complex challenge that did not easily accommodate the wide range of needs for Florida Blue’s many business units. Using Apache Ranger, we developed a federated Identity & Access Management (IAM) approach that allows each tenant to have their own IAM mechanism. All user groups and roles are propagated across the federation in order to determine users’ data entitlement and access authorization; this applies to all stages of the system, from the broadest tenant levels down to specific data rows and columns. We also enabled audit attributes to ensure data quality by documenting data sources, reasons for data collection, date and time of data collection, and more. In this discussion, we will outline our implementation approach, review the results, and highlight our “lessons learned.”

Computer Vision in Manufacturing with Miner & Kasch

Michael Ger, Managing Director, Manufacturing and Automotive, Cloudera | Florian Muellerklein, Data Scientist, Miner & Kasch | Adam Nepp, Vice President of Sales, Miner & Kasch

Oil & Gas improves asset profitability, portfolio management, and operations with big data

Dave Shuman, Industry Leader, IoT & Manufacturing, Cloudera

Reimagining Energy & Utilities using data, analytics, and AI

Dave Shuman, Industry Leader, IoT & Manufacturing, Cloudera


Amy O'Connor, Chief Data and Information Officer, Cloudera
Arun C. Murthy, Chief Product Officer, Cloudera
Justin Norman, Director, Research and Data Science Services, Cloudera Fast Forward Labs
Charles Boicey, Chief Innovation Officer and Cofounder, Clearsense LLC
Dave Shuman, Industry Leader, IoT & Manufacturing, Cloudera
Adam Nepp, Vice President of Sales, Miner & Kasch
Eva Nahari, Director Product Management, Cloudera
Richard Harmon, Industry Lead for Financial Services, Cloudera
Rob Thomas, General Manager, IBM Data and Watson AI
Bill Inmon, Founder, Chairman, CEO, Best-selling Author, Forest Rim Technology
Ron Abellera, Director, Azure GBB Data Specialist for the Americas, Microsoft
Gaurav Kanade, Software Engineer, Microsoft
Premchand Bellamkonda, Senior Product Manager, Informatica
Sam Tawfik, Principal Marketing Manager, Informatica
Siamak Sadeghianfar, Principal Product Manager, Red Hat
Daniel Messer, Senior Product Manager, Red Hat
Shobana Neelakantan, Engineering Manager, PayPal
Steve Totman, Industry Lead for Financial Services, Cloudera
Justin Lyon, CEO and Founder, Simudyne
Florian Muellerklein, Data Scientist, Miner & Kasch
Michael Ger, Managing Director, Cloudera
Akshitha Ramachandran, Director, Harvard College Consulting Group
Alice Albrecht, Manager, Data Science Strategy and Advising, Cloudera
Carsten Herbe, Big Data Architect, Audi Business Innovation GmbH
Cindy Maike, Vice President, Industry Solutions, Cloudera
Dor Kedem, Lead Data Scientist, ING Bank
Edwin Scheepstra, Business Analyst, Rabobank
Dr. Wasif Masood, Data Scientist, T-Mobile Austria
Bethann Noble, Director Product Marketing Machine Learning, Cloudera
Nick Vaughan, Domain SME - Data Analytics & Modelling, Bank of England
Sanjay Kumar, Senior Director, Industry Solutions, Cloudera
Uwe Weber, Senior Big Data Engineer, Telefonica Germany
Carolyn Duby, Senior Solutions Engineer, Cloudera
Sanjeev Koranga, Engineering Manager, PayPal
Owen O'Malley, Co-founder & Technical Fellow, Cloudera
Paul Gibeault, Solution Architect, Big Data, Micron Technology Inc.
Sridhar Leekkala, Senior Manager II Software Engineering, Walmart
Weiwei Yang, Software Engineer, Cloudera
Suresh Yadagotti Jayaram, Senior Technical Architect, Florida Blue
Sunil Govindan, Staff Engineer, Cloudera
Jeremy Beard, Data Warehouse Field Engineer, Cloudera
Wim Stoop, Senior Product Marketing Manager, Cloudera
John Charlton, Cyber Security Manager, T-Mobile
Murali Ramasami, Senior Software Engineer, Cloudera
Niru Anisetti, Director of Product Management, American Express
Richard Winter, Chief Executive Officer, WinterCorp
Sridhar Ramalingam, Technical Architect, Walgreens
Gupta Narayanam, Technical Architect, Walgreens
Ian Brooks, Solutions Engineer, Cloudera
Rohit Balasubramanian, Managing Director, Deloitte
Timothy Spann, Field Engineer, Data in Motion, Cloudera
Renu Tewari, Director of Engineering, Cloudera
Purnima Reddy Kuchikulla, Senior Solutions Engineer, Cloudera
Joydeep Das, Vice President of Product Management, Cloudera
Santosh Kumar, Senior Product Manager, Cloudera
Paritosh Sundriyal, Software Engineer, Walmart
Jeroen Wolffensperger, Solution Architect Data, Rabobank
Jason Lamb, Cloud / Software Defined Infrastructure Technology Solutions Specialist, Intel
David Tareen, Global Product Marketing for AI, SAS
David Dichmann, Director Product Marketing, Cloudera
Greg Rahn, Director Product Management, Cloudera
Sid Shaik, Director Product Management, Cloudera
David Loshin, President, Knowledge Integrity/TDWI Affiliate Analyst
Carl Olofson, Research Vice President, IDC
Michael Gregory, Machine Learning Field Engineering Lead, Cloudera
Alex Breshears, Senior Product Manager, Cloudera
Ryan Micallef, Research Engineer, Cloudera Fast Forward Labs
Seth Hendrickson, Research Engineer, Cloudera Fast Forward Labs
Victor Dibia, Research Engineer, Cloudera Fast Forward Labs
Grant Custer, Designer-Developer, Cloudera Fast Forward Labs
Vikram Makhija, General Manager Data in Motion, Cloudera
Dinesh Chandrasekhar, Director Product Marketing, Cloudera
Henry Sowell, Director Solutions Engineering, Cloudera
Dave Mariani, Founder and Chief Strategy Officer, AtScale
Ram Venkatesh, Vice President of Engineering, Cloudera
Krishna Maheshwari, Director Product Management, Cloudera
Naren Koneru, Senior Director, Engineering, Cloudera
Kevin Martin, Smart City PDX Manager, City of Portland Bureau of Planning and Sustainability
John Kuchmek, Solutions Engineer, Cloudera
Mark Ferman, Former Oil & Gas CIO
Fred Koopmans, Vice President of Product Management, Cloudera
Vidya Raman, Director of Product Management, Cloudera
Joshua Robinson, Founding Engineer, FlashBlade, Pure Storage
Srikanth Venkat, Senior Director, Product Management, Cloudera


How a Virtual Summit Works

Step 1:

Register for the event

Click on any of the register buttons on this page. Enter your information and hit submit.

Step 2:

Watch for informational emails

They will contain information such as how to log in to the event, how to get the most out of it, and recently announced speakers.

Step 3:

Schedule time to watch

Mark some time in your calendar when you can be free of distractions or plan a watch party with a few other colleagues.

Step 4:

Watch the event

Click the link in the email we send, or come back to this page and click “Watch the event.”

Register to Watch Now
