This training course is designed for analysts and administrators who need to enable data governance capabilities for Big Data. Topics include introductions to data governance, Ranger, Atlas, and Data Steward Studio.

Audience & Prerequisites

This course is for analysts and administrators who need to understand how to enable data governance capabilities in HDP.

Students should be familiar with data governance and have experience with data analysis. No prior Hadoop knowledge is needed.

Agenda Summary

An introduction to data governance capabilities on the HDP platform, with a focus on Ranger integration with Atlas and Data Steward Studio.

Course Contents

Introduction to Data Governance

  • Introduction to Data Governance
  • Controlled or Uncontrolled Data
  • Risks and Challenges
  • Prioritization (Critical Data Elements)
  • Challenges with Data Governance
  • Overview of GDPR and Challenges

HDP Platform and Data Governance Capabilities

  • High-level Architecture for HDP and HDF
  • High-level DataPlane Services Architecture
  • High-level Atlas Architecture
  • Review of Multiple-cluster Infrastructure (Cloud, On Premise and Hybrid)
  • Review of Services Offered with DataPlane
  • Overview of Data Lifecycle Manager

Introduction to Ranger

  • Overview of Ranger and HDP 3.0
  • Ranger Groups, Users and Policy
  • Ranger Policies for Apache Atlas
  • Review of Multiple-cluster Infrastructure (Cloud, On Premise and Hybrid)
  • Review of Services Offered with DataPlane
  • Integration of Ranger and Atlas (Accessing Data Classified PII, SENSITIVE)

Introduction to Atlas

  • Highlight HDP 3.0 Features
  • UI View of Data Moving Through Processes
  • Metadata Types and Instances (Data types for Hadoop and non-Hadoop metadata)
  • Classifications
  • Create Classifications and Include Attributes
  • UI to Search Entities by Type, Classification and Attribute
  • SQL-like Query Language to Search Entities
  • Propagation of Classification via Lineage
  • Entities
  • Split Lineage
    • REST APIs to Access and Update Lineage
    • Partner Integrations and Customizations
  • Integration with Apache Ranger (Accessing Data Classified PII, SENSITIVE)
  • Security and Data Masking
  • Fine Grained Security for Metadata Access, Enabling Controls on Access to Entity Instances and Operations for Classifications

Introduction to Data Steward Studio

  • Updates of DSS with HDP 3.0
  • Dashboard and Report Capabilities in DSS
  • Catalog for Data Global Inventory
  • Data Access Security and Monitoring
  • Enrich Data in DSS (Classifications and Annotations)
  • DSS Deployment Across the Enterprise

Hands-On Labs

  • Setting Up the Environment
  • Manage Users and Groups in Ranger
  • Review Ranger Audit Logs
  • Create Tag Based Policies in Ranger
  • Create Type Entities in Atlas
  • Create Classifications and Tag Entities in Atlas
  • Using Atlas REST APIs
  • Create a Data Asset Collection in Data Steward Studio
  • Data Profiler in Data Steward Studio
