Cloudera Tutorials


Cloudera Data Platform (CDP) uses Apache Atlas and Apache Ranger for data governance and security. Administrators can define security policies based on Atlas metadata tags and apply a policy in real time to an entire hierarchy of entities, including databases, tables, and columns.

You will learn how to classify your data, control who can access it, and mask it.




Prerequisites

  • Have access to Cloudera Data Platform (CDP) Public Cloud
  • Be familiar with Cloudera Essentials for CDP (A Tour of the CDP User Interface)







Environment Setup


Our environment consists of:

  • One Hive table (employee_data); we focus on its salary column.
  • Three users (your environment will be different)
    • gdeleon, administrator; associated with group cdp_sandbox-default
    • joe_analyst, user; not associated with any group
    • ivanna_eu_hr, user; not associated with any group


Let's begin:
Select Data Warehouse from the Cloudera Data Platform (CDP) home page.



Open DAS by first locating your virtual warehouse, then:

  1. Click on 
  2. Open DAS


From Data Analytics Studio (DAS):

  1. Click Compose
  2. Enter the following code in the Worksheet
-- Create the sample table; salary is the column we will classify later.
CREATE TABLE IF NOT EXISTS dbgr.employee_data (
  id INT,
  first_name  STRING,
  last_name   STRING,
  email       STRING,
  title       STRING,
  salary      DECIMAL(10,2)
);

-- Load 15 sample rows; inline() expands the array of structs into rows.
INSERT INTO dbgr.employee_data
SELECT inline(array(
 struct(1  ,  "Patty"     ,  "Harvison"   ,  "PattyHarvison@somewhere.com"     ,  "Accountant I"              ,  48532.04)
,struct(2  ,  "Abbey"     ,  "Ledingham"  ,  "AbbeyLedingham@somewhere.com"    ,  "Marketing Assistant"       ,  58700.35)
,struct(3  ,  "Tricia"    ,  "Budgey"     ,  "TriciaBudgey@somewhere.com"      ,  "Nuclear Power Engineer"    ,  48081.25)
,struct(4  ,  "Saraann"   ,  "Corwin"     ,  "SaraannCorwin@somewhere.com"     ,  "Professor"                 ,  49246.32)
,struct(5  ,  "Reese"     ,  "Bownes"     ,  "ReeseBownes@somewhere.com"       ,  "Marketing Manager"         ,  70615.84)
,struct(6  ,  "Jennee"    ,  "Hawson"     ,  "JenneeHawson@somewhere.com"      ,  "Clinical Specialist"       ,  61017.10)
,struct(7  ,  "Malinde"   ,  "Kabsch"     ,  "MalindeKabsch@somewhere.com"     ,  "Developer I"               ,  48767.52)
,struct(8  ,  "Darline"   ,  "Wagstaffe"  ,  "DarlineWagstaffe@somewhere.com"  ,  "Quality Engineer"          ,  61330.88)
,struct(9  ,  "Rhona"     ,  "Damarell"   ,  "RhonaDamarell@somewhere.com"     ,  "Legal Assistant"           ,  42030.92)
,struct(10 ,  "Dagmar"    ,  "Sandom"     ,  "DagmarSandom@somewhere.com"      ,  "Staff Scientist"           ,  74302.82)
,struct(11 ,  "Debora"    ,  "Bielfelt"   ,  "DeboraBielfelt@somewhere.com"    ,  "Assistant Media Planner"   ,  59329.91)
,struct(12 ,  "Yule"      ,  "Morigan"    ,  "YuleMorigan@somewhere.com"       ,  "Systems Administrator II"  ,  72053.94)
,struct(13 ,  "Clarette"  ,  "Naptine"    ,  "ClaretteNaptine@somewhere.com"   ,  "GIS Technical Architect"   ,  74593.99)
,struct(14 ,  "Leonard"   ,  "Petrik"     ,  "LeonardPetrik@somewhere.com"     ,  "Financial Analyst"         ,  49876.08)
,struct(15 ,  "Colver"    ,  "Scudamore"  ,  "ColverScudamore@somewhere.com"   ,  "Media Manager IV"          ,  55048.58)
));
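
The statements above assume that the dbgr database already exists. As a minimal sketch, assuming you have permission to create databases in your Database Catalog, the following HiveQL creates the database if it is missing and then sanity-checks the loaded table. The expected row count is an assumption that holds only if the INSERT ran exactly once.

```sql
-- Assumption: dbgr may not exist yet. Run this before CREATE TABLE if it is missing.
CREATE DATABASE IF NOT EXISTS dbgr;

-- After the INSERT has run once, the table should hold the 15 sample rows.
SELECT COUNT(*) AS row_count FROM dbgr.employee_data;

-- Confirm the column names and types, in particular salary as DECIMAL(10,2).
DESCRIBE dbgr.employee_data;
```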



Have each user (gdeleon, joe_analyst and ivanna_eu_hr) run the query below. It should be successful for everyone.

SELECT * FROM dbgr.employee_data;


Create Classification (Atlas)


Open Atlas for your tenant:

Beginning from CDP home page > Data Warehouse:

  1. Click on Overview
  2. Search for your Database Catalog
  3. Click on 
  4. Open Atlas


Let's create a new classification:

  1. Select the Classification tab
  2. Select the plus (+) symbol


Create a new classification, sensitive, with the following attributes:

  1. Name: sensitive
  2. Description: holds sensitive data


Search for the table to which we want to assign this new classification.

Use the following search criteria:

  1. Basic search
  2. Search By Type: hive_table
  3. Search By Text: employee_data
  4. Click on Search
  5. Click on the table name, employee_data


Let's assign our new classification, sensitive, to column salary:

  1. Click on Schema
  2. Click on the + sign next to column salary
  3. Select sensitive and enable the Propagate option
  4. Click Add


Create Tag Based Policy (Ranger)

Open Ranger for your tenant:

Beginning from CDP home page > Data Warehouse:

  1. Click on Overview
  2. Search for your Database Catalog
  3. Click on 
  4. Open Ranger


Let's create a tag-based policy, a form of attribute-based access control (ABAC).

  1. Click on Access Manager
  2. Select Tag Based Policies
  3. Click on cm_tag to edit existing service

Note: Your service name may be different from ours.



We have two policy types to choose from: Access and Masking. Let's look at both.


Access Policy


Access policies allow us to restrict access to data columns that carry a given classification. In this example, we will limit access to columns classified as sensitive to members of group cdp_sandbox-default and to user joe_analyst. No one else should be able to read data marked as sensitive.

Select Access tab, then Add New Policy.


Add a new policy using:

  1. Policy Type: Access
  2. Policy Name: sensitive_access
  3. TAG: sensitive
  4. Description: access to sensitive classified columns
  5. Audit Logging: YES
  6. Enabled
  7. Allow Conditions #1 > Select Group > cdp_sandbox-default
  8. Allow Conditions #1 > Component Permissions > hive (all permissions)
  9. Allow Conditions #2 > Select User > joe_analyst
  10. Allow Conditions #2 > Component Permissions > hive (select permission only)
  11. Deny All Other Accesses: True
  12. Click on Add


Have each user (gdeleon, joe_analyst and ivanna_eu_hr) re-run the query below.

SELECT * FROM dbgr.employee_data;

User gdeleon belongs to group cdp_sandbox-default, so the query ran successfully.
User joe_analyst was explicitly given select access, so the query ran successfully.


The query failed for ivanna_eu_hr with the error Permission denied: user [ivanna_eu_hr] does not have [SELECT] privilege. This user neither belongs to group cdp_sandbox-default nor was given select access.
If ivanna_eu_hr removes the sensitive column (salary) from the query, as in the statement below, it runs successfully.

SELECT id, first_name, last_name, email, title FROM dbgr.employee_data;
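
Listing a tagged column in the SELECT list is not the only way a query can touch it. As a sketch for experimentation, ivanna_eu_hr could try the two queries below. The outcomes noted in the comments are what we would expect from column-level authorization; they are assumptions to verify in your own environment, not results captured from this one.

```sql
-- References salary only in the filter. We would expect this to be denied as well,
-- because the query still reads the column tagged sensitive.
SELECT id, first_name, last_name
FROM dbgr.employee_data
WHERE salary > 50000;

-- References no tagged column. We would expect this to succeed.
SELECT title, COUNT(*) AS employees
FROM dbgr.employee_data
GROUP BY title;
```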


Knowledge growth questions/problems:

  • Disable the policy, re-run the query, then enable the policy again. What happens?
  • Modify the policy to give ivanna_eu_hr the select privilege.


Masking Policy


We are going to place viewing restrictions on our sensitive classified columns. Although a user may have access to the sensitive data, we may want to mask the real values.

Only users in group cdp_sandbox-default should see real data. All others should see masked data.

Select Masking tab, then Add New Policy.


Add a new policy using:

  1. Policy Type: Masking
  2. Policy Name: sensitive_masking
  3. TAG: sensitive
  4. Description: mask sensitive data
  5. Audit Logging: YES
  6. Enabled
  7. Mask Conditions #1 > Select Group > cdp_sandbox-default
  8. Mask Conditions #1 > Access Types > hive (select)
  9. Mask Conditions #1 > Select Masking Option > Unmasked (retain original value)
  10. Mask Conditions #2 > Select User > joe_analyst
  11. Mask Conditions #2 > Access Types > hive (select)
  12. Mask Conditions #2 > Select Masking Option > Nullify
  13. Click on Add


Have user (gdeleon) re-run the query below. It runs successfully - showing all data; no masking.

SELECT * FROM dbgr.employee_data;


Have user (joe_analyst) re-run the query below. It runs successfully. However, salary data is masked with nulls.

SELECT * FROM dbgr.employee_data;
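
To check the mask without scanning the result by eye, joe_analyst can count non-null salaries. This is a minimal sketch, assuming the Nullify option from the policy above is active and the table holds the 15 sample rows. The expected values in the comments follow from how COUNT treats NULLs and should be confirmed in your environment.

```sql
-- Run as joe_analyst. With the Nullify mask we would expect
-- total_rows = 15 and visible_salaries = 0, because COUNT(column) skips NULLs.
SELECT COUNT(*)      AS total_rows,
       COUNT(salary) AS visible_salaries
FROM dbgr.employee_data;

-- Also as joe_analyst: filtering on the masked column should return no rows,
-- assuming the mask is applied before the filter is evaluated.
SELECT id, first_name, salary
FROM dbgr.employee_data
WHERE salary IS NOT NULL;
```

Running the same statements as gdeleon should show 15 visible salaries, since that user's group is unmasked.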


Knowledge growth questions/problems:

  • Disable the masking policy, re-run the query, then enable the policy again. What happens?
  • Modify the masking policy to conceal data with an option other than Nullify.



Great job! You have learned to classify your data, created an access policy to restrict access, and created a masking policy to prevent users from seeing sensitive data.


Further Reading

Visit Cloudera's Collections-SDX library of videos for an overview of Cloudera's Shared Data Experience (SDX).


Cloudera OnDemand provides world-class training - anywhere, anytime.
