Cloudera Tutorials

NOTICE

As of January 31, 2021, this tutorial references legacy products that no longer represent Cloudera’s current product offerings.

Introduction

In this tutorial, we will load the 2015 Wikipedia sample dataset that ships with Druid into Druid, then query the data to answer questions about it.

Prerequisites

Goals and Objectives

  • Configure Druid for the HDP Sandbox
  • Analyze the dataset
  • Load batch data
  • Write a Druid ingestion spec
  • Run a Druid task
  • Query the data

Outline

1. Druid Concepts: Gain a high-level overview of how Druid stores and queries data, and of the architecture of a Druid cluster.

2. Setting Up Development Environment: Set up the hostname-to-IP address mapping, set the Ambari admin password, turn off services that are not needed, and turn on Druid.

3. Loading Batch Data into Druid: Learn to load batch data into Druid by submitting an ingestion task that points to your desired data file via a POST request.

4. Querying Data from Druid: Learn to write JSON-based queries to answer questions about the dataset.
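
Steps 3 and 4 of the outline both come down to sending a JSON document to Druid over HTTP. The following is a minimal sketch of that pattern, not the tutorial's own code. The endpoint URLs (Druid's default Overlord and Broker ports on localhost), the datasource name `wikipedia`, the `page` dimension, the date interval, and the helper names `build_json_post` and `send` are all assumptions made for illustration. The ingestion task shown is a placeholder, not a complete spec.

```python
import json
import urllib.request

# Assumed endpoints (not given in the tutorial text): Druid's default
# Overlord (8090) and Broker (8082) ports on a local sandbox.
OVERLORD_TASK_URL = "http://localhost:8090/druid/indexer/v1/task"
BROKER_QUERY_URL = "http://localhost:8082/druid/v2/"


def build_json_post(url, payload):
    """Build (but do not send) a POST request carrying `payload` as JSON."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder task: a real ingestion spec needs many more fields
# (timestamp and dimension parsing, metrics, granularity, input file).
ingestion_task = {
    "type": "index",
    "spec": {"dataSchema": {"dataSource": "wikipedia"}},
}

# Hypothetical topN query: the ten pages with the most stored rows
# in an assumed one-day interval.
top_pages_query = {
    "queryType": "topN",
    "dataSource": "wikipedia",
    "intervals": ["2015-09-12/2015-09-13"],
    "granularity": "all",
    "dimension": "page",
    "metric": "edits",
    "threshold": 10,
    "aggregations": [{"type": "count", "name": "edits"}],
}

task_request = build_json_post(OVERLORD_TASK_URL, ingestion_task)
query_request = build_json_post(BROKER_QUERY_URL, top_pages_query)
```

Nothing is sent until `send(task_request)` or `send(query_request)` is called against a running cluster. Keeping request construction separate from sending makes it possible to inspect the JSON body before submitting it.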


