Setup

For the remainder of this tutorial, we will present examples in the context of a fictional corporation called DataCo. Our mission is to help the organization gain better insight by asking bigger questions.

Scenario:

Your Management: is talking euphorically about Big Data.

You: are carefully skeptical, as it will most likely all land on your desk anyway. Alternatively, it has already landed on you, with the nice project description of: "Go figure this Hadoop thing out."

Preparation:

Verify your environment. Go to Cloudera Manager in your demo environment and make sure the following services are up and running (have a green status dot next to them in the Cloudera Manager HOME Status view):

  • Apache Impala (incubating) - which you will use for interactive query 
  • Apache Hive - which you will use for storing table structure (i.e. table definitions in the Hive metastore)
  • HUE - which you will use for end user query access 
  • HDFS - which you will use for distributed data storage 
  • YARN - which provides the processing framework used by Hive (includes MR2)

If any of the services show yellow or red, restart the service or reach out to this discussion forum for further assistance.
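If you would rather check the same status from a script than from the HOME Status view, the following is a minimal sketch in Python that reads service health from the Cloudera Manager REST API. Several things here are assumptions and not part of this tutorial: the base URL (host, port and API version), the cluster name, the credentials, the service type strings in REQUIRED_TYPES, and the mapping from the API's healthSummary values to dot colors. Adjust them to your demo environment.

```python
import base64
import json
import urllib.parse
import urllib.request

# Service types this tutorial relies on, as named by the Cloudera Manager API.
# Assumption: your deployment reports these exact type strings.
REQUIRED_TYPES = {"IMPALA", "HIVE", "HUE", "HDFS", "YARN"}

# Assumed mapping from healthSummary values to the status dot colors.
HEALTH_TO_COLOR = {"GOOD": "green", "CONCERNING": "yellow", "BAD": "red"}


def fetch_services(base_url, cluster, user, password):
    """Fetch the service list for one cluster and return the parsed JSON.

    base_url is hypothetical, e.g. "http://<cm-host>:7180/api/<version>".
    """
    url = f"{base_url}/clusters/{urllib.parse.quote(cluster)}/services"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def unhealthy_services(payload):
    """Return {service type: problem} for required services that are not green."""
    by_type = {item.get("type"): item for item in payload.get("items", [])}
    problems = {}
    for service_type in sorted(REQUIRED_TYPES):
        item = by_type.get(service_type)
        if item is None:
            # The service is not part of the cluster at all.
            problems[service_type] = "not found"
            continue
        color = HEALTH_TO_COLOR.get(item.get("healthSummary"), "unknown")
        if color != "green":
            problems[service_type] = color
    return problems
```

Calling unhealthy_services(fetch_services(base_url, cluster, user, password)) returns an empty dictionary when all five services are green; any entry it does return names a service that needs attention before you continue.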

Starting/Restarting a service:

  1. Click on the dropdown menu to the right of the service name. 
  2. Click on Start or Restart.
  3. Wait for the service's status dot to turn green.
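The three steps above can also be sketched as a script against the Cloudera Manager REST API. This is an illustration under stated assumptions, not the tutorial's own method: the base URL, cluster name, service name and credentials are placeholders, and the helper names build_restart_request and wait_until_green are hypothetical. The health check is passed in as a function so that the waiting logic can be tried without a live cluster.

```python
import base64
import time
import urllib.parse
import urllib.request


def build_restart_request(base_url, cluster, service, user, password):
    """Build (but do not send) the POST request that restarts one service.

    base_url is hypothetical, e.g. "http://<cm-host>:7180/api/<version>".
    """
    url = (
        f"{base_url}/clusters/{urllib.parse.quote(cluster)}"
        f"/services/{urllib.parse.quote(service)}/commands/restart"
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the command takes no arguments
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


def wait_until_green(get_health, timeout_s=300, interval_s=5, sleep=time.sleep):
    """Poll get_health() until it returns "GOOD" or the timeout expires.

    get_health is any zero-argument callable returning a health summary
    string; "GOOD" is assumed to correspond to a green status dot.
    Returns True if the service turned green, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_health() == "GOOD":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

To actually trigger the restart, pass the request returned by build_restart_request to urllib.request.urlopen, then call wait_until_green with a function that reads the service's healthSummary from the API.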

Now that you have verified that your services are healthy and showing green, you can continue.