For the remainder of this tutorial, we will present examples in the context of a fictional corporation called DataCo, and our mission is to help the organization get better insight by asking bigger questions.
Your Management: is talking euphorically about Big Data.
You: are carefully skeptical, as it will most likely all land on your desk anyway. Alternatively, it has already landed on you, with the nice project description of: "Go figure this Hadoop thing out."
Verify your environment. Go to Cloudera Manager in your demo environment and make sure the following services are up and running (have a green status dot next to them in the Cloudera Manager HOME Status view):
- Apache Impala - which you will use for interactive query
- Apache Hive - which you will use for structured storage (i.e., tables in the Hive metastore)
- HUE - which you will use for end user query access
- HDFS - which you will use for distributed data storage
- YARN - which is the processing framework used by Hive (includes MR2)
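
If you would rather script this check than read the status dots by eye, the sketch below shows one possible approach. It assumes the Cloudera Manager REST API is reachable and that its service listing returns entries with `type`, `serviceState`, and `healthSummary` fields. The base URL, cluster name, credentials, and API version (`v19`) are placeholders to replace with the values for your demo environment, and the helper names `find_problem_services` and `fetch_services` are made up for this illustration.

```python
import base64
import json
import urllib.parse
import urllib.request

# Service types the tutorial relies on (assumed to match the "type" field
# reported by Cloudera Manager for each service).
REQUIRED_TYPES = {"IMPALA", "HIVE", "HUE", "HDFS", "YARN"}


def find_problem_services(services, required=REQUIRED_TYPES):
    """Return {service_type: reason} for required services that are
    missing, not started, or not reporting good health."""
    by_type = {svc.get("type"): svc for svc in services}
    problems = {}
    for svc_type in sorted(required):
        svc = by_type.get(svc_type)
        if svc is None:
            problems[svc_type] = "not found"
        elif svc.get("serviceState") != "STARTED":
            problems[svc_type] = "state is %s" % svc.get("serviceState")
        elif svc.get("healthSummary") != "GOOD":
            problems[svc_type] = "health is %s" % svc.get("healthSummary")
    return problems


def fetch_services(base_url, cluster, user, password, api_version="v19"):
    """Fetch the service list for one cluster from Cloudera Manager.

    All arguments are placeholders for your own environment; the API
    version in particular is an assumption and may differ on your install.
    """
    url = "%s/api/%s/clusters/%s/services" % (
        base_url.rstrip("/"), api_version, urllib.parse.quote(cluster))
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("items", [])
```

A typical use would be `find_problem_services(fetch_services("http://<cm-host>:7180", "<cluster>", "<user>", "<password>"))`. An empty dictionary would mean all five services are started and healthy, which corresponds to seeing green status dots in the Home view.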