Exercise 2: Correlate structured data with unstructured data
In our DataCo example, once these odd findings are presented to your manager, they are immediately escalated. Eventually, someone figures out that the product's view page, where most visitors stopped along the sales path, had a typo in the item's price. Once the typo was fixed and the correct price was displayed, sales for that SKU started to increase rapidly.
Without an efficient, interactive tool for analytics on high-volume semi-structured data, this loss of revenue would have gone unnoticed for a long time. An organization that looks for answers in only part of its data risks losses like this one. Correlating two data sets for the same business question showed value, and being able to do so within the same platform made life easier for you and for the organization.
If you'd like to dive deeper into Hive, Impala, and other tools for data analysis in Cloudera's platform, you may be interested in Data Analyst Training.
For now, we'll explore some different techniques.