Cloudera


Date: On Demand Time: Now

Data modeling is a key aspect of data management in any large data warehousing project: it creates the blueprint that data analysts, ETL teams, and BI teams follow when implementing the warehouse. Given the explosion in the volume of data stored and processed, the performance of data processing and access queries depends heavily on how the data is modeled, both logically and physically. Physical modeling for Hadoop must also account for the multiple storage and query engines available in the Hadoop ecosystem. In this article, we cover best practices for data modeling in Hadoop.


Data Warehouse Field Engineer

Manish Maheshwari


Manish Maheshwari is a data architect and data scientist at Cloudera. Manish has 13+ years of experience building extremely large data warehouses and analytical solutions. He has worked extensively on Hadoop, DI and BI tools, data mining and forecasting, data modeling, master data and metadata management, and dashboard tools, and is proficient in Hadoop, SAS, R, Informatica, Teradata, and QlikView.
