dbt Core: Integrates with Cloudera’s Open Data Lakehouse

Solution overview
Cloudera is proud to bring dbt to the open data lakehouse with adapters for the SQL engines supported by CDP. With these adapters, Impala and Hive users no longer need separate tools and frameworks for transformation and data quality.
Data teams and business functions build and manage the business logic of their transformation pipelines with their own processes, often on different engines over the same data lakehouse. There is a growing need for a central, transparent, version-controlled repository with a consistent software development lifecycle (SDLC) experience for managing these pipelines across teams and business functions. Streamlining the SDLC has been shown to speed up delivery of data projects while increasing transparency and auditability, leading to a more data-driven organization.
dbt offers this consistent SDLC experience for transformation pipelines. It has become an industry-wide movement, with companies big and small using it to streamline their transformation pipeline management.
dbt is a popular tool for building and running SQL-based data transformations against a data warehouse. Building on the existing Cloudera platform, we have created a seamless data-transformation experience in which data engineers and data analysts collaborate on data pipelines. This brings business and data engineering teams together in enriching structured data to feed downstream applications, BI, and ML.

dbt provides functionality to design, develop, and deploy SQL-based data models, and it relies on an adapter to have an underlying SQL engine carry out those transformations. Cloudera has integrated dbt's capabilities with the engines provided in CDP, including Impala, Spark, and Hive. Data practitioners can now install and configure our adapter packages alongside dbt-core and begin transforming their data with dbt.
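To make the adapter idea concrete, here is a minimal sketch in Python. It is not dbt's actual adapter API: `EngineAdapter`, `SQLiteAdapter`, and `build_model` are hypothetical names invented for this illustration, and an in-memory SQLite database stands in for Impala, Hive, or Spark. It only shows the division of labor, where a model is a SELECT statement and the adapter is the piece that knows how to run SQL on a particular engine.

```python
import sqlite3
from typing import List, Protocol, Tuple


class EngineAdapter(Protocol):
    """Hypothetical interface: anything that can run a SQL string."""

    def execute(self, sql: str) -> List[Tuple]: ...


class SQLiteAdapter:
    """Stand-in engine for this sketch (assumption); not a Cloudera adapter."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")

    def execute(self, sql: str) -> List[Tuple]:
        rows = self.conn.execute(sql).fetchall()
        self.conn.commit()
        return rows


def build_model(adapter: EngineAdapter, name: str, select_sql: str) -> None:
    """Materialize a SELECT statement as a table, replacing any previous version."""
    adapter.execute(f"DROP TABLE IF EXISTS {name}")
    adapter.execute(f"CREATE TABLE {name} AS {select_sql}")


# Illustrative source data; table and column names are made up for the example.
adapter = SQLiteAdapter()
adapter.execute("CREATE TABLE raw_orders (customer TEXT, amount INTEGER)")
adapter.execute("INSERT INTO raw_orders VALUES ('a', 10), ('a', 5), ('b', 7)")

# The "model" is just a SELECT; the adapter decides how it runs on the engine.
build_model(
    adapter,
    "customer_totals",
    "SELECT customer, SUM(amount) AS total FROM raw_orders GROUP BY customer",
)
totals = adapter.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
)
```

Because `build_model` depends only on the `execute` method, the same model definition could in principle be pointed at a different engine by swapping the adapter object, which is the role the Cloudera adapter packages play for dbt.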
We have sample projects and tutorials to get you started with dbt adapters, and guidelines for how to leverage Cloudera Machine Learning (CML) to provide a flexible GUI to build and deploy dbt models from within the Cloudera Data Platform (CDP).
Key highlights
Category: Modernize Architecture
Faster time to value, fewer production issues, and reuse of existing SQL skills
Easily and visually document the data model
Run the models to incrementally transform new data, and schedule those runs to form operational pipelines
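The incremental idea in the last highlight can be sketched as follows. This is an illustration of the general technique, not how dbt or the Cloudera adapters implement it: `run_incremental` and the table and column names are hypothetical, and SQLite stands in for the lakehouse engine. Each run appends only the source rows that are newer than the newest row already in the target, so a scheduled job can call it repeatedly.

```python
import sqlite3


def run_incremental(conn: sqlite3.Connection, source: str, target: str, ts_col: str) -> int:
    """Append source rows newer than the target's high-water mark; return rows added."""
    before = conn.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    high_water = conn.execute(f"SELECT MAX({ts_col}) FROM {target}").fetchone()[0]
    if high_water is None:
        # First run: the target is empty, so load everything.
        conn.execute(f"INSERT INTO {target} SELECT * FROM {source}")
    else:
        # Later runs: only rows that arrived after the last load.
        conn.execute(
            f"INSERT INTO {target} SELECT * FROM {source} WHERE {ts_col} > ?",
            (high_water,),
        )
    conn.commit()
    after = conn.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    return after - before


# Illustrative data; ISO-formatted date strings compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, loaded_at TEXT)")
conn.execute("CREATE TABLE events_model (id INTEGER, loaded_at TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-02")],
)

first_run = run_incremental(conn, "events", "events_model", "loaded_at")

# New data arrives between scheduled runs.
conn.execute("INSERT INTO events VALUES (3, '2024-01-03')")
second_run = run_incremental(conn, "events", "events_model", "loaded_at")
```

The first call loads both existing rows and the second loads only the newly arrived one. A strict `>` comparison on a timestamp assumes that late-arriving rows never carry a timestamp at or before the high-water mark; real pipelines often need a more careful rule.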