Created on 11-08-2022 03:12 PM - edited 06-26-2023 04:59 AM
Cloudera recently announced the open-source dbt adapters for all the engines in Cloudera Data Platform (CDP), including Apache Impala and Apache Spark, with added support for Apache Livy and Cloudera Data Engineering.
In addition to providing the adapters, Cloudera offers a turnkey solution for managing the end-to-end software development life cycle (SDLC) of dbt models. This solution is available on all clouds, as well as on on-prem deployments of CDP.
In this article, we show how our customers' data teams can streamline their data transformation pipelines in the Cloudera Data Platform and deliver high-quality data that their business can trust. Our solution satisfies the stringent security and privacy requirements of our customers while remaining easy for practitioners to use.
A key advantage of using dbt is that it provides a framework for analysts to easily follow software engineering best practices for their SQL transformation pipelines. Instead of the typical ad hoc scripting resulting in brittle pipelines, analysts can leverage engineering best practices to build robust, tested, and documented pipelines that produce high-quality data sets that can be trusted by the business.
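To make "tested" concrete: dbt expresses a data test as a SQL query that selects the rows violating an expectation, and the test passes when the query returns no rows. The sketch below illustrates that idea only. It uses Python's built-in sqlite3 module rather than dbt or a CDP engine, and the `orders` table, its columns, and the helper names are hypothetical, chosen purely for illustration.

```python
import sqlite3

# Hypothetical test queries: each one selects the rows that violate an expectation.
NOT_NULL_TEST = "SELECT order_id, customer_id FROM orders WHERE customer_id IS NULL"
UNIQUE_TEST = """
    SELECT order_id, COUNT(*) AS n
    FROM orders
    GROUP BY order_id
    HAVING COUNT(*) > 1
"""


def build_demo_db():
    """Create an in-memory table with one duplicate key and one missing value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10), (2, 11), (2, 12), (3, None)],
    )
    return conn


def failing_rows(conn, query):
    """Run a test query; an empty result means the test passes."""
    return conn.execute(query).fetchall()


def run_tests(conn):
    """Return the failing rows for each named test."""
    return {
        "not_null_customer_id": failing_rows(conn, NOT_NULL_TEST),
        "unique_order_id": failing_rows(conn, UNIQUE_TEST),
    }


if __name__ == "__main__":
    for name, rows in run_tests(build_demo_db()).items():
        print(name, "PASS" if not rows else f"FAIL {rows}")
```

In a dbt project, checks like these are declared alongside the model rather than hand-written in a script, which is what lets analysts keep tests, documentation, and transformations under version control together.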
Figure 1: Software development life cycle of dbt models
Figure 1 shows the phases that a dbt user's workflow typically consists of.
Before a customer can use dbt core and the adapters to build transformation pipelines, supporting scaffolding needs to be in place. Cloudera has identified the requirements for this scaffolding to enable secure and simple workflows for analysts, and has provided guides for bringing it up natively within the Cloudera Data Platform.
Cloudera has provided a managed software package of dbt core and all adapters for CDP engines that is maintained and supported by Cloudera. Watch dbt working in CDP Public Cloud.
The dbt integration with CDP is brought to you by Cloudera's Innovation Accelerator, a cross-functional team that identifies new industry trends and creates new products and partnerships that dramatically improve the lives of data practitioners at Cloudera's customers. Learn more with Cloudera's simple guides to deploy and run dbt in all form factors supported by Cloudera for a truly hybrid solution.
To learn more, contact us at innovation-feedback@cloudera.com.