Created on 09-09-2022 09:30 PM - edited 06-26-2023 05:14 AM
This week’s release includes:
Adapters & docker images:
- dbt-spark-livy - 1.3.1 (Dec 5th, 2022)
- dbt-spark-cde - 1.3.0 (Dec 5th, 2022)
- dbt-hive - 1.3.1 (Dec 5th, 2022)
- dbt-impala - 1.3.1 (Dec 5th, 2022)
For CML/CDSW deployment
- public.ecr.aws/d7w2o6p0/dbt-cml:1.2.0 (with Jupyter Interface):
Name | Version
dbt-core | 1.2.0
dbt-impala | 1.2.0
dbt-hive | 1.2.0
dbt-spark-cde | 1.2.0
dbt-spark-livy | 1.2.0
CML Base image | ml-runtime-jupyterlab-python3.9-standard:2022.04.1-b6
.py scripts (Utility) | n/a
- public.ecr.aws/d7w2o6p0/dbt-cdsw:1.2.0 (with workbench editor):
Name | Version
dbt-core | 1.2.0
dbt-impala | 1.2.0
dbt-hive | 1.2.0
dbt-spark-cde | 1.2.0
dbt-spark-livy | 1.2.0
CML Base image | ml-runtime-workbench-python3.9-standard:2022.04.1-b6
.py scripts (Utility) | n/a
Note: Both the CML and CDSW docker images work with CDSW, though the latter supports only the workbench editor when setting up jobs.
Supported infrastructure:
- All adapters we have released support CDP Public Cloud LDAP with Knox
- Our Impala and Hive adapters support CDP Private Cloud with Kerberos; we are testing our Spark adapters for the same
- Both the Impala and Hive adapters support a local server without authentication
Deployment Options:
Form Factor | | On-Prem | Cloud | Cloud | Cloud | Cloud
| | PvC Data Services | CDPOne | CDP PaaS Data Services | CDP PaaS Data Services | CDP PaaS Datahub
dbt SDLC requirements | | CDSW | CML | CML | CDE | CML
Tested artifacts | dbt core and adapters | Custom Runtime | Custom Runtime | Custom Runtime | PyPI packages | Custom Runtime
Authoring/testing | dbt develop | CDSW workbench session | CML Jupyter session | CML Jupyter session | VSCode/other IDE | CML Jupyter session
Orchestration | dbt run | CDSW job | CML jobs | CML jobs | Compile dbt models to an Airflow DAG OR run dbt run as a custom bash operator | CML jobs
Collaboration | dbt docs serve | CDSW App | CML App | CML App | Flask server OR S3 | CML App
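The Orchestration row above lists CDSW and CML jobs as the way to schedule dbt run. As a rough illustration only, a job script could wrap the dbt CLI along the lines below. The helper names build_dbt_command and run_dbt and the example project path are hypothetical and are not taken from the Cloudera images; the sketch assumes the dbt executable is on the PATH of the runtime and uses only dbt's own --project-dir and --profiles-dir flags.

```python
import shlex
import subprocess
from typing import List, Optional


def build_dbt_command(
    dbt_args: List[str],
    project_dir: str,
    profiles_dir: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a dbt CLI call (hypothetical helper)."""
    cmd = ["dbt", *dbt_args, "--project-dir", project_dir]
    if profiles_dir is not None:
        # Point dbt at a non-default location for profiles.yml.
        cmd += ["--profiles-dir", profiles_dir]
    return cmd


def run_dbt(
    dbt_args: List[str],
    project_dir: str,
    profiles_dir: Optional[str] = None,
) -> int:
    """Run dbt as a child process and return its exit code."""
    cmd = build_dbt_command(dbt_args, project_dir, profiles_dir)
    print("Running:", shlex.join(cmd))
    # check=False: return the exit code instead of raising, so the
    # calling job script decides how to react to a failed run.
    return subprocess.run(cmd, check=False).returncode


# Example call from a job script (the path is an assumption):
#   exit_code = run_dbt(["run"], "/home/cdsw/my_dbt_project")
#   raise SystemExit(exit_code)
```

Exiting with dbt's return code lets the job scheduler mark the job as failed when dbt run fails, which is usually what a scheduled job needs.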
Past adapter releases:
dbt-hive adapter
- 1.3.0 (Nov 24th, 2022)
- 1.2.0 (Nov 4th, 2022)
- The dbt-hive adapter now supports dbt-core 1.2.0.
- 1.1.5 (Oct 28th, 2022)
- 1.1.4 (Sep 23rd, 2022)
- Added support for the Kerberos authentication mechanism, along with an updated instrumentation package.
- 1.1.3 (Sep 9th, 2022)
- Added a macro that detects the Hive version to determine whether the warehouse supports incremental merge.
- 1.1.2 (Sep 2nd, 2022)
- The dbt seed command no longer adds extra quotes to strings, which was a known bug in the previous release. All warehouse properties (Cluster_by, Comment, external table, incremental materialization methods, etc.) are tested and should work smoothly with the adapter. Added instrumentation to the adapter.
- 1.1.1 (August 23rd, 2022)
- Cloudera released the first version of the dbt-hive adapter
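The dbt-hive 1.1.3 note above describes a macro that checks the Hive version before choosing the incremental merge strategy. The adapter implements this as a dbt macro; the Python below is only a sketch of the underlying version comparison. The function names are hypothetical, and the default threshold of 2.2 is an assumption based on Apache Hive introducing MERGE in 2.2, so the adapter's actual cutoff may differ.

```python
import re
from typing import Tuple


def parse_version(version_string: str) -> Tuple[int, ...]:
    """Extract the leading dotted numeric part of a version string."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def supports_merge(
    version_string: str,
    minimum: Tuple[int, ...] = (2, 2),  # assumed threshold, see lead-in
) -> bool:
    """Return True when the reported version is at least `minimum`."""
    # Tuples compare element by element, so (3, 1, 3000) >= (2, 2).
    return parse_version(version_string) >= minimum
```

Comparing integer tuples rather than raw strings avoids the classic mistake where "10.0" sorts before "2.2".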
dbt-impala adapter
- 1.3.0 (Nov 18th, 2022)
- 1.2.0 (Nov 2nd, 2022)
- The dbt-impala adapter now supports dbt-core 1.2.0.
- 1.1.5 (Oct 28th, 2022)
- 1.1.4 (Sep 30th, 2022)
- dbt profile errors and connection issues raised by dbt commands now show a user-friendly message for the dbt-impala adapter. Added a user-agent string to improve instrumentation.
- 1.1.3 (Sep 17th, 2022)
- Added support for append mode when the partition_by clause is used, along with an updated instrumentation package.
- 1.1.2 (Aug 5th, 2022)
- dbname in the profiles.yml file is now optional; updated a dependency in the README; the dbt-core version now updates automatically in setup.py
- 1.1.1 (Jul 16th, 2022)
- Bug fixes for a specific function
- 1.1.0 (Jun 9th, 2022)
- Migrated the adapter to dbt-core 1.1.0; added a timeout for the Snowplow endpoint to handle air-gapped environments
- 1.0.6 (May 23rd, 2022)
- Added support for insert_overwrite mode for incremental models and added instrumentation to the adapter
- 1.0.5 (Apr 29th, 2022)
- Added support for an EXTERNAL clause with table materialization and improved error handling for relation macros
- 1.0.4 (Apr 1st, 2022)
- Added support for the Kerberos authentication method and dbt docs
- 1.0.1 (Mar 25th, 2022)
- Cloudera released the first version of the dbt-impala adapter
dbt-spark-cde adapter
- 1.2.0 (Nov 14th, 2022)
- 1.1.7 (Oct 28th, 2022)
- 1.1.6 (Oct 18th, 2022)
- Added a way to switch SSL certificate verification on or off for the CDE endpoint, along with an updated instrumentation package.
- 1.1.5 (Oct 15th, 2022)
- 1.1.4 (Sep 23rd, 2022)
- During internal testing, we came across an issue where the second run of the incremental model was failing. We have fixed that issue.
- For improved debuggability, if a CDE job fails, the adapter creates a new log file inside the dbt log folder containing the stderr output. A sample file name looks like this: dbt-job-1663938116617-00000255.stderr.log
- 1.1.3 (Sep 10th, 2022)
- The detail for each query is now available in the logs: dbt.log
- Spark CDE session parameters can be provided via dbt's profiles.yml file as key: value pairs
- Any dbt profile errors or connection issues raised by dbt commands now show a user-friendly message
- 1.1.2 (Sep 2nd, 2022)
- Added a timeout while polling job status to avoid hogging resources, cleaned up the code, and added a new method for viewing Spark events when enabled
- 1.1.1 (Aug 26th, 2022)
- Improved debugging process to track JobId, Query, and session time. Access stderr and stdout of CDE jobs in dbt logs.
- 1.1.0 (Jul 21st, 2022)
- Cloudera released the first version of the dbt-spark-cde adapter that supports connection to Cloudera Data Engineering backend using CDE APIs
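The dbt-spark-cde 1.1.2 note above mentions a timeout while polling job status. The general pattern can be sketched as follows. This is not the adapter's code: the function name, the terminal state names, and the injectable sleep and clock parameters are all assumptions made for illustration, and get_status stands in for whatever call actually queries the job.

```python
import time
from typing import Callable

# Hypothetical terminal states; a real job API may use different names.
TERMINAL_STATES = frozenset({"succeeded", "failed", "killed"})


def poll_job_status(
    get_status: Callable[[], str],
    timeout_s: float,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until a terminal state or until timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            # Give up instead of polling forever and holding resources.
            raise TimeoutError(
                f"Job still in state {status!r} after {timeout_s} seconds"
            )
        sleep(interval_s)
```

Using a monotonic clock keeps the deadline stable if the system time is adjusted while the job is running.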
dbt-spark-livy adapter
- 1.3.0 (Nov 21st, 2022)
- 1.2.0 (Nov 9th, 2022)
- 1.1.8 (Oct 28th, 2022)
- 1.1.7 (Oct 18th, 2022)
- Added a way to switch SSL certificate verification on or off for the Livy endpoint.
- 1.1.6 (Oct 15th, 2022)
- 1.1.5 (Sep 30th, 2022)
- Added Kerberos support, along with an updated instrumentation package.
- 1.1.4 (Sep 17th, 2022)
- dbt profile errors and connection issues raised by dbt commands now show a user-friendly message for the dbt-spark adapters
- Spark session parameters can be provided via dbt's profiles.yml file as key: value pairs for the dbt-spark adapters
- 1.1.3 (Jul 29th, 2022)
- Added instrumentation to the adapter and updated setup.py as per upstream
- 1.1.2 (Jul 1st, 2022)
- Bug fixes to show an error when a SQL model has an issue
- 1.1.1 (Jul 1st, 2022)
- Added instructions for IDBroker mappings to the README file, plus minor changes to the setup and version files
- 1.1.0 (Jun 17th, 2022)
- Cloudera released the first version of the dbt-spark-livy adapter to support Livy-based connections to the Cloudera Data Platform
Available resources:
Articles:
- Running dbt core with adapters for Hive, Spark, and Impala within CDP Public Cloud
- Running dbt core with adapters for Hive, Spark, and Impala within CDP Private Cloud
- Quick start guides for specific adapters
Bundled offering for CML & CDSW deployment:
GitHub repository:
- Source Code:
- A dbt adapter for Apache Impala & Cloudera Data Platform
- A dbt adapter for Apache Hive & Cloudera Data Platform
- A dbt adapter for Apache Spark with CDE API support
- A dbt adapter for Apache Spark-livy
- Sample project:
- A sample project for the dbt-impala adapter with Cloudera Data Platform
- A sample project for the dbt-hive adapter with Cloudera Data Platform
- A sample project for the dbt-spark adapter with CDE API support on the Cloudera Data Platform
- A sample project for the dbt-spark-livy adapter on the Cloudera Data Platform