This article contains Questions & Answers on Cloudera Data Engineering (CDE).

 

Is Cloudera Data Engineering integrated with Cloudera Data Warehouse (CDW) and Cloudera Machine Learning (CML)?

Yes. Through SDX, datasets generated by your Data Engineering pipelines are automatically accessible to downstream analytics such as Data Warehousing and Machine Learning. In addition, you get all the benefits of SDX, including lineage, to secure your data pipelines across your enterprise.

 

I already have Cloudera Machine Learning (CML) that has Spark capabilities. How is this better?

Cloudera Data Engineering is purpose-built for data engineers to operationalize their data pipelines. CML is tailored to data scientists who want to develop and operationalize their ML models. Both services are fully integrated and interoperable, so you can run your data engineering and data science workflows however you want. Because both are included with CDP, no extra purchase is necessary to use one or the other. Both are billed on hourly consumption, so you only pay for what you use.

 

Is there an orchestration/scheduling tool within CDE?

Yes. CDE includes a managed Apache Airflow scheduling and orchestration service natively in the platform. This offers superior capabilities compared to existing tools in the market, and we have extended it further for automation and delivery through our rich set of APIs.

 

Where does a Data Engineer write code? Does CDE provide notebooks to develop pipelines as well?

CDE supports Scala, Java, and Python code. It is flexible: any jobs you have developed locally in your favorite IDE or through third-party tools can be deployed through a rich set of job management APIs. CDE offers a CLI to submit jobs securely from your local machine, and REST APIs to integrate with CI/CD workflows. And with Cloudera Machine Learning (CML), you can develop with notebooks without leaving the CDP ecosystem and operationalize them in CDE.
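
As a rough illustration of the CI/CD integration mentioned above, the sketch below builds (but does not send) an HTTP request that a pipeline step might use to register a Spark job over REST. It is not taken from the CDE documentation: the base URL, the `/jobs` path, the payload field names, and the bearer-token header are all assumptions made for the example, so check the CDE API reference for the actual endpoint and schema.

```python
import json
import urllib.request


def build_job_request(base_url, token, job_name, app_file):
    """Build, without sending, a POST request that would register a Spark job.

    The "/jobs" path and the payload field names are hypothetical
    placeholders, not the documented CDE API.
    """
    # Hypothetical payload shape (assumption): a job name, a job type,
    # and the application file to run.
    payload = {
        "name": job_name,
        "type": "spark",
        "spark": {"file": app_file},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Bearer-token authentication is assumed for illustration only.
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; a CI/CD job would read these from its own secrets/config.
    req = build_job_request(
        "https://cde.example.com/api", "TOKEN", "nightly-etl", "etl_job.py"
    )
    print(req.get_method(), req.full_url)
```

In a real pipeline, the request would be sent with `urllib.request.urlopen(req)` once the endpoint and credentials are confirmed. Keeping request construction separate from sending makes the step easy to unit-test without network access.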
