Users of Cloudera Data Engineering (CDE) on CDP Public Cloud can now deploy, monitor, and schedule data pipelines on Microsoft Azure. Customers can take advantage of a fully managed Spark-on-Kubernetes service with autoscaling compute and guardrails to control cost, without the usual platform management overhead.
Unlike traditional job submission mechanisms, which require direct access to the cluster via edge nodes, CDE lets data engineers deploy data pipelines to autoscaling Virtual Clusters through simple, browser-based UI wizards or a full-fledged CLI and API. Once pipelines are deployed, CDE optimizes execution through performance metric profiling and real-time monitoring, and gives users a comprehensive view of their pipelines. For further operationalization, a managed Apache Airflow service can orchestrate complex pipelines on a schedule or in response to event triggers.
Future releases will add support for Spot and SSD instances, as well as Private Link. To get started, visit the documentation. Supported Azure regions can be found here.