3 Benefits of External IDE Connectivity, Now Available in Cloudera Data Engineering

Cloudera Employee

Accessing Apache Spark from Your Favorite IDE

Working with Apache Spark just got easier. Cloudera Data Engineering 1.23 introduces External IDE Connectivity, powered by Spark Connect. You can now interact with Spark clusters directly from local coding environments such as Jupyter Notebook, VS Code, and PyCharm. This capability streamlines data engineering workflows, enabling faster development and better collaboration while preserving enterprise-grade security.

This 6-minute demo walks through how to access Spark from your favorite IDE with Cloudera Data Engineering.

Key Benefits

External IDE Connectivity brings a new level of flexibility to data engineering: users can work with remote data from their local environment and move that work through secure, automated continuous integration and continuous delivery (CI/CD) pipelines. This approach offers several advantages:

1. Develop Locally, Compute Seamlessly

Developers can connect to Spark from local tools like Jupyter or VS Code while keeping data secure and governed. External IDE Connectivity allows teams to extract data from the open data lakehouse with Spark, analyze or test it locally, and run workloads at scale, all within a unified environment.
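To make the local-to-remote workflow concrete, here is a minimal sketch of connecting to a remote Spark endpoint with the open-source Spark Connect client in PySpark (3.4 or later). It is an illustration under stated assumptions, not the Cloudera Data Engineering procedure: the helper names `build_spark_connect_url` and `connect`, the hostname, and the token are all hypothetical, and Cloudera Data Engineering may use its own client package and connection details, so check the product documentation for the exact steps.

```python
from typing import Optional
from urllib.parse import quote


def build_spark_connect_url(host: str, port: int = 443,
                            token: Optional[str] = None,
                            use_ssl: bool = True) -> str:
    """Build a Spark Connect connection string: sc://host:port/;key=value;..."""
    params = []
    if use_ssl:
        params.append("use_ssl=true")
    if token:
        # Percent-encode the token so characters such as ';' or '='
        # cannot break the parameter list.
        params.append("token=" + quote(token, safe=""))
    url = f"sc://{host}:{port}/"
    if params:
        url += ";" + ";".join(params)
    return url


def connect(url: str):
    """Open a remote Spark session (requires pyspark with Spark Connect support)."""
    # Imported lazily so the URL helper above works without pyspark installed.
    from pyspark.sql import SparkSession
    return SparkSession.builder.remote(url).getOrCreate()
```

From a notebook or IDE, usage would look something like `spark = connect(build_spark_connect_url("spark-connect.example.com", token="my-token"))` followed by ordinary DataFrame code such as `spark.range(5).show()`. The DataFrame operations are sent to the remote cluster for execution, while the code itself stays in the local environment.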

2. Iterative Development with Flexible CI/CD Pipelining

This capability integrates Spark workflows seamlessly into DevOps processes. With External IDE Connectivity and Git-based version control, teams can automate testing, monitor changes, and accelerate the deployment of data pipelines.

3. Hybrid, Open, and Enterprise-Ready by Design

External IDE Connectivity is available in Cloudera Data Engineering both in the cloud and on premises. Cloudera Data Engineering's built-in multi-tenancy allows teams to securely run multiple workloads while optimizing resource usage and governance. With native, best-in-class Apache Spark and Apache Iceberg integration, Cloudera Data Engineering ensures high performance, optimized total cost of ownership (TCO), and support for open data architectures.

Ready to Learn More?