Announcing: Cloudera Enterprise 5.2 (CDH 5.2, Cloudera Manager 5.2, Cloudera Director 1.0, and Cloudera Navigator 2.1)

We're pleased to announce the release of Cloudera Enterprise 5.2 (comprising CDH 5.2, Cloudera Manager 5.2, Cloudera Director 1.0, and Cloudera Navigator 2.1).

 

This release reflects our continuing investments in Cloudera Enterprise's main focus areas, including security, integration with the partner ecosystem, and support for the latest innovations in the open source platform (including Impala 2.0, its most significant release yet, and Apache Hive 0.13.1). It also includes a new product, Cloudera Director, that streamlines deployment and management of enterprise-grade Hadoop clusters in cloud environments; new component releases for building real-time applications; and new support for significant partner technologies like EMC Isilon. Furthermore, this release ships the first results of joint engineering with Intel, including WITH GRANT OPTION for Hive and Impala and performance optimizations for MapReduce.

 

Here are some of the highlights (incomplete; see the respective Release Notes for CDH, Cloudera Manager, and Cloudera Navigator for full lists of features and fixes):

 

Security

  • Via Apache Sentry (incubating) 1.4, GRANT and REVOKE statements in Impala and Hive can now include WITH GRANT OPTION, for delegation of granting and revoking privileges (joint work with Intel under Project Rhino).

  • Hue has a new Sentry UI that supports policy management, letting you visually create and edit roles in Sentry and permissions on files in HDFS.

  • Kerberos authentication is now supported in Apache Accumulo.

  • In Impala, authentication can now be done through a combination of Kerberos and LDAP.
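
To make the delegation semantics of WITH GRANT OPTION concrete, here is a minimal Python sketch. It is a toy in-memory model, not Sentry's API or implementation: the class PrivilegeStore, its methods, and the role and table names used with it are all hypothetical and chosen only for illustration. The point it shows is that a role can pass a privilege on to another role only if it received that privilege with the grant option.

```python
from dataclasses import dataclass, field


@dataclass
class PrivilegeStore:
    """Toy model of role privileges (hypothetical, not Sentry's API)."""

    # Maps (role, privilege, obj) -> True if the role may re-grant the privilege.
    grants: dict = field(default_factory=dict)

    def admin_grant(self, role, privilege, obj, with_grant_option=False):
        """An administrator grants a privilege, optionally WITH GRANT OPTION."""
        self.grants[(role, privilege, obj)] = with_grant_option

    def has(self, role, privilege, obj):
        """Return True if the role currently holds the privilege on obj."""
        return (role, privilege, obj) in self.grants

    def _require_grant_option(self, grantor, privilege, obj):
        # A missing entry and an entry without the grant option are both rejected.
        if not self.grants.get((grantor, privilege, obj), False):
            raise PermissionError(
                f"{grantor} may not grant or revoke {privilege} on {obj}"
            )

    def delegate(self, grantor, grantee, privilege, obj):
        """A non-admin role passes a privilege on; needs the grant option."""
        self._require_grant_option(grantor, privilege, obj)
        # In this sketch, the delegated privilege is not itself re-grantable.
        self.grants[(grantee, privilege, obj)] = False

    def revoke(self, grantor, grantee, privilege, obj):
        """A non-admin role takes a privilege away; needs the grant option."""
        self._require_grant_option(grantor, privilege, obj)
        self.grants.pop((grantee, privilege, obj), None)
```

For example, after admin_grant("analyst_lead", "SELECT", "sales.orders", with_grant_option=True), a call to delegate("analyst_lead", "analyst", "SELECT", "sales.orders") succeeds, while the same call from a role that holds SELECT without the grant option raises PermissionError.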

Data Management and Governance

  • Cloudera Navigator 2.1 features a brand new auditing UI that is unified with lineage and discovery, so you now have access to all Navigator functionality from a single interface.

  • Navigator 2.1 includes role-based access control so you can restrict access to auditing, metadata, and policy management capabilities.

  • We’re also shipping a beta policy engine in Navigator 2.1. Targeted for GA by year-end, the policy engine allows you to set up rules and notifications so you can classify data as it arrives and integrate with data preparation and profiling tools. Try it out and let us know what you think!

  • And we’ve added lots of top-requested enhancements, such as Sentry auditing for Impala and integration with Hue.

Cloud Deployment

  • Cloudera Director is a simple and reliable way to deploy, scale, and manage Hadoop in the cloud (initially for AWS) in an enterprise-grade fashion. It’s free to download and use, and supported by default for Cloudera Enterprise customers. Features include:

    • Simple UI for self-service cluster spin up/teardown

    • Dynamic scaling for spiky workloads

    • Simple cloning of clusters

    • Cloud blueprints for repeatable deployments

    • Third-party software deployment within same workflow

    • Support for custom, workload-specific deployments

    • Support for complex cluster topologies

    • Minimum size cluster when capacity constrained

    • Multi-cluster dashboard

    • Instance tracking for account billing

Real-Time Architecture

 

  • Rebase on Apache HBase 0.98.6

    • Cell-level ACLs for fine-grained access control of data in HBase now supported

    • Backported improvements to get and put request scheduling and throttling that provide basic QoS for multi-tenant HBase tables and clusters. This lets some production and real-time workloads take priority over ad hoc and analytic jobs.

    • Backported patches that make Offheap Block Cache (aka bucket cache) production-ready. Now you can use large amounts of memory for read caching without the GC penalties of the past. Bucket cache is now the default.

    • Backported authentication of clients accessing HBase via the HBase Thrift Proxy.

  • Rebase on Apache Spark/Streaming 1.1

  • Rebase on Impala 2.0

  • Cloudera Search now provides:

    • Spark indexing for fast, iterative index design

    • distributed pivot facets

    • ability to expire documents

    • node fail recovery

    • support for deep paging and for multithreaded faceting

  • Apache Sqoop now supports import into Apache Parquet (incubating) file format

  • Apache Kafka integration with CDH is now incubating in Cloudera Labs; a Kafka-Cloudera Labs parcel (unsupported) is available for installation. Integration with Flume via a special Source and Sink has been provided.
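
The idea behind the HBase request-scheduling item above, letting production and real-time requests run ahead of ad hoc and analytic ones, can be illustrated with a small priority queue. This is a generic Python sketch and not HBase's scheduler: the class name, the two priority levels, and the request labels are assumptions made for illustration.

```python
import heapq
import itertools

# Lower number = served first (assumed priority levels for illustration).
PRODUCTION, AD_HOC = 0, 1


class PriorityRequestQueue:
    """Serve higher-priority requests first, FIFO within the same priority."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker to keep arrival order.
        self._counter = itertools.count()

    def submit(self, priority, request):
        """Queue a request under the given priority level."""
        heapq.heappush(self._heap, (priority, next(self._counter), request))

    def next_request(self):
        """Remove and return the most urgent pending request."""
        _priority, _seq, request = heapq.heappop(self._heap)
        return request

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Return all pending requests in the order they would be served."""
    order = []
    while len(queue):
        order.append(queue.next_request())
    return order
```

If two ad hoc scans and two production gets or puts are submitted in interleaved order, drain returns both production requests first and then the scans, each group in arrival order.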

Impala 2.0

 

  • Disk-based query processing: enables large queries to "spill to disk" if their in-memory structures are larger than the currently available memory. (Note that this feature only uses disk for the portion that doesn't fit in the available memory.)

  • Greater SQL compatibility: SQL 2003 analytic (window) functions, support for legacy data types (such as CHAR and VARCHAR), better compliance with SQL standards (WHERE, EXISTS, IN), and additional vendor-specific SQL extensions.

  • Impala 2.0 is now also available for CDH 4.
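
The "spill to disk" behavior described above can be sketched in a few lines of Python. This is a simplified illustration of the general technique, not Impala's implementation: the function name count_with_spill and the max_keys_in_memory budget are hypothetical, and keys are assumed to be strings without newlines. The function counts occurrences per key while holding only a bounded number of counters in memory; rows that do not fit are written to a temporary file and handled in a later pass, so disk is used only for the overflow.

```python
import tempfile
from collections import Counter


def count_with_spill(keys, max_keys_in_memory):
    """Count rows per key with a bounded number of in-memory counters.

    Returns (counts, spilled_rows), where spilled_rows is the total number
    of rows written to disk across all passes.
    """
    if max_keys_in_memory < 1:
        raise ValueError("max_keys_in_memory must be at least 1")

    totals = {}
    spilled_rows = 0
    source = iter(keys)
    previous_spill = None

    while True:
        in_memory = Counter()
        spill = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
        overflow = 0
        for key in source:
            if key in in_memory or len(in_memory) < max_keys_in_memory:
                in_memory[key] += 1          # fits in the memory budget
            else:
                spill.write(key + "\n")      # overflow goes to disk
                overflow += 1

        # Each key is fully counted in exactly one pass, so a plain update is safe.
        totals.update(in_memory)
        if previous_spill is not None:
            previous_spill.close()           # the previous pass's file is fully read
        if overflow == 0:
            spill.close()
            return totals, spilled_rows

        # Process the spilled rows in the next pass.
        spilled_rows += overflow
        spill.seek(0)
        source = (line.rstrip("\n") for line in spill)
        previous_spill = spill
```

When the budget is large enough for every distinct key, nothing is written to disk and spilled_rows is 0; with a smaller budget the result is the same but some rows take an extra pass through the temporary file.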

New Open Source Releases and Certifications

 

Cloudera Enterprise 5.2 includes multiple new component releases:

 

  • Apache Avro 1.7.6

  • Apache Crunch 0.11

  • Apache Hadoop 2.5

  • Apache HBase 0.98.6

  • Apache Hive 0.13.1

  • Apache Parquet (incubating) 1.5 / Parquet-format 2.1.0

  • Apache Sentry (incubating) 1.4

  • Apache Spark 1.1

  • Apache Sqoop 1.4.5

  • Impala 2.0

  • Kite SDK 0.15.0

...with new certifications on:

 

  • Filesystems: EMC Isilon

  • OSs: Ubuntu 14.04 (Trusty)

  • Java: Oracle JDK1.7.0_67

 

Over the next few weeks, we’ll publish blog posts that cover some of these and other new features in detail.

As always, we value your feedback; please provide any comments and suggestions through our community forums. You can also file bugs via issues.cloudera.org.
