Announcing: General Availability of Spark

Cloudera is pleased to announce the immediate availability of its first release of Apache Spark for Cloudera Enterprise (comprising CDH and Cloudera Manager).


Spark was created at UC Berkeley and contributed to the Apache Software Foundation, and it has quickly gained adoption for machine learning, interactive analytics, and streaming analytics over large datasets. It features a general programming model in which applications are written by composing arbitrary operators, such as mappers, reducers, joins, group-bys, and filters. Spark keeps track of the data that each operator produces, which lets applications reliably store that data in memory. This makes Spark well suited to low-latency computations and efficient iterative algorithms. Spark applications can be up to 100x faster and require 2x to 10x less code than equivalent MapReduce applications.
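
To make the operator-composition idea concrete, here is a minimal sketch of a word count written as a chain of three operators (flat-map, map, and reduce-by-key). It uses only the Python standard library and runs on a single machine; it is not Spark's API. The helper names `flat_map`, `reduce_by_key`, and `word_count` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from functools import reduce
from operator import add


def flat_map(func, records):
    """Apply func to each record and flatten the resulting iterables."""
    return [item for record in records for item in func(record)]


def reduce_by_key(func, pairs):
    """Group (key, value) pairs by key and fold each group's values with func."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(func, values) for key, values in groups.items()}


def word_count(lines):
    """Compose three operators: split lines, pair each word with 1, sum per word."""
    words = flat_map(str.split, lines)             # flat-map: lines -> words
    pairs = [(word.lower(), 1) for word in words]  # map: word -> (word, 1)
    return reduce_by_key(add, pairs)               # reduce-by-key: sum the 1s


if __name__ == "__main__":
    sample = ["Spark keeps data in memory", "data in memory is fast"]
    print(word_count(sample))
```

In this sketch each stage's output is an ordinary in-memory list that the next stage consumes, which loosely mirrors the idea of keeping intermediate data in memory between operators. A real Spark application would distribute that data across a cluster.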


Cloudera provides enterprise support for Spark through Cloudera Enterprise Flex Edition (as an optional component) and Data Hub Edition (as an incl.... This release provides Spark 0.9.0 tested for use with Spark Standalone Mode on CDH 4, from 4.4.0 forward. Expect releases for Cloudera Enterprise 5 (comprising CDH 5 and Cloudera Manager 5) and Spark on YARN in the near future.


To get started now, you can follow these instructions to install Spark using parcels with Cloudera Manager. The instructions also walk you through the basic configuration and a simple WordCount example on Spark.


Once you get going, we would love to hear your feedback: