09-28-2016 11:41 AM
Many of us Cloudera Enterprise users love Apache Pig because it makes it easy to build powerful and complex data transformation and integration pipelines that run as fault-tolerant batch jobs. However, we are afraid that Cloudera may be abandoning our beloved Pig because CDH has not kept up with any of the latest versions of Pig, which offer many new helpful capabilities. What can you tell us about Cloudera's plans for Pig?
Hortonworks HDP 2.5 contains Pig 0.16
MapR 5.2.0 contains Pig 0.15
IBM BigInsights 4.2.0 contains Pig 0.15
Cloudera CDH 5.8.0 contains Pig 0.12
10-09-2016 04:10 PM
Rebasing to a higher version of Pig is on our roadmap. Unfortunately, I can't provide specific guidance as to which release this will be included in or when that will happen, but work is actively in progress. Rest assured, however, that we haven't given up on Pig!
10-12-2016 02:47 PM
My name is Santosh and I am the Product Manager for Pig at Cloudera.
Cloudera is _not_ abandoning Pig at all. Pig is widely used among our customers. We are fully committed to supporting it.
Upstream Pig versions are not the best way to think about CDH Pig. CDH 5.8 Pig contains base v0.12 plus numerous features, enhancements, and fixes, including some from v0.16, provided these pass our stability and quality standards. That's because our customers have consistently told us that stability and quality are their highest priority when it comes to Pig.
We have not rebased since v0.12 because v0.13 added support for a pluggable compute engine (Tez), which caused significant code churn and introduced instability. The next two releases, v0.14 and v0.15, mostly stabilized Pig on Tez with very few additional features or enhancements. More recently, v0.16 has been released, which is a good candidate for a rebase, and work has already started on rebasing CDH Pig to v0.16.
To reiterate, Cloudera is fully committed to supporting Pig and making the most stable and reliable release of Pig available to our customers.
10-12-2016 02:55 PM - edited 10-12-2016 02:58 PM
Santosh, thank you kindly for your reply. I am sure that it is a relief for many of us to hear your good news.
Since you are the Product Manager for Pig at Cloudera, can you also please give us an update on your team's effort to make Pig run on Spark instead of MapReduce?