Pig Accumulator in Spark

TimothySpann — Thu, 09 Jun 2016 22:26:09 GMT

Are there functions out there that use something like the Accumulator interface in Pig, where the data doesn't have to stay in memory?

Re: Pig Accumulator in Spark

LesterMartin — Fri, 10 Jun 2016 21:41:22 GMT

I'm not aware of Spark's Accumulators being exposed as "first-class" objects in Pig. I have always advised that you would need to build a UDF for such activities, unless you can simply get away with filtering the things to count (such as "good" records and "rejects") into separate aliases and then counting them up.
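To make the "good" records versus "rejects" tally concrete, here is a minimal sketch in plain Python. It is neither Pig Latin nor the Spark API, only an illustration of counting in a single streaming pass so the full data set never has to sit in memory. The names `count_good_and_rejects`, `has_three_fields`, and `sample_records`, the three-field validity rule, and the sample data are all assumptions made up for this example.

```python
from typing import Callable, Dict, Iterable, Iterator


def count_good_and_rejects(
    records: Iterable[str],
    is_good: Callable[[str], bool],
) -> Dict[str, int]:
    """Tally good records and rejects in one pass over the input."""
    counts = {"good": 0, "rejects": 0}
    for record in records:  # only the current record is held at a time
        if is_good(record):
            counts["good"] += 1
        else:
            counts["rejects"] += 1
    return counts


def has_three_fields(record: str) -> bool:
    # Hypothetical rule: a good record has exactly three comma-separated fields.
    return len(record.split(",")) == 3


def sample_records() -> Iterator[str]:
    # Made-up data, yielded lazily to stand in for a large input.
    yield "1,alice,42"
    yield "2,bob"
    yield "3,carol,17"
    yield "garbage"


result = count_good_and_rejects(sample_records(), has_three_fields)
print(result)  # {'good': 2, 'rejects': 2}
```

Because the records come from a generator, only the two running counts persist between records. In Pig or Spark the same tally would be expressed with that engine's own constructs rather than this loop.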

Here is a blog post going down the UDF path: https://dzone.com/articles/counters-apache-pig

Good luck, and I'd love to hear if there is something available directly in Pig that I've been missing all along.
