Run a Pig UDF distributively over Tez/MapReduce?


Expert Contributor

Is there a way to run a Pig UDF in parallel across the cluster?

So far, in YARN, I'm seeing only one container being used.

I'm running Pig on Tez with a Java UDF that does some heavy lifting.

The tuple I'm passing to the UDF is a grouped bag.

1 REPLY

Re: Run a PIG UDF distributively over tez/map-reduce ?

Contributor

Hi, this link may help: see the section "Use the Parallel Features" (pig.exec.reducers.bytes.per.reducer, pig.exec.reducers.max) and the PARALLEL keyword.

https://pig.apache.org/docs/r0.7.0/cookbook.html#Use+the+PARALLEL+Clause
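
As a rough illustration of those settings, here is a minimal sketch of a script that groups the data and then calls a UDF on each grouped bag. The jar name, the class com.example.HeavyUdf, the paths, the schema, and all the numbers are hypothetical placeholders, not values taken from the question, and I'm assuming the settings are honored the same way under Tez as under MapReduce.

```pig
-- Hypothetical jar, UDF class, paths, and schema; replace with your own.
REGISTER 'my-udfs.jar';
DEFINE HeavyUdf com.example.HeavyUdf();

-- Script-wide default for the number of reduce tasks (illustrative value).
SET default_parallel 20;

-- Used only when no parallelism is given explicitly: Pig estimates the
-- reducer count from the input size, capped by pig.exec.reducers.max.
SET pig.exec.reducers.bytes.per.reducer 268435456;
SET pig.exec.reducers.max 100;

records = LOAD 'input/records' USING PigStorage('\t')
          AS (key:chararray, value:chararray);

-- PARALLEL on the GROUP overrides default_parallel for this operator.
grouped = GROUP records BY key PARALLEL 20;

-- The UDF receives the bag for each group; it runs in the reduce tasks.
results = FOREACH grouped GENERATE group, HeavyUdf(records);

STORE results INTO 'output/results';
```

Note that the grouping key limits how far the work can spread: with GROUP ALL, or with only a handful of distinct keys, most of the bags end up on one or a few tasks regardless of the PARALLEL value.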

Since you're running under Tez, this article may also help: https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html
