Run a Pig UDF in a distributed fashion over Tez/MapReduce?

Expert Contributor

Is there a way to run a Pig UDF in parallel across the cluster?

So far, in YARN, I'm seeing only one container being used.

I'm running Pig on Tez with a Java UDF that does some heavy lifting.

The tuple I'm passing to the UDF is a grouped bag.

1 REPLY

Contributor

Hi, this link may help. See the section "Use the Parallel Features" (pig.exec.reducers.bytes.per.reducer, pig.exec.reducers.max) and the PARALLEL keyword.

https://pig.apache.org/docs/r0.7.0/cookbook.html#Use+the+PARALLEL+Clause
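
As a rough sketch of what those settings look like in a script (the jar, class, paths, field names, and numeric values here are hypothetical, chosen only for illustration):

```pig
-- Hypothetical UDF jar and class name.
REGISTER 'my-udfs.jar';
DEFINE HeavyUdf com.example.HeavyUdf();

-- These two properties feed Pig's reducer estimate. They only matter
-- for operators that do not carry an explicit PARALLEL clause.
-- Example value: 256 MB of input per reducer.
SET pig.exec.reducers.bytes.per.reducer 268435456;
-- Example value: never use more than 50 reducers.
SET pig.exec.reducers.max 50;

-- Hypothetical input path and schema.
records = LOAD 'input/records' USING PigStorage('\t')
          AS (key:chararray, value:chararray);

-- PARALLEL requests 20 reduce-side tasks for this GROUP.
grouped = GROUP records BY key PARALLEL 20;

-- The UDF receives one bag per key, so different keys can be
-- handled by different tasks.
result = FOREACH grouped GENERATE group, HeavyUdf(records);

STORE result INTO 'output/result';
```

Note that PARALLEL only spreads work across distinct group keys. If the script uses GROUP ALL, or nearly all rows share one key, the whole bag still goes to a single task, which could explain seeing only one busy container.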

Since you are running it under Tez, this article may also help: https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html
