Is there a way to run a Pig UDF in parallel across the cluster?
So far in YARN I'm seeing only one container being used.
I'm running Pig on Tez with a Java UDF doing some heavy lifting.
The tuple I'm passing to the UDF is a grouped bag.
Hi, maybe this link will help: see the section "Use the Parallel Features" (pig.exec.reducers.bytes.per.reducer, pig.exec.reducers.max) and the PARALLEL keyword.
Since you are running it under Tez, this article may also help: https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html
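
For illustration, here is a rough sketch of how the settings and the PARALLEL keyword mentioned above might look in a script. The jar name, UDF class, paths, schema, and all numeric values are made up for the example, so treat them as assumptions and tune them for your own data and cluster.

```pig
-- All names and numbers below are hypothetical placeholders.
SET pig.exec.reducers.bytes.per.reducer 268435456;  -- assumed: about 256 MB of input per reducer
SET pig.exec.reducers.max 100;                      -- assumed upper bound on estimated reducers

REGISTER my-udfs.jar;                               -- hypothetical jar containing the Java UDF

records = LOAD 'input' USING PigStorage('\t') AS (key:chararray, value:chararray);

-- PARALLEL asks for this many reduce-side tasks for the GROUP;
-- the FOREACH that follows runs in those same tasks.
grouped = GROUP records BY key PARALLEL 20;

-- com.example.HeavyUdf is a hypothetical UDF that takes the grouped bag.
results = FOREACH grouped GENERATE group, com.example.HeavyUdf(records);

STORE results INTO 'output';
```

Note that each group's bag is handled by a single task, so if the GROUP produces only one key (or is a GROUP ALL), the UDF will still run in one container no matter what PARALLEL is set to.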