Run a Pig UDF in a distributed fashion over Tez/MapReduce?

Is there a way to run a Pig UDF in parallel across the cluster?

So far, in YARN, I'm seeing only one container being used.

I'm running Pig on Tez with a Java UDF that does some heavy lifting.

The tuple I'm passing to the UDF is a grouped bag.



Hi, this link may help: see the section "Use the Parallel Features" (pig.exec.reducers.bytes.per.reducer, pig.exec.reducers.max) and the PARALLEL keyword.
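
As a rough sketch of how those settings could be applied to a script like the one described: a UDF called in a FOREACH right after a GROUP runs on the reduce side, so the number of reducers determines how many copies of the UDF run at once. The jar name, UDF class, paths, field names, and numeric values below are all hypothetical placeholders, not taken from the original question.

```pig
-- Hypothetical jar and class names; substitute your own UDF.
REGISTER myudfs.jar;
DEFINE HeavyUdf com.example.HeavyUdf();

-- Option 1: let Pig estimate the reducer count from the input size.
-- A smaller bytes-per-reducer value yields more reducers for the same input.
-- The values here are illustrative assumptions only.
SET pig.exec.reducers.bytes.per.reducer 134217728;
SET pig.exec.reducers.max 50;

-- Hypothetical input path and schema.
records = LOAD 'input' USING PigStorage('\t') AS (key:chararray, value:chararray);

-- Option 2: set the reducer count explicitly with PARALLEL.
-- When PARALLEL is given, it takes precedence over the estimate above.
grouped = GROUP records BY key PARALLEL 20;

-- The UDF receives one grouped bag per key and runs in the reduce phase,
-- so up to 20 tasks can process different keys at the same time.
result = FOREACH grouped GENERATE group, HeavyUdf(records);

STORE result INTO 'output';
```

Note that parallelism is still bounded by the number of distinct group keys: if the data is grouped into a single key (for example with GROUP ... ALL), all records go to one reducer regardless of these settings.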

Since you are running it under Tez, this article may also help.