Can someone suggest a good source for learning to write UDFs for Pig?

Contributor
 
1 ACCEPTED SOLUTION

Super Guru

@Arun

What language are you interested in using for Pig UDFs? I personally prefer Python, as I'm most comfortable with it. However, you'll likely see better performance from Java-based UDFs.

This site provides a decent overview: http://help.mortardata.com/technologies/pig/writing_python_udfs. I find I learn best from examples. Here is a link to some examples they have written: https://github.com/mortardata/mortar-examples/tree/master/udfs/python.

You may find this link helpful as well: https://www.codementor.io/data-science/tutorial/extending-hadoop-apache-pig-with-python-udfs

There is also the Apache documentation: https://pig.apache.org/docs/r0.16.0/udf.html

Contributor
@Michael Young

Thanks for the quick reply. I am using Java-based UDFs for Pig. I have already gone through a few of the links you shared. The Mortardata examples look interesting, so I will look into them. Thanks for your support.

Super Guru

@Arun

I'm glad to help. If you run across any other links you find helpful, come back and share them!

Super Collaborator

Hi Michael, thanks for sharing these useful links.

Super Collaborator

Hi Arun,

If you are writing your UDF in Java, the steps below should make it straightforward.

Writing a UDF involves three main steps:

  1. Write the UDF class, compile it, and package it into a jar file.
  2. Register that jar file in Pig using the REGISTER command.
  3. Optionally, give the UDF class a short alias in your Pig script using the DEFINE command, then call it by that alias.

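    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Eval UDF that upper-cases the first field of the input tuple.
    public class Sample_Eval extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            // Return null for empty or null input instead of failing.
            if (input == null || input.size() == 0 || input.get(0) == null)
                return null;
            String str = (String) input.get(0);
            return str.toUpperCase();
        }
    }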

This is just a sample UDF written in Java. In the example above, I have written code that converts the contents of the given column to uppercase. You can write your own UDFs the same way, using your own custom code.
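One way to try out the upper-casing logic before packaging it into a jar is to keep it in a plain static method that does not need the Pig classes on the classpath. The sketch below is only an illustration: the class name UpperCaseLogic and the method firstFieldToUpper are hypothetical, and a plain List stands in for Pig's Tuple.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Pig): the UDF's logic with no Pig dependency.
class UpperCaseLogic {

    // Returns the upper-cased first field, or null when there is nothing to convert.
    // A List<Object> stands in for Pig's Tuple here (an assumption for illustration).
    static String firstFieldToUpper(List<Object> fields) {
        if (fields == null || fields.isEmpty() || fields.get(0) == null) {
            return null;
        }
        return fields.get(0).toString().toUpperCase();
    }

    public static void main(String[] args) {
        // A row whose first field is a string.
        System.out.println(firstFieldToUpper(Arrays.<Object>asList("hadoop", 1)));
        // An empty row yields null rather than an exception.
        System.out.println(firstFieldToUpper(Arrays.<Object>asList()));
    }
}
```

An exec method in the real UDF could then do little more than pull the first field out of the Tuple and apply the same checks, which keeps the part that needs a running Pig environment small.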

Contributor

@Mahesh Mallikarjunappa

Thanks for your suggestions. I am going through the material available online. I will soon consolidate the useful links and post them here.


I have a quick tutorial at https://martin.atlassian.net/wiki/x/C4BRAQ, in case it helps.