Can someone suggest a good source for learning to write UDFs for Pig?
Labels: Apache Pig
Created ‎09-07-2016 02:20 PM
Created ‎09-07-2016 02:43 PM
What language are you interested in using for Pig UDFs? I personally prefer Python, as I'm most comfortable with it. However, you'll likely see better performance from Java-based UDFs.
This site provides a decent overview: http://help.mortardata.com/technologies/pig/writing_python_udfs. I find I learn best from examples. Here is a link to some examples they have written: https://github.com/mortardata/mortar-examples/tree/master/udfs/python.
You may find this link helpful as well: https://www.codementor.io/data-science/tutorial/extending-hadoop-apache-pig-with-python-udfs
There is also the Apache documentation: https://pig.apache.org/docs/r0.16.0/udf.html
Created ‎09-07-2016 03:19 PM
Thanks for the quick reply. I am using Java-based UDFs for Pig. I have already gone through a few of the links you shared. The Mortardata examples look interesting; I will look into them. Thanks for your support.
Created ‎09-07-2016 03:48 PM
I'm glad to help. If you run across any other links you find helpful, come back and share them!
Created ‎09-08-2016 06:09 AM
Hi Michael, thanks for sharing the useful links.
Created ‎09-08-2016 06:07 AM
Hi Arun,
If you are writing the UDF in Java, the steps below should make it easy.
Writing a UDF involves three main steps:
- Write the UDF class, compile it, and package it into a jar file.
- Register that jar file in Pig with the REGISTER command.
- Optionally give the UDF a short alias in your Pig script with the DEFINE command.
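import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Eval UDF that upper-cases the first field of each input tuple.
public class Sample_Eval extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        // No input fields: return null so the result is treated as missing.
        if (input == null || input.size() == 0)
            return null;
        String str = (String) input.get(0);
        return str == null ? null : str.toUpperCase();
    }
}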
This is just a sample UDF in Java. In the example above, I have written code that converts the contents of the given column to uppercase. In the same way, you can write your own UDF with your custom code.
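One way to try out logic like this before packaging a jar is to keep the transformation in a plain method and call it directly. The sketch below is only an illustration under assumptions: `UpperCaseLogic` and `toUpper` are hypothetical names, not part of the Pig API, and a `java.util.List` stands in for Pig's `Tuple` so the code runs without Pig on the classpath.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Pig API: holds the transformation
// that an EvalFunc's exec method could delegate to.
class UpperCaseLogic {

    // Mirrors the null and empty-input guard of the UDF above, using a
    // plain List as a stand-in for Pig's Tuple (an assumption made so the
    // sketch runs without Pig on the classpath).
    static String toUpper(List<Object> fields) {
        if (fields == null || fields.isEmpty()) {
            return null;
        }
        Object first = fields.get(0);
        if (first == null) {
            return null;
        }
        // toString() instead of a cast, so a non-String field does not
        // raise ClassCastException.
        return first.toString().toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(toUpper(Arrays.asList((Object) "pig")));  // prints PIG
        System.out.println(toUpper(Arrays.asList((Object) null)));   // prints null
    }
}
```

In a real UDF, `exec` would read the first field from the `Tuple` and apply the same checks. Once the jar is built, a script would typically load it with REGISTER, optionally alias the class with DEFINE, and call it inside a FOREACH ... GENERATE statement.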
Created ‎09-08-2016 01:33 PM
Thanks for your suggestions. I am going through the material available online and will soon consolidate the useful links and post them here.
Created ‎10-05-2016 06:11 AM
Hello everyone,
Sharing a few of the useful links I came across:
http://hortonworks.com/hadoop-tutorial/how-to-use-basic-pig-commands/
http://events.linuxfoundation.org/sites/events/files/slides/Pig_for_DataScience_0.pdf
https://www.pluralsight.com/courses/pig-latin-getting-started
http://chimera.labs.oreilly.com/books/1234000001811/ch05.html#registering_udfs
Created ‎10-05-2016 02:35 PM
I have a quick tutorial at https://martin.atlassian.net/wiki/x/C4BRAQ, in case it helps.
