How to register a UDF jar with Hue and Pig?

Contributor

Hi,

I am having trouble registering a UDF jar and I am hoping someone can walk me through the process. I'm a total newb with the Hadoop software stack and Cloudera.

 

I have the Cloudera 5.2 Express VM running in VirtualBox.

I have the jars in HDFS: /tmp/elephant-bird-core.4.5.jar and /tmp/elephant-bird-pig.4.5.jar

I have a Pig script with two file resources specified, one for each of the jars above.

My pig script looks like this:

register /tmp/elephant-bird-core-4.5.jar;
register /tmp/elephant-bird-pig-4.5.jar;

A = LOAD '/tmp/test.json' USING com.twitter.elephantbird.pig.jsonloader('-nestedLoad');
describe A;

My error looks like this:

ERROR org.apache.pig.tools.grunt.Grunt - Error 101: file '/tmp/elephant-bird-core-4.5.jar' does not exist

There are also a bunch of warnings about deprecated features. Note: I have not configured the system beyond the screens that appear when the VM first boots, which ask to install all the software and create a user.

 

Any help would be greatly appreciated...  I must admit I didn't think I'd get stuck at step #1 of my exploration of Cloudera and Hadoop.

1 ACCEPTED SOLUTION

Contributor

All I can say is programming really gives me a headache sometimes. For those who run into the same problem as I did, the solution is to type the class name with the same capitalisation as in the source code of the UDF. The correction that gets it all working is JsonLoader instead of what I typed originally: jsonloader. Note that the fully qualified name also includes the load package.

 

The correct code is:

A = LOAD '/tmp/test.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS (json:map[]);

 

 


3 REPLIES

Contributor

Also, the register commands don't seem to be necessary as long as the UDF jars are specified as file resources.
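
Combining this with the accepted solution, the whole script might look roughly like the sketch below. It assumes both Elephant Bird jars are attached to the script as file resources in the Hue Pig editor, which is why the register lines are left out. It has only been pieced together from the posts in this thread, so treat it as a starting point rather than a verified recipe.

```pig
-- Assumption: elephant-bird-core and elephant-bird-pig are attached to this
-- script as file resources in Hue, so no REGISTER statements are included.

-- Class names are case-sensitive: JsonLoader, in the "load" package.
A = LOAD '/tmp/test.json'
    USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
    AS (json:map[]);

DESCRIBE A;
```

If the loader class still cannot be resolved, the first things to check are the capitalisation of the class name and whether both jars really are listed as file resources.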

 

Super Guru
Thanks for the feedback. We are currently preparing a new version of the
Editor which is going to make importing UDFs easier.

Romain