piggybank jar file does not exist

Hi all,

I am new to Sandbox and am trying to run Pig on Microsoft Azure.

To load one of my tables, I need to use the piggybank jar. I have downloaded it and saved it to HDFS under /tmp/stackexchange.

Here is the code I am trying to run:

REGISTER /tmp/stackexchange/piggybank.jar;

RAW_LOGS1 = LOAD 'Query_1-50000.csv'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE')
    AS (Id:long, PostTypeID:chararray, AcceptedAnswerID:chararray, ParentID:chararray,
        CreationDate:chararray, DeletionDate:chararray, Score:long, ViewCount:long,
        Body:chararray, OwnerUserID:chararray, OwnerDisplayName:chararray,
        LastEditorUserId:chararray, LastEditorDisplayName:chararray, LastEditDate:chararray,
        LastActivityDate:chararray, Title:chararray, Tags:chararray, AnswerCount:int,
        CommentCount:int, FavoriteCount:int, ClosedDate:chararray, CommunityOwnedDate:chararray);

However, I am being returned the error message:

2016-03-20 17:22:48,506 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: file '/tmp/stackexchange/piggybank.jar' does not exist.

Does anyone know what could be wrong? Am I missing a step required to register the piggybank file perhaps?

Any help is greatly appreciated - thanks in advance.

1 ACCEPTED SOLUTION

Re: piggybank jar file does not exist

I ran into a similar problem a while back, as described at https://martin.atlassian.net/wiki/x/C4BRAQ. Try replacing

REGISTER /tmp/stackexchange/piggybank.jar

with

REGISTER 'hdfs:///tmp/stackexchange/piggybank.jar'

and let us know if that works.
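
For anyone landing here later, a minimal sketch of the corrected script follows. As far as I understand it, an unquoted path with no scheme is looked up on the local filesystem of the machine running Pig, which would explain why the copy in HDFS was not found. The sketch assumes the jar is at /tmp/stackexchange in HDFS and that the CSV file is in your HDFS home directory. The schema from the question is omitted for brevity, and the alias first_rows is only an illustrative name.

```pig
-- Quoted hdfs:// URI, so the jar is read from HDFS instead of the local disk.
REGISTER 'hdfs:///tmp/stackexchange/piggybank.jar';

-- Same loader as in the question; append the AS (...) schema from the question if needed.
RAW_LOGS1 = LOAD 'Query_1-50000.csv'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE');

-- Print a few rows to confirm that the jar was registered and the loader runs.
first_rows = LIMIT RAW_LOGS1 5;
DUMP first_rows;
```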


Re: piggybank jar file does not exist

Brilliant - that works. Thanks!

Re: piggybank jar file does not exist

Mentor

As long as you use HDP and you have pig client installed on your edgenode, you can find the piggybank jar in /usr/hdp/current/pig-client/lib/piggybank.jar. You don't need to download it separately or upload it to HDFS.

For an example, see https://community.hortonworks.com/questions/20487/store-output-file-as-3-files-using-pig.html
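
A minimal sketch of what that looks like in a script, assuming an HDP install with the Pig client present on the machine where you start Pig (the path is the one given above and is on the local filesystem, not in HDFS):

```pig
-- Local path on the edge node; no hdfs:// prefix because the jar is not in HDFS.
REGISTER /usr/hdp/current/pig-client/lib/piggybank.jar;
```

After this, org.apache.pig.piggybank.storage.CSVExcelStorage should be usable in LOAD statements in the same way as with a jar registered from HDFS.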

Re: piggybank jar file does not exist

Hi Artem - thanks for the response. Can you please explain "As long as you use HDP and you have pig client installed on your edgenode"? Is this something additional I need to do or install?

I cannot locate the folder "/usr" in HDFS.

Thanks for your help,

Maeve

Re: piggybank jar file does not exist

Mentor

HDP installs are placed in /usr/hdp/<version>, so if you are on HDP, look for /usr/hdp on your local filesystem, not in HDFS. Then, in Ambari, make sure the Pig client is installed on the machine you are working on, and look for the jar in the directory I mentioned earlier. @Maeve Ryan
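
One way to see the local versus HDFS difference without leaving Pig is from the Grunt shell, which has both an sh command (local shell) and an fs command (Hadoop filesystem). This is only a sketch, assuming the paths mentioned earlier in this thread:

```pig
-- 'sh' runs the command on the local filesystem of the machine you are logged in to.
sh ls -l /usr/hdp/current/pig-client/lib/piggybank.jar

-- 'fs' runs a Hadoop filesystem command, so this path is looked up in HDFS.
fs -ls /tmp/stackexchange/piggybank.jar
```

If the first command lists the file, the local REGISTER path should work. If only the second one does, use the quoted hdfs:/// form from the accepted answer.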

Re: piggybank jar file does not exist

Ah - understood now. This worked! Thank you :)
