piggybank jar file does not exist

Hi all,

I am new to Sandbox and am trying to run Pig on Microsoft Azure.

To load one of my tables, I need to use the piggybank jar. I have downloaded it and saved it to HDFS under the path /tmp/stackexchange.

Here is the code I am trying to run:

REGISTER /tmp/stackexchange/piggybank.jar
RAW_LOGS1 = LOAD Query_1-50000.csv USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', YES_MULTILINE) as (Id:Long, PostTypeID:chararray, AcceptedAnswerID:chararray, ParentID:chararray, CreationDate:chararray, DeletionDate:chararray, Score:long, ViewCount:long, Body:chararray, OwnerUserID:chararray, OwnerDisplayName:chararray, LastEditorUserId:chararray, LastEditorDisplayName:chararray, LastEditDate:chararray, LastActivityDate:chararray, Title:chararray, Tags:chararray, AnswerCount:int, CommentCount:int, FavoriteCount:int, ClosedDate:chararray, CommunityOwnedDate:chararray);

However, I am being returned the error message:

2016-03-20 17:22:48,506 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: file '/tmp/stackexchange/piggybank.jar' does not exist.

Does anyone know what could be wrong? Am I missing a step required to register the piggybank file perhaps?

Any help is greatly appreciated - thanks in advance.

Accepted Solutions
Re: piggybank jar file does not exist

I had similar trouble a while back, as described at https://martin.atlassian.net/wiki/x/C4BRAQ. Try replacing

REGISTER /tmp/stackexchange/piggybank.jar

with

REGISTER 'hdfs:///tmp/stackexchange/piggybank.jar'

and let us know if that works.
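
For reference, here is a minimal sketch of how the whole script might look with that change. It assumes the jar really is at /tmp/stackexchange/piggybank.jar in HDFS. The quotes around the CSV path and around YES_MULTILINE, and the lowercase type names, are my assumptions and not something confirmed in the original post, so adjust them to match your setup.

```pig
-- To confirm the jar is in HDFS first, run from the grunt shell:
--   fs -ls /tmp/stackexchange
-- The hdfs:// scheme tells Pig to read the jar from HDFS instead of
-- looking for the path on the local filesystem.
REGISTER 'hdfs:///tmp/stackexchange/piggybank.jar';

-- Same loader and schema as in the question; the quoting of the file
-- name and of 'YES_MULTILINE' is an assumption.
RAW_LOGS1 = LOAD 'Query_1-50000.csv'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE')
    AS (Id:long, PostTypeID:chararray, AcceptedAnswerID:chararray,
        ParentID:chararray, CreationDate:chararray, DeletionDate:chararray,
        Score:long, ViewCount:long, Body:chararray, OwnerUserID:chararray,
        OwnerDisplayName:chararray, LastEditorUserId:chararray,
        LastEditorDisplayName:chararray, LastEditDate:chararray,
        LastActivityDate:chararray, Title:chararray, Tags:chararray,
        AnswerCount:int, CommentCount:int, FavoriteCount:int,
        ClosedDate:chararray, CommunityOwnedDate:chararray);

-- Print the schema to check that the loader class was resolved.
DESCRIBE RAW_LOGS1;
```

The relative path 'Query_1-50000.csv' is resolved against your HDFS home directory when Pig runs in MapReduce mode, so the CSV needs to be there as well.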

Re: piggybank jar file does not exist

Brilliant - that works. Thanks!

Re: piggybank jar file does not exist

As long as you use HDP and have the Pig client installed on your edge node, you can find the piggybank jar at /usr/hdp/current/pig-client/lib/piggybank.jar. You don't need to download it separately or upload it to HDFS.

For an example, see https://community.hortonworks.com/questions/20487/store-output-file-as-3-files-using-pig.html

Re: piggybank jar file does not exist

Hi Artem - thanks for the response. Can you please explain "As long as you use HDP and you have pig client installed on your edgenode"? Is this something additional I need to do/install?

I cannot locate the folder "/usr" on HDFS.

Thanks for your help,

Maeve

Re: piggybank jar file does not exist

HDP installs are placed in /usr/hdp/<version>, so if you are on HDP, look for /usr/hdp on your local filesystem, not in HDFS. Then, in Ambari, make sure the Pig client is installed on the machine you are working on, and look for the jar in the directory I specified earlier. @Maeve Ryan
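
As a rough sketch of what that looks like in a script, assuming the Pig client is installed on the node where you launch Pig and the jar is at the path mentioned earlier:

```pig
-- To confirm the jar is on the local disk first, run from the grunt shell:
--   sh ls -l /usr/hdp/current/pig-client/lib/piggybank.jar
-- A path without a scheme is looked up on the local filesystem of the
-- machine running Pig, so no upload to HDFS is needed.
REGISTER /usr/hdp/current/pig-client/lib/piggybank.jar;
```

If the ls shows nothing, the Pig client is probably not installed on that machine.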

Re: piggybank jar file does not exist

Ah - understood now. This worked! Thank you :)
