New Contributor
Posts: 5
Registered: ‎02-07-2018
Accepted Solution

Running Pig scripts from HUE editor - Job gets killed

Hi,

I am new to HDFS/Pig and need quick help. I just installed the Cloudera QuickStart VM using VMware.

When I run this script (the upper-case text sample):

 

data = LOAD '/home/cloudera/midsummer.txt' as (text:CHARARRAY);
upper_case = FOREACH data GENERATE org.apache.pig.piggybank.evaluation.string.UPPER(text);
STORE upper_case INTO '/home/cloudera/midsummer2.txt';

 

It takes 3 to 4 minutes, runs the map part, and never does the reduce. I see the following error on the job workflow page:

Cannot access: /user/hue/oozie/workspaces/hue-oozie-1452553957.19/${wf:appPath()}/pig-b118.pig/. Note: you are a Hue admin but not a HDFS superuser, "hdfs" or part of HDFS supergroup, "supergroup".

InvalidPathException: Invalid path name Invalid file name: /user/hue/oozie/workspaces/hue-oozie-1452553957.19/${wf:appPath()}/pig-b118.pig (error 400)

 

I logged in to Hue as the cloudera user and, after seeing this error, also added cloudera to the hadoop group. From Oozie I see the error: JA0189 - Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]

 

Your help will be appreciated.

Thanks 

Regards

Santhosh

 

New Contributor
Posts: 5
Registered: ‎02-07-2018

Re: Running Pig scripts from HUE editor - Job gets killed

We would like to consider Cloudera for development and eventual commercial use if I can get over these initial hurdles. Just to add: all I want is to run a simple Pig script using the HUE editor on a newly installed Cloudera QuickStart VM.

- Should I log in to HUE as cloudera?

- Is there any additional configuration needed, or any software that needs to be upgraded?

 

New Contributor
Posts: 5
Registered: ‎02-07-2018

Re: Running Pig scripts from HUE editor - Job gets killed

Update: After some testing with a small data set, the following works (without the STORE clause):

data = LOAD '/home/cloudera/smallfile.txt' as (text:CHARARRAY);
upper_case = FOREACH data GENERATE UPPER(text);

 

But once we add a STORE or DUMP clause at the end of the script, those errors are thrown. I guess this points to a permission issue between HUE and HDFS in the "reduce" process.

 

I was hoping the Cloudera QuickStart VM would work out of the box. Calling on the Cloudera experts to help!

Please point me to any knowledge base or workarounds to avoid this problem.

New Contributor
Posts: 5
Registered: ‎02-07-2018

Re: Running Pig scripts from HUE editor - Job gets killed

Looks like these Cloudera community help requests go into a black hole! I was able to find a solution myself. The following script works and was good enough for this trial. It appears that specifying the directory paths in the Pig script was the problem. I put all the scripts in my local directory and also told Pig/Hadoop to create the output directory under the current working directory.

 

data = LOAD 'midsummer.txt' as (text:CHARARRAY);
upper_case = FOREACH data GENERATE UPPER(text);
STORE upper_case INTO 'midsummerOutput';

 

New Contributor
Posts: 5
Registered: ‎02-07-2018

Re: Running Pig scripts from HUE editor - Job gets killed

I should have used the following code. It specifies the correct HDFS directory, which Hadoop understands.

 

data = LOAD '/user/cloudera/midsummer.txt' as (text:CHARARRAY);
upper_case = FOREACH data GENERATE UPPER(text);
STORE upper_case INTO '/user/cloudera/midsummerOutput2';
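
For anyone hitting the same problem, here is a minimal Python sketch of why the relative and the /user/cloudera paths end up at the same place. It is only a model, not Pig or Hadoop code: the helper name resolve_hdfs_path is hypothetical, and it assumes the default HDFS convention that a user's home directory is /user/<username> and that the script has not changed its working directory. Under those assumptions, a path like /home/cloudera/midsummer.txt is looked up on HDFS as written, not on the VM's local Linux filesystem, which is probably why the first script failed.

```python
import posixpath


def resolve_hdfs_path(path: str, user: str) -> str:
    """Model how a path in a Pig script is interpreted in mapreduce mode.

    Assumption: the HDFS home directory is /user/<user> and it is also the
    current working directory of the script.
    """
    if posixpath.isabs(path):
        # Absolute paths are taken as-is, as HDFS paths.
        return posixpath.normpath(path)
    # Relative paths are resolved against the HDFS home directory.
    home = posixpath.join("/user", user)
    return posixpath.normpath(posixpath.join(home, path))


if __name__ == "__main__":
    for p in ("midsummer.txt", "midsummerOutput", "/home/cloudera/midsummer.txt"):
        print(p, "->", resolve_hdfs_path(p, "cloudera"))
```

So 'midsummer.txt' and '/user/cloudera/midsummer.txt' name the same HDFS file for the cloudera user, while '/home/cloudera/midsummer.txt' names an HDFS location that most likely does not exist on a fresh VM.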

Posts: 819
Kudos: 93
Solutions: 47
Registered: ‎04-06-2015

Re: Running Pig scripts from HUE editor - Job gets killed

Congratulations on solving your issue, @SGeorge. Also, rest assured, your posts on the community do not enter a black hole from which there is no return. The community is a peer-to-peer forum, so there are cases where questions go unanswered or take some time to receive a reply.

 

 

Cy Jervis, Community Manager


Champion
Posts: 739
Registered: ‎05-16-2016

Re: Running Pig scripts from HUE editor - Job gets killed

Just some quick info: you can run Pig in local mode as well as in mapreduce mode.

By default, LOAD looks for your data on HDFS in a tab-delimited file, using the default load function PigStorage.

Also, if you start Pig with pig -x local (local mode), it will look for the data on the local filesystem.

Nice that you found the fix, @SGeorge.
