
PIG script does not work from HUE - YARN is pointing to MapReduce JobTrackers on 50030 after upgrade


Re: PIG script does not work from HUE - No groups available for user admin

Expert Contributor

Thanks for your help, Romain.

The sharelib is the one used for YARN: oozie-sharelib-yarn.tar.gz

I've enclosed the configuration of the job from Oozie, and it looks like it is using YARN. The job starts but never finishes; instead it repeats "Heart beat" over and over. I see an entry in the log that refers to port 50030, which is why it looks like it is using MRv1. However, I can see the job in YARN's ResourceManager: it is RUNNING, but it never finishes until killed.

Name                          Value
hue-id-w                      59
jobTracker                    servername05:8032
mapreduce.job.user.name       admin
nameNode                      hdfs://namenode02:8020
oozie.use.system.libpath      true
oozie.wf.application.path     hdfs://namenode02:8020/user/hue/oozie/workspaces/_admin_-oozie-59-1408466201.2
user.name                     admin

Re: PIG script does not work from HUE - No groups available for user admin

Expert Contributor
This might help, in the Hue Server Logs, I see the following error:

[19/Aug/2014 11:12:45 -0700] api ERROR An error happen while watching the demo running: Could not find job job_1408403413938_0008.
[19/Aug/2014 11:12:45 -0700] connectionpool DEBUG "GET /ws/v1/history/mapreduce/jobs/job_1408403413938_0008 HTTP/1.1" 404 None
[19/Aug/2014 11:12:45 -0700] connectionpool DEBUG Setting read timeout to None
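
The 404 above comes from the job history endpoint, which typically knows only about finished jobs. For a job that is still RUNNING, the ResourceManager REST API can be asked instead. The following is a minimal sketch, not what Hue itself does: it assumes the ResourceManager web UI listens on the default port 8088 on servername05 (the thread only confirms port 8032 for job submission), and the helper names are hypothetical.

```python
import json
import re
from urllib.request import urlopen

# MapReduce job ids and YARN application ids share the same two numeric parts.
JOB_ID_RE = re.compile(r"^job_(\d+)_(\d+)$")


def job_to_application_id(job_id: str) -> str:
    """Map job_<cluster-ts>_<seq> to application_<cluster-ts>_<seq>."""
    match = JOB_ID_RE.match(job_id)
    if match is None:
        raise ValueError(f"not a MapReduce job id: {job_id!r}")
    cluster_ts, seq = match.groups()
    return f"application_{cluster_ts}_{seq}"


def rm_app_url(rm_host: str, job_id: str, rm_port: int = 8088) -> str:
    """Build the ResourceManager REST URL for one application.

    rm_port=8088 is an assumption (the usual default web UI port).
    """
    app_id = job_to_application_id(job_id)
    return f"http://{rm_host}:{rm_port}/ws/v1/cluster/apps/{app_id}"


def fetch_app_state(rm_host: str, job_id: str, rm_port: int = 8088,
                    timeout: float = 10.0) -> str:
    """Return the application state (for example RUNNING) reported by the RM."""
    with urlopen(rm_app_url(rm_host, job_id, rm_port), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["app"]["state"]
```

Calling `fetch_app_state("servername05", "job_1408403413938_0008")` against a live cluster would show whether YARN still considers the job to be running, independently of the history server.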

Re: PIG script does not work from HUE - No groups available for user admin

Ha, so it is probably another problem.

What type of cluster do you have? How many nodes?

Could you look at #5?
http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/

Romain

Re: PIG script does not work from HUE - No groups available for user admin

Expert Contributor
We run 12 CDH 5.1 nodes managed by CM 5.0.2. We recently upgraded from CDH 4.7 to CDH 5.1 and since the update we have not been able to run a Pig script using YARN/MapReduce.

We run the following services:

Flume
HDFS 2.3.0-cdh5.1.0
HBase 0.98.1-cdh5.1.0
Hive
Hue 3.6.0
Impala
Oozie
Solr
Spark
YARN (with MRv2)
ZooKeeper

Re: PIG script does not work from HUE - No groups available for user admin

Expert Contributor
Romain,

I applied the change from step #5 in the document (http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/), but unfortunately it did not help, even though what it describes looks very similar to my problem.

I tried to narrow down the problem I'm having with running Pig scripts through Hue and YARN. Here is what I do:

1. Create a Pig Script in Hue:

offers = LOAD '/tmp/datafile.txt' USING PigStorage AS (name:CHARARRAY);

The script succeeds.

2. However, when I add a dump to the script, like this:

offers = LOAD '/tmp/datafile.txt' USING PigStorage AS (name:CHARARRAY);
dump offers;

The script never moves past 0% and repeats "Heart beat" over and over again. The job displays in Oozie but never goes anywhere (it is stuck on RUNNING). This same script worked in CDH 4.7 using MRv1. I can't find much in the logs to help identify the problem; the job just never finishes.

Here is an excerpt from the job's log:

2014-08-19 14:31:01,128 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://servernode05:50030/jobdetails.jsp?jobid=job_1408403413938_0014
2014-08-19 14:31:01,227 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
Heart beat
Heart beat
Heart beat
Heart beat
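
To pull the tracking URL out of a log line like the one above and see which port it points at, something like the following could be used. It is only a sketch: port 50030 is the usual MRv1 JobTracker web UI port, but a URL on that port in the Pig log does not by itself prove the job runs on MRv1 (as noted earlier in the thread, the job does show up in the YARN ResourceManager). The function name and the returned keys are hypothetical.

```python
import re
from urllib.parse import parse_qs, urlparse

# Pig's MapReduceLauncher prints the tracking URL after this marker.
MORE_INFO_RE = re.compile(r"More information at: (\S+)")

# Usual default web UI port of the MRv1 JobTracker.
MRV1_JOBTRACKER_UI_PORT = 50030


def inspect_tracking_url(log_line: str):
    """Return host, port and job id from a 'More information at:' line.

    Returns None when the line carries no tracking URL.
    """
    match = MORE_INFO_RE.search(log_line)
    if match is None:
        return None
    url = urlparse(match.group(1))
    job_ids = parse_qs(url.query).get("jobid", [])
    return {
        "host": url.hostname,
        "port": url.port,
        "job_id": job_ids[0] if job_ids else None,
        "looks_like_mrv1_ui": url.port == MRV1_JOBTRACKER_UI_PORT,
    }
```

Lines without a tracking URL, such as the repeated "Heart beat" output, simply return `None`.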

Re: PIG script does not work from HUE - No groups available for user admin

To check whether it is an Oozie-related problem (setup, or a lack of slots for
running jobs, since Oozie starts an MR launcher), did you try a basic example?
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Quick-Start/cdh5qs_y...

Romain
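
A rough way to reason about the "lack of slots" case: a Pig action submitted through Oozie needs containers for the launcher job as well as for the Pig job itself. The sketch below is back-of-the-envelope arithmetic, not a statement of how the scheduler allocates. It assumes every container has the same size and that the minimum is four containers (launcher ApplicationMaster, launcher map task, Pig job ApplicationMaster, one map task). The class, the function names, and the numbers in the usage note are illustrative.

```python
from dataclasses import dataclass

# Assumed minimum for one Oozie Pig action: launcher AM + launcher map task
# + Pig MapReduce job AM + at least one map task.
MIN_CONTAINERS_FOR_OOZIE_PIG = 4


@dataclass
class ClusterMemory:
    nodes: int            # number of NodeManagers
    node_memory_mb: int   # yarn.nodemanager.resource.memory-mb per node
    container_mb: int     # assumed uniform container size


def total_containers(cluster: ClusterMemory) -> int:
    """Containers that fit in the cluster when all have the same size."""
    return cluster.nodes * (cluster.node_memory_mb // cluster.container_mb)


def can_run_oozie_pig(cluster: ClusterMemory) -> bool:
    """True if the launcher and the Pig job can hold containers at once."""
    return total_containers(cluster) >= MIN_CONTAINERS_FOR_OOZIE_PIG
```

With hypothetical values, `can_run_oozie_pig(ClusterMemory(nodes=1, node_memory_mb=2048, container_mb=1024))` is false: the launcher would occupy both containers and wait forever for the Pig job, which matches the "stuck at 0% with Heart beat" symptom.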

Re: PIG script does not work from HUE - No groups available for user admin

Expert Contributor

Romain,

Thank you so much for your help, and for sticking with me through this problem. I have resolved the issue. There were actually two problems. First, after the upgrade to CDH 5, I had to stop Oozie and install the ShareLib. Second, in YARN I had to adjust the resources: the Java Heap Size had been set to 50 MB even though 8 GB of memory is available on each node (I set the heap to 1 GB on the nodes and on the ResourceManager). I don't know why the CDH upgrade would default to such a low number; it made YARN completely unusable. This explains why jobs would hang forever: there were not enough resources available. However, the logs did indicate this problem.

I have one last question: how much memory do you give to the Java heap on the ResourceManager (under "Java Heap Size of ResourceManager in Bytes") when the nodes are given 1 GB? I gave it 1 GB to resolve the problem, but I'm not sure that is enough. And what about the container sizes?

Thanks,

Kevin

Re: PIG script does not work from HUE - No groups available for user admin

New Contributor

Hi guys,

I ran into the same issue as you.

I ran a simple Pig script such as Load 'File'; Dump data;

However, the script never completes and the logs always show 0% complete.

I searched the logs and found that the map task is always running but never completes. I have adjusted the settings you listed above, but the issue still exists.

Do you have any other ideas?

Thanks very much

Re: PIG script does not work from HUE - No groups available for user admin

Expert Contributor

Edmund,

When I've seen Pig scripts show 0% complete and never finish, I've usually resolved this by adjusting YARN. How many nodes are you running in your cluster? How much memory is available to your nodes?

Kevin