
PIG script does not work from HUE - YARN is pointing to MapReduce JobTrackers on 50030 after upgrade

Rising Star

When I launch a simple MapReduce Pig script from Hue that requires data from HDFS, I receive an error that there is no such user as admin.

 

I recently upgraded from CDH 4.7 to CDH 5.1.0. I am using CM 5.0 to manage the cluster. I am using HDFS, Hue 3.6.0, and YARN with MRv2. The script simply reads from a file and cross joins with another file. The script worked on CDH 4.7, but fails after the upgrade to CDH 5.1.

 

I found no logs in Hue that were helpful, but in the YARN ResourceManager log I found something very useful:

 

2014-08-13 13:24:37,322 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1405638744143_0028 State change from NEW_SAVING to SUBMITTED
2014-08-13 13:24:37,379 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user admin
org.apache.hadoop.util.Shell$ExitCodeException: id: admin: No such user

	at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
	at org.apache.hadoop.util.Shell.run(Shell.java:424)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:745)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:728)
	at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:84)
	at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
	at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
	at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1415)
	at org.apache.hadoop.security.authorize.AccessControlList.isUserAllowed(AccessControlList.java:222)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.hasAccess(AllocationConfiguration.java:225)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.hasAccess(FSQueue.java:150)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:622)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1201)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:122)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:585)
	at java.lang.Thread.run(Thread.java:745)
2014-08-13 13:24:37,381 WARN org.apache.hadoop.security.UserGroupInformation: No groups available for user admin
2014-08-13 13:24:37,381 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1405638744143_0028 from user: admin, in queue: default, currently num of applications: 4
1 ACCEPTED SOLUTION

Rising Star

Romain,

 

Thank you so much for your help, and for sticking with me through this problem. The issue is resolved. There were actually two problems. First, after the upgrade to CDH 5, I had to stop Oozie and install the ShareLib. Second, I had to adjust the resources in YARN. The Java heap size had been set to 50 MB even though 8 GB of memory is available on each node (I set the heap to 1 GB on the nodes and the ResourceManager). I don't know why the CDH upgrade would default to such a low number - it made YARN completely unusable. This explains why jobs would hang forever: there were not enough resources available. However, the logs did not clearly indicate this problem.
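For anyone who runs into the same thing, the container-related knobs live in yarn-site.xml / mapred-site.xml; below is a sketch of the kind of properties involved (the values are purely illustrative for 8 GB nodes, not a recommendation, and in Cloudera Manager they are set through the equivalent configuration fields):

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>6144</value> <!-- illustrative: memory each NodeManager offers to containers -->
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value> <!-- illustrative: smallest container the scheduler will allocate -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>6144</value> <!-- illustrative: largest single container -->
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value> <!-- illustrative: container size requested for each map task -->
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx768m</value> <!-- illustrative: JVM heap inside each map container -->
</property>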

 

I have one last question: how much memory do you give to the Java heap on the ResourceManager (under "Java Heap Size of ResourceManager in Bytes") when the nodes are given 1 GB? I gave it 1 GB to resolve the problem, but I'm not sure whether that is enough. And what about the container sizes?

 

Thanks,

 

Kevin


18 REPLIES

Super Guru
This is only a warning; could you share all the logs?

Romain

Rising Star

Thanks for your help. The rest of the YARN ResourceManager output after the warning is as follows:

2014-08-14 16:24:35,564 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1408018429315_0002 from user: admin, in queue: default, currently num of applications: 2
2014-08-14 16:24:35,565 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1408018429315_0002 State change from SUBMITTED to ACCEPTED
2014-08-14 16:24:35,565 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1408018429315_0002_000001
2014-08-14 16:24:35,566 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1408018429315_0002_000001 State change from NEW to SUBMITTED
2014-08-14 16:24:35,568 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1408018429315_0002_000001 to scheduler from user: admin
2014-08-14 16:24:35,568 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1408018429315_0002_000001 State change from SUBMITTED to SCHEDULED

 

Then the log repeats the following text a thousand times and the job never completes:

 

2014-08-14 16:24:35,848 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,859 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,870 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,881 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks/task_1408018429315_0001_m_000000/attempts which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,898 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,916 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,928 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks/task_1408018429315_0001_m_000000 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,940 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks/task_1408018429315_0001_m_000000/attempts/attempt_1408018429315_0001_m_000000_0 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,950 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/jobattempts which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:36,847 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:36,859 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:36,870 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks which is the app master GUI of application_1408018429315_0001 owned by admin

 

What other logs can I look at? 

 

The Pig script is very simple:

 

offers = LOAD '/staging/file_20140727/20140727.txt' USING PigStorage AS (
fileid:CHARARRAY, offerPrice:CHARARRAY, upc:CHARARRAY, productName:CHARARRAY, productDescription:CHARARRAY);

stores = LOAD '/staging/file_20140727/20140727_stores.txt' USING PigStorage AS (storeNumber:CHARARRAY, address:CHARARRAY, city:CHARARRAY, state:CHARARRAY, zip:CHARARRAY, country:CHARARRAY);

stores2 = FILTER stores by (storeNumber == '1') OR (storeNumber == '100');

store_offers = CROSS stores2, offers;

dump store_offers;

Super Guru
Could you click on the status of the Pig job in the top right corner? It will open its Oozie workflow; then click the log icon on the right of the Pig action. You should have more interesting logs!

Romain

Rising Star

Thank you, Romain, that was helpful. I see something odd: in the log file the job is unassigned, and I see an entry pointing to a JobTracker on port 50030. However, I switched to YARN during the CDH upgrade from 4.7 to 5, and the MapReduce JobTracker and TaskTracker roles were removed during that upgrade.

 

For example, in the log I see this line: 2014-08-14 16:24:35,692 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - More information at: http://hadoopnode05:50030/jobdetails.jsp?jobid=job_1408018429315_0002

 

hadoopnode05 runs the ResourceManager, not a JobTracker, and the ResourceManager listens on port 8032.
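For reference, my understanding is that clients should be pointed at the ResourceManager's RPC address, which in yarn-site.xml terms would look something like this (the hostname here is just my node; 8032 is the default port):

<property>
  <name>yarn.resourcemanager.address</name>
  <value>hadoopnode05:8032</value>
</property>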

 

From the Oozie log I can see that the job is unassigned, even though it shows up as a MAPREDUCE application. Doesn't YARN take care of this?

 

User: someone
Name: PigLatin:script.pig
Application Type: MAPREDUCE
Application Tags: oozie-f8b1c706728ef633ff0dad6c4aed48a
State: ACCEPTED
FinalStatus: UNDEFINED
Started: 14-Aug-2014 16:24:35
Elapsed: 22hrs, 17mins, 59sec
Tracking URL: UNASSIGNED

 

Do you have any suggestions?

 

Kevin

Rising Star
I think I know the problem, but I don't know why it is happening. After the upgrade from MR to YARN, why is YARN still trying to assign jobs to the old MR JobTracker on 50030? Didn't YARN replace MR?
 
When I look into the YARN conf at http://yarnresourcemanager:8088/conf, I see the following entries:
 
<property>
  <name>mapreduce.jobtracker.http.address</name>
  <value>0.0.0.0:50030</value>
  <source>mapred-default.xml</source>
</property>
<property>
  <name>mapreduce.tasktracker.http.address</name>
  <value>0.0.0.0:50060</value>
  <source>mapred-default.xml</source>
</property>
 
Why are these here? I thought YARN replaced MR. This might explain why my Pig script is trying to use MR. How can I fix this?

Super Guru
Did you update the Oozie sharelib?

Also double check that Hue is configured properly:
http://gethue.com/using-hadoop-mr2-and-yarn-with-an-alternative-job/
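For what it's worth, the property that decides whether jobs are submitted to YARN or to the old JobTracker is mapreduce.framework.name; on an MR2 cluster the client configuration should contain something along these lines (a sketch, not your exact config):

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>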

Romain


Rising Star

I stopped Oozie and reinstalled the Oozie sharelib, and I verified that Hue is set up correctly (as described at the gethue link).

 

However, Pig jobs are still not completing, and they are still being sent to MapReduce at 50030. In the log I see that YARN is referenced, so why is the job being sent to MapReduce and not YARN?

 

2014-08-19 09:38:02,182 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - More information at: http://servername:50030/jobdetails.jsp?jobid=job_1408403413938_0006
2014-08-19 09:38:02,264 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
Heart beat

 

The job then repeats "Heart beat" over and over until it is killed.

Super Guru
Could you share the 'Configuration' tab of the workflow in the dashboard?

Also double check that the sharelib is the YARN one:
Step 3:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/c...
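If the YARN sharelib is in place, the job also has to be submitted with the system libpath enabled; in the workflow's submission configuration that usually appears as a property like the one below (Hue normally adds it for you, so this is just what to look for):

<property>
  <name>oozie.use.system.libpath</name>
  <value>true</value>
</property>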


Romain