Created on 08-13-2014 01:32 PM - edited 09-16-2022 02:04 AM
When I launch a simple MapReduce Pig script from Hue that requires data from HDFS, I receive an error that there is no such user as admin.
I recently upgraded from CDH 4.7 to CDH 5.1.0. I am using CM 5.0 to manage the cluster. I am using HDFS, Hue 3.6.0, and YARN with MRv2. The script simply reads from a file and cross joins with another file. The script worked on CDH 4.7, but fails after the upgrade to CDH 5.1.
I found no helpful logs in Hue, but in the YARN ResourceManager log I found a very useful entry:
2014-08-13 13:24:37,322 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1405638744143_0028 State change from NEW_SAVING to SUBMITTED
2014-08-13 13:24:37,379 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user admin
org.apache.hadoop.util.Shell$ExitCodeException: id: admin: No such user
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
    at org.apache.hadoop.util.Shell.run(Shell.java:424)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:745)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:728)
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:84)
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
    at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
    at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1415)
    at org.apache.hadoop.security.authorize.AccessControlList.isUserAllowed(AccessControlList.java:222)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.hasAccess(AllocationConfiguration.java:225)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.hasAccess(FSQueue.java:150)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:622)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1201)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:122)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:585)
    at java.lang.Thread.run(Thread.java:745)
2014-08-13 13:24:37,381 WARN org.apache.hadoop.security.UserGroupInformation: No groups available for user admin
2014-08-13 13:24:37,381 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1405638744143_0028 from user: admin, in queue: default, currently num of applications: 4
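As far as I can tell from the stack trace, ShellBasedUnixGroupsMapping resolves groups by shelling out to the id command, so I assume the admin account would need to exist as a local OS user on the ResourceManager host. A rough check along those lines (the useradd line is only a sketch of one possible workaround, not something I have verified on this cluster):

# On the ResourceManager (and NodeManager) hosts:
id admin                # reproduces the "No such user" result from the warning above
sudo useradd -r admin   # possible workaround: create a matching local account
hdfs groups admin       # confirm Hadoop can now resolve groups for the user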
Created 08-20-2014 09:29 AM
Romain,
Thank you so much for your help, and for sticking with me through this problem. I have resolved the issue; there were actually two problems. First, after the upgrade to CDH 5, I had to stop Oozie and run Install Oozie ShareLib. Second, I had to adjust the resources in YARN. The Java heap size had been set to 50 MB even though 8 GB of memory is available on each node (I set the heap to 1 GB on the nodes and the ResourceManager). I don't know why the CDH upgrade would default to such a low number; it made YARN completely unusable and explains why jobs would hang forever, since there were not enough resources available. The logs did, however, indicate this problem.
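In case it helps anyone else, the ShareLib state can also be checked from the command line; the Oozie URL below is a placeholder for your own server, and the -shareliblist/-sharelibupdate options may depend on your Oozie version:

# Placeholder Oozie URL -- substitute your own Oozie server host and port.
oozie admin -oozie http://oozie-host:11000/oozie -shareliblist    # should list pig, hive, etc.
oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate  # refresh after re-installing the ShareLib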
I have one last question: how much memory do you give to the Java heap on the ResourceManager (under "Java Heap Size of ResourceManager in Bytes") when the nodes are given 1 GB? I gave it 1 GB to resolve the problem, but I'm not sure whether that is enough. And what about the container sizes?
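For context, these are the settings I believe are involved, with purely illustrative values for an ~8 GB node; please correct me if these are the wrong knobs:

# Illustrative starting points only -- not recommendations.
yarn.nodemanager.resource.memory-mb   = 6144   # memory each NodeManager offers to containers
yarn.scheduler.minimum-allocation-mb  = 1024   # smallest container the scheduler will grant
yarn.scheduler.maximum-allocation-mb  = 6144   # largest single container
mapreduce.map.memory.mb               = 1024   # per-map container request
mapreduce.reduce.memory.mb            = 1024   # per-reduce container request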
Thanks,
Kevin
Created 08-14-2014 04:37 PM
Thanks for your help. The rest of the YARN ResourceManager output after the warning is as follows:
2014-08-14 16:24:35,564 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1408018429315_0002 from user: admin, in queue: default, currently num of applications: 2
2014-08-14 16:24:35,565 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1408018429315_0002 State change from SUBMITTED to ACCEPTED
2014-08-14 16:24:35,565 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1408018429315_0002_000001
2014-08-14 16:24:35,566 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1408018429315_0002_000001 State change from NEW to SUBMITTED
2014-08-14 16:24:35,568 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1408018429315_0002_000001 to scheduler from user: admin
2014-08-14 16:24:35,568 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1408018429315_0002_000001 State change from SUBMITTED to SCHEDULED
Then the log repeats the following text a thousand times and the job never completes:
2014-08-14 16:24:35,848 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,859 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,870 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,881 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks/task_1408018429315_0001_m_000000/attempts which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,898 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,916 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,928 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks/task_1408018429315_0001_m_000000 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,940 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks/task_1408018429315_0001_m_000000/attempts/attempt_1408018429315_0001_m_000000_0 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:35,950 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/jobattempts which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:36,847 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001 which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:36,859 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks which is the app master GUI of application_1408018429315_0001 owned by admin
2014-08-14 16:24:36,870 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://servername09:59181/ws/v1/mapreduce/jobs/job_1408018429315_0001/tasks which is the app master GUI of application_1408018429315_0001 owned by admin
What other logs can I look at?
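For my own reference, a few places to look beyond Hue (paths are the Cloudera Manager defaults, so they may differ on other setups):

# Default CM log locations -- adjust if the cluster uses custom directories.
ls /var/log/hadoop-yarn/    # ResourceManager and NodeManager logs
ls /var/log/oozie/          # Oozie server log
yarn logs -applicationId application_1408018429315_0001   # aggregated container logs, if log aggregation is enabled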
The Pig script is very simple:
offers = LOAD '/staging/file_20140727/20140727.txt' USING PigStorage AS (
fileid:CHARARRAY, offerPrice:CHARARRAY, upc:CHARARRAY, productName:CHARARRAY, productDescription:CHARARRAY);
stores = LOAD '/staging/file_20140727/20140727_stores.txt' USING PigStorage AS (storeNumber:CHARARRAY, address:CHARARRAY, city:CHARARRAY, state:CHARARRAY, zip:CHARARRAY, country:CHARARRAY);
stores2 = FILTER stores by (storeNumber == '1') OR (storeNumber == '100');
store_offers = CROSS stores2, offers;
dump store_offers;
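As a side note, to take Hue and Oozie out of the picture I can also try running the same script straight from a gateway shell (the script name and paths just mirror the ones above; local mode would need the input files copied to the local filesystem first):

pig -x local script.pig   # local mode -- no YARN containers involved at all
pig script.pig            # mapreduce mode -- submits directly to YARN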
Created 08-15-2014 02:48 PM
Thank you, Romain, that was helpful. I see something odd: in the log file the job is unassigned, and there is an entry pointing to a JobTracker at port 50030. However, I moved to YARN during the upgrade from CDH 4.7 to 5, and the MapReduce JobTracker and TaskTracker roles were removed at that time.
For example, in the log I see this line:
2014-08-14 16:24:35,692 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://hadoopnode05:50030/jobdetails.jsp?jobid=job_1408018429315_0002
hadoopnode05 is running the ResourceManager, not a JobTracker, and the ResourceManager listens on port 8032.
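One thing I still need to double-check on the gateway host is whether the client configuration really points at YARN (the path below is the usual CDH client config location; I have not confirmed it on this cluster):

grep -A1 mapreduce.framework.name /etc/hadoop/conf/mapred-site.xml
# Expecting <value>yarn</value>. From what I have read, the 50030 URL that Pig prints can be
# purely cosmetic even when the job really runs on YARN, because Pig builds it from the old
# JobTracker HTTP address property.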
From the Oozie log, I see that the job's tracking URL is unassigned, even though it is submitted as a MAPREDUCE application. Doesn't YARN take care of this?
User: someone
Name: PigLatin:script.pig
Application Type: MAPREDUCE
Application Tags: oozie-f8b1c706728ef633ff0dad6c4aed48a
State: ACCEPTED
FinalStatus: UNDEFINED
Started: 14-Aug-2014 16:24:35
Elapsed: 22hrs, 17mins, 59sec
Tracking URL: UNASSIGNED
Do you have any suggestions?
Kevin
Created 08-19-2014 09:41 AM
I stopped Oozie and re-installed the Oozie ShareLib, and I validated that Hue is set up correctly (as described in the gethue link).
However, Pig jobs are still not completing, and they are still being sent to MapReduce at port 50030. In the log I can see that YARN is referenced elsewhere, so why is the job being sent to MapReduce and not to YARN? The log shows:
2014-08-19 09:38:02,182 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://servername:50030/jobdetails.jsp?jobid=job_1408403413938_0006
2014-08-19 09:38:02,264 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
Heart beat
The job then repeats "Heart beat" over and over until it is killed.
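While it sits there printing Heart beat, a few standard client commands should show whether YARN actually has capacity to hand out (nothing below is cluster-specific):

yarn node -list         # do the NodeManagers advertise any usable memory/vcores?
yarn application -list  # is anything besides the Oozie launcher holding a container?
mapred job -list        # another view of the running MR2 jobs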