Created 01-22-2016 04:39 PM
I am trying to run a benchmark job with the following command:
yarn jar /path/to/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 1000 -resFile /tmp/TESTDFSio.txt
but the job fails with the following error messages:
16/01/22 15:08:47 INFO mapreduce.Job: Task Id : attempt_1453395961197_0017_m_000008_2, Status : FAILED Application application_1453395961197_0017 initialization failed (exitCode=255) with output: main : command provided 0
main : user is foo
main : requested yarn user is foo
Path /mnt/sdb1/yarn/local/usercache/foo/appcache/application_1453395961197_0017 has permission 700 but needs permission 750.
Path /var/hadoop/yarn/local/usercache/foo/appcache/application_1453395961197_0017 has permission 700 but needs permission 750. Did not create any app directories
Even when I change these directories' permissions to 750, I get errors. Also, these caches don't get cleaned up after a job finishes and create collisions when running the next job. Any insights?
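For reference, here is the quick check I run on each node (a minimal sketch; the paths and the user foo are taken from the errors above):
# Inspect ownership and permissions of the NodeManager local dirs
# and the per-user usercache (paths taken from the errors above).
for d in /mnt/sdb1/yarn/local /var/hadoop/yarn/local; do
  ls -ld "$d" "$d/usercache" "$d/usercache/foo" 2>/dev/null
done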
Created 01-22-2016 04:42 PM
@Anilkumar Panda can you run service checks for MapReduce2, YARN, and HDFS? Restarting the YARN service should reset the permissions as necessary unless there are other issues; in that case, we need to check the umask and the mount options on your disks.
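For example, on each node (a rough sketch; the device and mount point are assumed from the paths in your errors):
# Show the current umask and the mount options for the disk backing
# the YARN local dir (device/mount point assumed from the error paths).
umask
mount | grep sdb1
grep sdb1 /etc/fstab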
Created 01-25-2016 09:40 AM
@Artem Ervits The service checks run fine. Also, we have restarted the services many times, but the issue still persists. The umask value on all nodes is set to 0022.
What mount options should we check?
Created 01-25-2016 12:36 PM
@Anilkumar Panda please paste the directory screenshots and /etc/fstab
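For example (a sketch; the paths are the ones from your errors):
# Directory listings and fstab entries to paste back.
ls -ld /mnt/sdb1/yarn/local/usercache/foo /var/hadoop/yarn/local/usercache/foo
cat /etc/fstab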
Created 01-25-2016 12:59 PM
Try this and see if it helps.
chmod -R 750 /mnt/sdb1/yarn
chmod -R 750 /var/hadoop/yarn
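If permissions alone don't stick, also check the ownership of the top-level local dirs; with the LinuxContainerExecutor they should belong to the yarn user. A sketch (yarn:hadoop is the usual HDP default; verify with ls -ld first):
# Ensure the top-level local dirs are owned by yarn
# (do NOT chown -R: usercache/<user> dirs must stay owned by that user).
chown yarn:hadoop /mnt/sdb1/yarn/local /var/hadoop/yarn/local
ls -ld /mnt/sdb1/yarn/local /var/hadoop/yarn/local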
Created 01-25-2016 01:25 PM
I have tried that. The issue is that when a new folder is created, the permissions don't apply, and the job starts failing again.
Some cleanup is not happening correctly, but I am unable to locate the issue 😞
Created 01-25-2016 01:31 PM
Let's try this:
Check yarn.nodemanager.local-dirs.
For user foo, delete everything under usercache for that user on all data nodes (see the sketch after the listing below).
[root@phdns02 conf]# ls -l /hadoop/yarn/local/usercache/foo/
total 8
drwxr-xr-x. 2 yarn hadoop 4096 Jan 23 15:06 appcache
drwxr-xr-x. 2 yarn hadoop 4096 Jan 23 14:01 filecache
[root@phdns02 conf]#
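For example, on each NodeManager host (a sketch; stop the NodeManager first so the dirs are not recreated mid-cleanup, and take the directory list from your actual yarn.nodemanager.local-dirs value):
# Clear the stale usercache for user foo in every configured local dir
# (paths assumed from the errors earlier in the thread).
for d in /mnt/sdb1/yarn/local /var/hadoop/yarn/local; do
  rm -rf "$d/usercache/foo"
done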
Created 01-25-2016 03:47 PM
Deleting the directory makes the job work once, but afterwards it fails again.
Created 01-25-2016 04:22 PM
@Anilkumar Panda Sounds like a bug... Please open a support ticket.
Created 01-25-2016 04:23 PM
@Anilkumar Panda See this, it may ring a bell:
http://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/
Created 02-03-2016 12:32 PM
@Neeraj Sabharwal I tried this option, but no success there yet.
Created 02-03-2016 01:34 AM
@Anilkumar Panda are you still having issues with this? Can you accept best answer or provide your workaround?
Created 02-03-2016 12:32 PM
@Artem Ervits @Neeraj Sabharwal I have noticed a few conflicting settings in yarn-site.xml: yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor, and we don't have the same Linux users across the cluster. Hence we are waiting for the users to be created. Once that is done, I will test and post the result.
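For reference, a quick way I plan to verify once the users are created (a sketch; the host names are placeholders for our cluster nodes):
# Check that user foo exists on every NodeManager host.
for h in node1 node2 node3; do
  ssh "$h" 'id foo' || echo "user foo missing on $h"
done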