Issue while running spark-submit on YARN

New Contributor

Log:

WARN logaggregation.LogAggregationService (LogAggregationService.java:verifyAndCreateRemoteLogDir(232)) - Remote Root Log Dir [/app-logs] already exist, but with incorrect permissions. Expected: [rwxrwxrwt], Found: [rwxrwxrwx]. The cluster may have problems with multiple users.

I found JIRA AMBARI-17633.

Please let me know how I can resolve the issue.

HDP version - Hadoop 2.7.3.2.6.1.0-129
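
To make the difference between the two permission strings in the warning concrete, here is a minimal Python sketch that converts them to octal modes. The helper name `mode_from_string` is hypothetical and only handles the plain `rwx` characters plus the sticky-bit marker in the last position; it is an illustration, not part of Hadoop.

```python
import stat

def mode_from_string(perm: str) -> int:
    """Convert a 9-character string such as 'rwxrwxrwt' to an integer mode.

    Hypothetical helper: supports 'r', 'w', 'x', '-' and a trailing 't'/'T'.
    """
    if len(perm) != 9:
        raise ValueError("expected a 9-character permission string")
    mode = 0
    for i, ch in enumerate(perm):
        bit = 1 << (8 - i)          # bit for this position, user-read first
        expected = "rwx"[i % 3]     # character that sets this bit
        if ch == expected:
            mode |= bit
        elif i == 8 and ch in "tT":
            mode |= stat.S_ISVTX    # sticky bit
            if ch == "t":           # lowercase 't' also implies execute
                mode |= bit
        elif ch != "-":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return mode

expected = mode_from_string("rwxrwxrwt")   # value the NodeManager expects
found = mode_from_string("rwxrwxrwx")      # value reported as found

print(oct(expected))          # 0o1777
print(oct(found))             # 0o777
print(oct(expected ^ found))  # 0o1000, i.e. only the sticky bit differs
```

So the two modes differ only in the sticky bit. If an administrator decides to align the directory with the expected mode, `hdfs dfs -chmod 1777 /app-logs` is the usual way to set it; whether that is appropriate for this cluster is an assumption to verify first.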

2 REPLIES

Super Guru

@Sayantan Dash,

This is only a warning message and should not be the cause of the problem. Can you check whether there are any other errors in the logs?
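
One way to look for other errors is to filter the NodeManager log down to WARN and ERROR entries together with their stack traces. The following is a rough Python sketch, assuming the `timestamp LEVEL message` layout seen in this thread; the function name `problem_entries` and the `sample` list are hypothetical, and the sample lines are copied from the log posted here.

```python
import re

# Matches the start of a log entry: "2018-06-14 13:13:10,021 WARN ..."
LINE_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (\w+) ")

def problem_entries(lines, levels=("WARN", "ERROR", "FATAL")):
    """Return entries at the given levels, each joined with its continuation lines."""
    entries = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue                      # skip blank lines
        match = LINE_RE.match(line)
        if match:
            current = None                # a new entry ends the previous one
            if match.group(1) in levels:
                current = [line]
                entries.append(current)
        elif current is not None:
            current.append(line)          # stack trace or other continuation
    return ["\n".join(entry) for entry in entries]

sample = [
    "2018-06-14 13:13:09,942 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from NEW to INITING",
    "",
    "2018-06-14 13:13:10,021 WARN logaggregation.LogAggregationService (LogAggregationService.java:verifyAndCreateRemoteLogDir(232)) - Remote Root Log Dir [/app-logs] already exist, but with incorrect permissions. Expected: [rwxrwxrwt], Found: [rwxrwxrwx]. The cluster may have problems with multiple users.",
    "",
    "2018-06-14 13:13:10,721 WARN localizer.ResourceLocalizationService (ResourceLocalizationService.java:processHeartbeat(1017)) - { file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip, 1528962187000, ARCHIVE, null } failed: File file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip does not exist",
    "",
    "java.io.FileNotFoundException: File file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip does not exist",
    "",
    "at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:624)",
    "",
    "2018-06-14 13:13:10,721 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e29_1528274329273_0037_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED",
]

for entry in problem_entries(sample):
    print(entry)
    print("-" * 40)
```

To run it against a real file, pass the open file object instead of `sample`; the log path depends on the installation and is not shown in this thread.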

New Contributor

@Aditya sirna, I am not able to execute the job.


LOG:

2018-06-14 13:12:53,921 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:12:56,940 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:12:59,996 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:03,017 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:06,037 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:09,128 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:09,940 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(810)) - Start request for container_e29_1528274329273_0037_01_000001 by user elf

2018-06-14 13:13:09,942 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(850)) - Creating a new application reference for app application_1528274329273_0037

2018-06-14 13:13:09,942 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from NEW to INITING

2018-06-14 13:13:10,021 WARN logaggregation.LogAggregationService (LogAggregationService.java:verifyAndCreateRemoteLogDir(232)) - Remote Root Log Dir [/app-logs] already exist, but with incorrect permissions. Expected: [rwxrwxrwt], Found: [rwxrwxrwx]. The cluster may have problems with multiple users.

2018-06-14 13:13:10,130 INFO application.ApplicationImpl (ApplicationImpl.java:transition(304)) - Adding container_e29_1528274329273_0037_01_000001 to application application_1528274329273_0037

2018-06-14 13:13:10,130 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from INITING to RUNNING

2018-06-14 13:13:10,130 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e29_1528274329273_0037_01_000001 transitioned from NEW to LOCALIZING

2018-06-14 13:13:10,130 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event CONTAINER_INIT for appId application_1528274329273_0037

2018-06-14 13:13:10,130 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(184)) - Initializing container container_e29_1528274329273_0037_01_000001

2018-06-14 13:13:10,130 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(287)) - Initializing container container_e29_1528274329273_0037_01_000001

2018-06-14 13:13:10,131 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip transitioned from INIT to DOWNLOADING

2018-06-14 13:13:10,131 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/home/elf/.sparkStaging/application_1528274329273_0037/__spark_conf__.zip transitioned from INIT to DOWNLOADING

2018-06-14 13:13:10,131 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:handle(712)) - Created localizer for container_e29_1528274329273_0037_01_000001

2018-06-14 13:13:10,203 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:writeCredentials(1194)) - Writing credentials to the nmPrivate file /hadoop/yarn/local/nmPrivate/container_e29_1528274329273_0037_01_000001.tokens. Credentials list:

2018-06-14 13:13:10,534 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:createUserCacheDirs(646)) - Initializing user elf

2018-06-14 13:13:10,535 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:startLocalizer(126)) - Copying from /hadoop/yarn/local/nmPrivate/container_e29_1528274329273_0037_01_000001.tokens to /hadoop/yarn/local/usercache/elf/appcache/application_1528274329273_0037/container_e29_1528274329273_0037_01_000001.tokens

2018-06-14 13:13:10,536 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:startLocalizer(133)) - Localizer CWD set to /hadoop/yarn/local/usercache/elf/appcache/application_1528274329273_0037 = file:/hadoop/yarn/local/usercache/elf/appcache/application_1528274329273_0037

2018-06-14 13:13:10,721 WARN localizer.ResourceLocalizationService (ResourceLocalizationService.java:processHeartbeat(1017)) - { file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip, 1528962187000, ARCHIVE, null } failed: File file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip does not exist

java.io.FileNotFoundException: File file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip does not exist

at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:624)

at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:850)

at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:614)

at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:422)

at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)

at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)

at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)

at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:422)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)

at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)

at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

2018-06-14 13:13:10,721 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip(->/hadoop/yarn/local/usercache/elf/filecache/0/14727/__spark_libs__1218228472113576092.zip) transitioned from DOWNLOADING to FAILED

2018-06-14 13:13:10,721 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e29_1528274329273_0037_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED

2018-06-14 13:13:10,723 INFO localizer.LocalResourcesTrackerImpl (LocalResourcesTrackerImpl.java:handle(165)) - Container container_e29_1528274329273_0037_01_000001 sent RELEASE event on a resource request { file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip, 1528962187000, ARCHIVE, null } not present in cache.

2018-06-14 13:13:10,723 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:processHeartbeat(675)) - Unknown localizer with localizerId container_e29_1528274329273_0037_01_000001 is sending heartbeat. Ordering it to DIE

2018-06-14 13:13:10,723 WARN ipc.Client (Client.java:call(1462)) - interrupted waiting to send rpc request to server

java.lang.InterruptedException

at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)

at java.util.concurrent.FutureTask.get(FutureTask.java:191)

at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1094)

at org.apache.hadoop.ipc.Client.call(Client.java:1457)

at org.apache.hadoop.ipc.Client.call(Client.java:1398)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)

at com.sun.proxy.$Proxy90.heartbeat(Unknown Source)

at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:257)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:174)

at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:139)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1114)

2018-06-14 13:13:10,724 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e29_1528274329273_0037_01_000001 transitioned from LOCALIZATION_FAILED to DONE

2018-06-14 13:13:10,725 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(1134)) - Localizer failed

java.io.IOException: java.io.IOException: java.lang.InterruptedException

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:177)

at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:139)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1114)

Caused by: java.io.IOException: java.lang.InterruptedException

at org.apache.hadoop.ipc.Client.call(Client.java:1463)

at org.apache.hadoop.ipc.Client.call(Client.java:1398)

at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)

at com.sun.proxy.$Proxy90.heartbeat(Unknown Source)

at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:257)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:174)

... 2 more

Caused by: java.lang.InterruptedException

at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)

at java.util.concurrent.FutureTask.get(FutureTask.java:191)

at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1094)

at org.apache.hadoop.ipc.Client.call(Client.java:1457)

... 8 more

2018-06-14 13:13:10,725 WARN event.AsyncDispatcher (AsyncDispatcher.java:handle(254)) - AsyncDispatcher thread interrupted

java.lang.InterruptedException

at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)

at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)

at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)

at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:251)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1138)

2018-06-14 13:13:10,725 ERROR yarn.YarnUncaughtExceptionHandler (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread Thread[LocalizerRunner for container_e29_1528274329273_0037_01_000001,5,main] threw an Exception.

org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException

at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:259)

at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1138)

Caused by: java.lang.InterruptedException

at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)

at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)

at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)

at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:251)

... 1 more

2018-06-14 13:13:10,726 INFO application.ApplicationImpl (ApplicationImpl.java:transition(347)) - Removing container_e29_1528274329273_0037_01_000001 from application application_1528274329273_0037

2018-06-14 13:13:10,727 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:startContainerLogAggregation(512)) - Considering container container_e29_1528274329273_0037_01_000001 for log-aggregation

2018-06-14 13:13:10,727 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event CONTAINER_STOP for appId application_1528274329273_0037

2018-06-14 13:13:10,727 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopContainer(190)) - Stopping container container_e29_1528274329273_0037_01_000001

2018-06-14 13:13:10,727 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopContainer(293)) - Stopping container container_e29_1528274329273_0037_01_000001

2018-06-14 13:13:11,744 INFO nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:removeOrTrackCompletedContainersFromContext(553)) - Removed completed containers from NM context: [container_e29_1528274329273_0037_01_000001]

2018-06-14 13:13:11,748 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP

2018-06-14 13:13:11,748 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event APPLICATION_STOP for appId application_1528274329273_0037

2018-06-14 13:13:11,748 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopApplication(171)) - Stopping application application_1528274329273_0037

2018-06-14 13:13:11,748 INFO shuffle.ExternalShuffleBlockResolver (ExternalShuffleBlockResolver.java:applicationRemoved(206)) - Application application_1528274329273_0037 removed, cleanupLocalDirs = false

2018-06-14 13:13:11,749 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopApplication(266)) - Stopping application application_1528274329273_0037

2018-06-14 13:13:11,749 INFO shuffle.ExternalShuffleBlockResolver (ExternalShuffleBlockResolver.java:applicationRemoved(186)) - Application application_1528274329273_0037 removed, cleanupLocalDirs = false

2018-06-14 13:13:11,749 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED

2018-06-14 13:13:11,749 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(520)) - Application just finished : application_1528274329273_0037

2018-06-14 13:13:11,750 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(494)) - Deleting absolute path : /hadoop/yarn/local/usercache/elf/appcache/application_1528274329273_0037

2018-06-14 13:13:11,906 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:doContainerLogAggregation(567)) - Uploading logs for container container_e29_1528274329273_0037_01_000001. Current good log dirs are /hadoop/yarn/log

2018-06-14 13:13:12,138 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(425)) - Stopping resource-monitoring for container_e29_1528274329273_0037_01_000001

2018-06-14 13:13:12,141 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(503)) - Deleting path : /hadoop/yarn/log/application_1528274329273_0037

2018-06-14 13:13:12,172 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:15,202 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:18,251 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:21,355 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:24,373 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:27,393 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:30,417 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:33,439 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:36,498 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:39,525 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used

2018-06-14 13:13:42,561 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
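
The first real failure in the log above is the `FileNotFoundException` for a `file:/tmp/...` resource. As a rough diagnostic sketch, the Python below lists localized resources whose URI uses the local `file:` scheme. The function name `local_scheme_resources` and the `sample` list are hypothetical; the sample lines are copied from the log above.

```python
import re
from urllib.parse import urlparse

# Captures the URI in "Resource <uri> transitioned from ..." lines,
# ignoring an optional "(->/local/path)" suffix attached to the URI.
RESOURCE_RE = re.compile(r"Resource (\S+?)(?:\(->\S+\))? transitioned from")

def local_scheme_resources(lines):
    """Return the distinct resource URIs that use the 'file' scheme."""
    found = []
    for line in lines:
        match = RESOURCE_RE.search(line)
        if match:
            uri = match.group(1)
            if urlparse(uri).scheme == "file" and uri not in found:
                found.append(uri)
    return found

sample = [
    "2018-06-14 13:13:10,131 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip transitioned from INIT to DOWNLOADING",
    "2018-06-14 13:13:10,131 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/home/elf/.sparkStaging/application_1528274329273_0037/__spark_conf__.zip transitioned from INIT to DOWNLOADING",
    "2018-06-14 13:13:10,721 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip(->/hadoop/yarn/local/usercache/elf/filecache/0/14727/__spark_libs__1218228472113576092.zip) transitioned from DOWNLOADING to FAILED",
]

for uri in local_scheme_resources(sample):
    print(uri)
```

One possible reading, which is an assumption and not confirmed in this thread: a `file:` URI is resolved on the local disk of the NodeManager host, so a file that exists only on the machine where spark-submit ran cannot be found there. This can happen when the submitting client does not pick up the cluster's Hadoop configuration (for example through `HADOOP_CONF_DIR`) and therefore stages resources on the local filesystem instead of HDFS.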