Member since 06-15-2018
2 Posts
0 Kudos Received
0 Solutions
06-18-2018 10:23 AM
I'm not able to execute the job, @Aditya Sirna.
LOG:
2018-06-14 13:12:53,921 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:12:56,940 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:12:59,996 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:03,017 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:06,037 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:09,128 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:09,940 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(810)) - Start request for container_e29_1528274329273_0037_01_000001 by user elf
2018-06-14 13:13:09,942 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(850)) - Creating a new application reference for app application_1528274329273_0037
2018-06-14 13:13:09,942 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from NEW to INITING
2018-06-14 13:13:10,021 WARN logaggregation.LogAggregationService (LogAggregationService.java:verifyAndCreateRemoteLogDir(232)) - Remote Root Log Dir [/app-logs] already exist, but with incorrect permissions. Expected: [rwxrwxrwt], Found: [rwxrwxrwx]. The cluster may have problems with multiple users.
2018-06-14 13:13:10,130 INFO application.ApplicationImpl (ApplicationImpl.java:transition(304)) - Adding container_e29_1528274329273_0037_01_000001 to application application_1528274329273_0037
2018-06-14 13:13:10,130 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from INITING to RUNNING
2018-06-14 13:13:10,130 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e29_1528274329273_0037_01_000001 transitioned from NEW to LOCALIZING
2018-06-14 13:13:10,130 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event CONTAINER_INIT for appId application_1528274329273_0037
2018-06-14 13:13:10,130 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(184)) - Initializing container container_e29_1528274329273_0037_01_000001
2018-06-14 13:13:10,130 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(287)) - Initializing container container_e29_1528274329273_0037_01_000001
2018-06-14 13:13:10,131 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip transitioned from INIT to DOWNLOADING
2018-06-14 13:13:10,131 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/home/elf/.sparkStaging/application_1528274329273_0037/__spark_conf__.zip transitioned from INIT to DOWNLOADING
2018-06-14 13:13:10,131 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:handle(712)) - Created localizer for container_e29_1528274329273_0037_01_000001
2018-06-14 13:13:10,203 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:writeCredentials(1194)) - Writing credentials to the nmPrivate file /hadoop/yarn/local/nmPrivate/container_e29_1528274329273_0037_01_000001.tokens. Credentials list:
2018-06-14 13:13:10,534 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:createUserCacheDirs(646)) - Initializing user elf
2018-06-14 13:13:10,535 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:startLocalizer(126)) - Copying from /hadoop/yarn/local/nmPrivate/container_e29_1528274329273_0037_01_000001.tokens to /hadoop/yarn/local/usercache/elf/appcache/application_1528274329273_0037/container_e29_1528274329273_0037_01_000001.tokens
2018-06-14 13:13:10,536 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:startLocalizer(133)) - Localizer CWD set to /hadoop/yarn/local/usercache/elf/appcache/application_1528274329273_0037 = file:/hadoop/yarn/local/usercache/elf/appcache/application_1528274329273_0037
2018-06-14 13:13:10,721 WARN localizer.ResourceLocalizationService (ResourceLocalizationService.java:processHeartbeat(1017)) - { file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip, 1528962187000, ARCHIVE, null } failed: File file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip does not exist
java.io.FileNotFoundException: File file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:624)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:850)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:614)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:422)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-06-14 13:13:10,721 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip(->/hadoop/yarn/local/usercache/elf/filecache/0/14727/__spark_libs__1218228472113576092.zip) transitioned from DOWNLOADING to FAILED
2018-06-14 13:13:10,721 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e29_1528274329273_0037_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
2018-06-14 13:13:10,723 INFO localizer.LocalResourcesTrackerImpl (LocalResourcesTrackerImpl.java:handle(165)) - Container container_e29_1528274329273_0037_01_000001 sent RELEASE event on a resource request { file:/tmp/spark-a57d5eb4-528a-4d71-965b-f8afd494fbf1/__spark_libs__1218228472113576092.zip, 1528962187000, ARCHIVE, null } not present in cache.
2018-06-14 13:13:10,723 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:processHeartbeat(675)) - Unknown localizer with localizerId container_e29_1528274329273_0037_01_000001 is sending heartbeat. Ordering it to DIE
2018-06-14 13:13:10,723 WARN ipc.Client (Client.java:call(1462)) - interrupted waiting to send rpc request to server
java.lang.InterruptedException
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
    at java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1094)
    at org.apache.hadoop.ipc.Client.call(Client.java:1457)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy90.heartbeat(Unknown Source)
    at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:257)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:174)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:139)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1114)
2018-06-14 13:13:10,724 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e29_1528274329273_0037_01_000001 transitioned from LOCALIZATION_FAILED to DONE
2018-06-14 13:13:10,725 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(1134)) - Localizer failed
java.io.IOException: java.io.IOException: java.lang.InterruptedException
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:177)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:139)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1114)
Caused by: java.io.IOException: java.lang.InterruptedException
    at org.apache.hadoop.ipc.Client.call(Client.java:1463)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy90.heartbeat(Unknown Source)
    at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:257)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:174)
    ... 2 more
Caused by: java.lang.InterruptedException
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
    at java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1094)
    at org.apache.hadoop.ipc.Client.call(Client.java:1457)
    ... 8 more
2018-06-14 13:13:10,725 WARN event.AsyncDispatcher (AsyncDispatcher.java:handle(254)) - AsyncDispatcher thread interrupted
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
    at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
    at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:251)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1138)
2018-06-14 13:13:10,725 ERROR yarn.YarnUncaughtExceptionHandler (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread Thread[LocalizerRunner for container_e29_1528274329273_0037_01_000001,5,main] threw an Exception.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException
    at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:259)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1138)
Caused by: java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
    at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
    at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:251)
    ... 1 more
2018-06-14 13:13:10,726 INFO application.ApplicationImpl (ApplicationImpl.java:transition(347)) - Removing container_e29_1528274329273_0037_01_000001 from application application_1528274329273_0037
2018-06-14 13:13:10,727 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:startContainerLogAggregation(512)) - Considering container container_e29_1528274329273_0037_01_000001 for log-aggregation
2018-06-14 13:13:10,727 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event CONTAINER_STOP for appId application_1528274329273_0037
2018-06-14 13:13:10,727 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopContainer(190)) - Stopping container container_e29_1528274329273_0037_01_000001
2018-06-14 13:13:10,727 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopContainer(293)) - Stopping container container_e29_1528274329273_0037_01_000001
2018-06-14 13:13:11,744 INFO nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:removeOrTrackCompletedContainersFromContext(553)) - Removed completed containers from NM context: [container_e29_1528274329273_0037_01_000001]
2018-06-14 13:13:11,748 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
2018-06-14 13:13:11,748 INFO containermanager.AuxServices (AuxServices.java:handle(215)) - Got event APPLICATION_STOP for appId application_1528274329273_0037
2018-06-14 13:13:11,748 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopApplication(171)) - Stopping application application_1528274329273_0037
2018-06-14 13:13:11,748 INFO shuffle.ExternalShuffleBlockResolver (ExternalShuffleBlockResolver.java:applicationRemoved(206)) - Application application_1528274329273_0037 removed, cleanupLocalDirs = false
2018-06-14 13:13:11,749 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopApplication(266)) - Stopping application application_1528274329273_0037
2018-06-14 13:13:11,749 INFO shuffle.ExternalShuffleBlockResolver (ExternalShuffleBlockResolver.java:applicationRemoved(186)) - Application application_1528274329273_0037 removed, cleanupLocalDirs = false
2018-06-14 13:13:11,749 INFO application.ApplicationImpl (ApplicationImpl.java:handle(464)) - Application application_1528274329273_0037 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED
2018-06-14 13:13:11,749 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(520)) - Application just finished : application_1528274329273_0037
2018-06-14 13:13:11,750 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(494)) - Deleting absolute path : /hadoop/yarn/local/usercache/elf/appcache/application_1528274329273_0037
2018-06-14 13:13:11,906 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:doContainerLogAggregation(567)) - Uploading logs for container container_e29_1528274329273_0037_01_000001. Current good log dirs are /hadoop/yarn/log
2018-06-14 13:13:12,138 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(425)) - Stopping resource-monitoring for container_e29_1528274329273_0037_01_000001
2018-06-14 13:13:12,141 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(503)) - Deleting path : /hadoop/yarn/log/application_1528274329273_0037
2018-06-14 13:13:12,172 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:15,202 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:18,251 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:21,355 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:24,373 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:27,393 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:30,417 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:33,439 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:36,498 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:39,525 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
2018-06-14 13:13:42,561 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 92027 for container-id container_e29_1528274329273_0019_01_000001: 335.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used
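
For anyone reading a NodeManager log that was pasted as run-together text like this one, a small script can split it back into entries and surface the WARN and ERROR lines. This is a minimal Python sketch and not part of the original post: the names `split_entries`, `warn_or_error`, and `SAMPLE` are illustrative, and the regular expression assumes every entry begins with a `YYYY-MM-DD HH:MM:SS,mmm LEVEL` prefix, as the entries in this log do.

```python
import re

# Zero-width match in front of each "YYYY-MM-DD HH:MM:SS,mmm LEVEL" prefix.
# Splitting on an empty match requires Python 3.7 or newer.
ENTRY_START = re.compile(
    r"(?=\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d{3}\s+(?:INFO|WARN|ERROR)\b)"
)

# After whitespace normalization, date, time and level are separated by single spaces.
PROBLEM = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (?:WARN|ERROR)\b")


def split_entries(raw):
    """Collapse line wraps, then cut the text in front of every timestamp."""
    flat = " ".join(raw.split())
    return [part.strip() for part in ENTRY_START.split(flat) if part.strip()]


def warn_or_error(entries):
    """Keep only the entries logged at WARN or ERROR level."""
    return [entry for entry in entries if PROBLEM.match(entry)]


# Short excerpt taken from the log in this post, wrapped the way it was pasted.
SAMPLE = (
    "2018-06-14 13:13:10,130 INFO container.ContainerImpl "
    "(ContainerImpl.java:handle(1163)) - Container "
    "container_e29_1528274329273_0037_01_000001 transitioned from NEW to LOCALIZING 2018-06-14\n"
    "13:13:10,721 INFO container.ContainerImpl (ContainerImpl.java:handle(1163))\n"
    "- Container container_e29_1528274329273_0037_01_000001 transitioned from\n"
    "LOCALIZING to LOCALIZATION_FAILED 2018-06-14\n"
    "13:13:10,725 ERROR yarn.YarnUncaughtExceptionHandler\n"
    "(YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread\n"
    "Thread[LocalizerRunner for container_e29_1528274329273_0037_01_000001,5,main]\n"
    "threw an Exception."
)

for entry in warn_or_error(split_entries(SAMPLE)):
    print(entry)
```

Read this way, the first problem reported after the container enters LOCALIZING is the 13:13:10,721 WARN: the NodeManager looks up the `__spark_libs__` archive under `file:/tmp` through `RawLocalFileSystem` and reports that the file does not exist. The `InterruptedException` traces appear only after that localization failure.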
06-15-2018 12:46 PM
Log:
WARN logaggregation.LogAggregationService (LogAggregationService.java:verifyAndCreateRemoteLogDir(232)) - Remote Root Log Dir [/app-logs] already exist, but with incorrect permissions. Expected: [rwxrwxrwt], Found: [rwxrwxrwx]. The cluster may have problems with multiple users.
I found JIRA AMBARI-17633. Please let me know how I can resolve this issue.
HDP version: Hadoop 2.7.3.2.6.1.0-129
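
The Expected and Found strings in this warning differ only in the last character: `t` in place of `x` marks the sticky bit on the directory. A minimal Python sketch makes the difference explicit by converting both strings to octal. The helper name `mode_from_string` is hypothetical and is not a Hadoop API.

```python
# Special bits that can replace the execute flag in each rwx triplet:
# index 2 = setuid, index 5 = setgid, index 8 = sticky.
SPECIAL = {2: 0o4000, 5: 0o2000, 8: 0o1000}


def mode_from_string(perm):
    """Convert a 9-character permission string such as 'rwxrwxrwt' to a numeric mode."""
    if len(perm) != 9:
        raise ValueError("expected 9 characters, got %r" % perm)
    mode = 0
    for i, ch in enumerate(perm):
        bit = 1 << (8 - i)  # r/w/x bit for this position
        if ch == "-":
            continue
        if i in SPECIAL and ch.lower() == "sst"[i // 3]:
            mode |= SPECIAL[i]
            if ch.islower():  # lowercase s/t means the execute bit is set as well
                mode |= bit
        elif ch == "rwx"[i % 3]:
            mode |= bit
        else:
            raise ValueError("unexpected %r at position %d" % (ch, i))
    return mode


expected = mode_from_string("rwxrwxrwt")  # value from "Expected:" in the warning
found = mode_from_string("rwxrwxrwx")     # value from "Found:" in the warning
print(oct(expected), oct(found), oct(expected ^ found))
```

If this reading is right, the warning asks for mode 1777 on /app-logs instead of 777. One possible way to apply that, assuming an account with HDFS superuser rights, is `hdfs dfs -chmod 1777 /app-logs`. Whether this is the fix intended by AMBARI-17633 is not confirmed in this thread.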
Labels:
- Apache Hadoop
- Apache Spark
- Apache YARN