Member since: 05-19-2016
Posts: 216
Kudos Received: 20
Solutions: 4
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 4195 | 05-29-2018 11:56 PM |
| 7032 | 07-06-2017 02:50 AM |
| 3770 | 10-09-2016 12:51 AM |
| 3541 | 05-13-2016 04:17 AM |
05-24-2018
06:07 PM
@Harsh J: Could you please respond? This is a production cluster, and it disrupts our workflows whenever we run into this error.
05-24-2018
07:27 AM
I have. It gives me no information apart from the error code:
Log Upload Time: Thu May 24 18:56:02 +0530 2018
Log Length: 34607
2018-05-24 18:54:27,921 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1527163745858_0517_000001
2018-05-24 18:54:29,584 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2018-05-24 18:54:29,584 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@39d12b10)
2018-05-24 18:54:30,453 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: RM_DELEGATION_TOKEN, Service: 172.31.4.192:8032, Ident: (RM_DELEGATION_TOKEN owner=hue, renewer=oozie mr token, realUser=oozie, issueDate=1527168258855, maxDate=1527773058855, sequenceNumber=613, masterKeyId=2)
2018-05-24 18:54:33,015 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-05-24 18:54:33,042 WARN [main] org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2018-05-24 18:54:33,532 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config org.apache.oozie.action.hadoop.OozieLauncherOutputCommitter
2018-05-24 18:54:33,534 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.oozie.action.hadoop.OozieLauncherOutputCommitter
2018-05-24 18:54:33,649 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2018-05-24 18:54:33,681 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2018-05-24 18:54:33,683 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2018-05-24 18:54:33,684 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2018-05-24 18:54:33,684 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2018-05-24 18:54:33,685 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2018-05-24 18:54:33,686 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2018-05-24 18:54:33,687 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2018-05-24 18:54:33,821 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-24 18:54:33,937 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-24 18:54:34,022 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-24 18:54:34,076 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled
2018-05-24 18:54:34,225 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2018-05-24 18:54:35,260 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2018-05-24 18:54:35,465 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2018-05-24 18:54:35,465 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2018-05-24 18:54:35,495 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1527163745858_0517 to jobTokenSecretManager
2018-05-24 18:54:35,843 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1527163745858_0517 because: not enabled;
2018-05-24 18:54:35,941 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1527163745858_0517 = 0. Number of splits = 1
2018-05-24 18:54:35,941 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1527163745858_0517 = 0
2018-05-24 18:54:35,941 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from NEW to INITED
2018-05-24 18:54:35,943 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1527163745858_0517.
2018-05-24 18:54:36,091 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100
2018-05-24 18:54:36,176 INFO [Socket Reader #1 for port 43856] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 43856
2018-05-24 18:54:36,266 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2018-05-24 18:54:36,284 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-05-24 18:54:36,326 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at ip-172-31-5-201.ap-south-1.compute.internal/172.31.5.201:43856
2018-05-24 18:54:36,329 INFO [IPC Server listener on 43856] org.apache.hadoop.ipc.Server: IPC Server listener on 43856: starting
2018-05-24 18:54:36,585 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2018-05-24 18:54:36,655 INFO [main] org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2018-05-24 18:54:36,679 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
2018-05-24 18:54:36,691 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2018-05-24 18:54:36,697 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
2018-05-24 18:54:36,697 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
2018-05-24 18:54:36,701 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
2018-05-24 18:54:36,701 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2018-05-24 18:54:36,761 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 36747
2018-05-24 18:54:36,761 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
2018-05-24 18:54:36,898 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/jars/hadoop-yarn-common-2.6.0-cdh5.10.1.jar!/webapps/mapreduce to ./tmp/Jetty_0_0_0_0_36747_mapreduce____.wrg2hq/webapp
2018-05-24 18:54:38,183 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:36747
2018-05-24 18:54:38,184 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 36747
2018-05-24 18:54:39,423 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2018-05-24 18:54:39,516 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 3000
2018-05-24 18:54:39,523 INFO [Socket Reader #1 for port 37627] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 37627
2018-05-24 18:54:39,578 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-05-24 18:54:39,588 INFO [IPC Server listener on 37627] org.apache.hadoop.ipc.Server: IPC Server listener on 37627: starting
2018-05-24 18:54:40,346 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2018-05-24 18:54:40,346 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2018-05-24 18:54:40,346 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2018-05-24 18:54:40,565 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at ip-172-31-4-192.ap-south-1.compute.internal/172.31.4.192:8030
2018-05-24 18:54:40,873 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: <memory:25600, vCores:8>
2018-05-24 18:54:40,873 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.users.hue
2018-05-24 18:54:40,886 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2018-05-24 18:54:40,887 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: The thread pool initial size is 10
2018-05-24 18:54:40,959 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from INITED to SETUP
2018-05-24 18:54:40,991 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2018-05-24 18:54:41,012 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from SETUP to RUNNING
2018-05-24 18:54:41,123 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1527163745858_0517_m_000000 Task Transitioned from NEW to SCHEDULED
2018-05-24 18:54:41,135 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-24 18:54:41,294 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1>
2018-05-24 18:54:41,439 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1527163745858_0517, File: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/.staging/job_1527163745858_0517/job_1527163745858_0517_1.jhist
2018-05-24 18:54:41,894 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:54:42,056 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1527163745858_0517: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:180224, vCores:35> knownNMs=6
2018-05-24 18:54:42,324 WARN [DataStreamer for file /user/hue/.staging/job_1527163745858_0517/job_1527163745858_0517_1_conf.xml] org.apache.hadoop.hdfs.DFSClient: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1281)
at java.lang.Thread.join(Thread.java:1355)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:951)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:689)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:878)
2018-05-24 18:54:42,382 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-24 18:54:43,088 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2018-05-24 18:54:43,205 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1527163745858_0517_01_000002 to attempt_1527163745858_0517_m_000000_0
2018-05-24 18:54:43,206 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:54:43,297 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Job jar is not present. Not adding any jar to the list of resources.
2018-05-24 18:54:43,352 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /user/hue/.staging/job_1527163745858_0517/job.xml
2018-05-24 18:54:43,654 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache archive (mapreduce.job.cache.archives) hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/mysql-connector-java-5.0.8-bin.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/oozie/share/lib/lib_20170413135352/sqoop/mysql-connector-java-5.0.8-bin.jar This will be an error in Hadoop 2.0
2018-05-24 18:54:44,041 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #1 tokens and #1 secret keys for NM use for launching container
2018-05-24 18:54:44,041 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertokens_dob is 2
2018-05-24 18:54:44,041 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle token in serviceData
2018-05-24 18:54:45,097 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2018-05-24 18:54:45,101 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2018-05-24 18:54:45,109 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1527163745858_0517: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:179200, vCores:34> knownNMs=6
2018-05-24 18:54:45,146 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1527163745858_0517_01_000002 taskAttempt attempt_1527163745858_0517_m_000000_0
2018-05-24 18:54:45,149 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1527163745858_0517_m_000000_0
2018-05-24 18:54:45,350 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1527163745858_0517_m_000000_0 : 13562
2018-05-24 18:54:45,352 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1527163745858_0517_m_000000_0] using containerId: [container_1527163745858_0517_01_000002 on NM: [ip-172-31-1-207.ap-south-1.compute.internal:8041]
2018-05-24 18:54:45,355 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2018-05-24 18:54:45,356 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1527163745858_0517_m_000000 Task Transitioned from SCHEDULED to RUNNING
2018-05-24 18:54:52,626 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:54:52,800 INFO [IPC Server handler 5 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1527163745858_0517_m_000002 asked for a task
2018-05-24 18:54:52,818 INFO [IPC Server handler 5 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1527163745858_0517_m_000002 given task: attempt_1527163745858_0517_m_000000_0
2018-05-24 18:55:08,465 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:55:20,711 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:55:20,793 INFO [IPC Server handler 3 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1527163745858_0517_m_000000_0 is : 1.0
2018-05-24 18:55:30,659 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1527163745858_0517_01_000002
2018-05-24 18:55:30,660 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:30,660 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1527163745858_0517_m_000000_0: Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
2018-05-24 18:55:30,662 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_0 TaskAttempt Transitioned from RUNNING to FAILED
2018-05-24 18:55:30,711 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_COMPLETED for container container_1527163745858_0517_01_000002 taskAttempt attempt_1527163745858_0517_m_000000_0
2018-05-24 18:55:30,719 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-24 18:55:30,725 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node ip-172-31-1-207.ap-south-1.compute.internal
2018-05-24 18:55:30,726 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added attempt_1527163745858_0517_m_000000_1 to list of failed maps
2018-05-24 18:55:31,660 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:31,688 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1527163745858_0517: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:71168, vCores:40> knownNMs=6
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigning container Container: [ContainerId: container_1527163745858_0517_01_000003, NodeId: ip-172-31-1-207.ap-south-1.compute.internal:8041, NodeHttpAddress: ip-172-31-1-207.ap-south-1.compute.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 5, Token: Token { kind: ContainerToken, service: 172.31.1.207:8041 }, ] to fast fail map
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned from earlierFailedMaps
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1527163745858_0517_01_000003 to attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:32,710 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2018-05-24 18:55:32,710 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2018-05-24 18:55:32,725 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1527163745858_0517_01_000003 taskAttempt attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:32,725 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:32,991 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1527163745858_0517_m_000000_1 : 13562
2018-05-24 18:55:32,993 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1527163745858_0517_m_000000_1] using containerId: [container_1527163745858_0517_01_000003 on NM: [ip-172-31-1-207.ap-south-1.compute.internal:8041]
2018-05-24 18:55:32,993 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from ASSIGNED to RUNNING
2018-05-24 18:55:33,715 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1527163745858_0517: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:74240, vCores:41> knownNMs=6
2018-05-24 18:55:39,936 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:55:40,020 INFO [IPC Server handler 9 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1527163745858_0517_m_000003 asked for a task
2018-05-24 18:55:40,021 INFO [IPC Server handler 9 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1527163745858_0517_m_000003 given task: attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:51,261 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:55:51,314 INFO [IPC Server handler 22 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1527163745858_0517_m_000000_1 is : 0.0
2018-05-24 18:55:51,472 INFO [IPC Server handler 25 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1527163745858_0517_m_000000_1 is : 1.0
2018-05-24 18:55:51,562 INFO [IPC Server handler 23 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:51,595 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from RUNNING to SUCCESS_FINISHING_CONTAINER
2018-05-24 18:55:51,595 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:51,603 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1527163745858_0517_m_000000 Task Transitioned from RUNNING to SUCCEEDED
2018-05-24 18:55:51,604 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2018-05-24 18:55:51,604 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from RUNNING to COMMITTING
2018-05-24 18:55:51,621 INFO [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_COMMIT
2018-05-24 18:55:51,859 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Calling handler for JobFinishedEvent
2018-05-24 18:55:51,860 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from COMMITTING to SUCCEEDED
2018-05-24 18:55:51,883 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: We are finishing cleanly so this is the last retry
2018-05-24 18:55:51,883 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify RMCommunicator isAMLastRetry: true
2018-05-24 18:55:51,883 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator notified that shouldUnregistered is: true
2018-05-24 18:55:51,883 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify JHEH isAMLastRetry: true
2018-05-24 18:55:51,884 INFO [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: JobHistoryEventHandler notified that forceJobCompletion is true
2018-05-24 18:55:51,884 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Calling stop for all the services
2018-05-24 18:55:51,892 INFO [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping JobHistoryEventHandler. Size of the outstanding queue size is 0
2018-05-24 18:55:51,968 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:52,261 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/.staging/job_1527163745858_0517/job_1527163745858_0517_1.jhist to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517-1527168259391-hue-oozie%3Alauncher%3AT%3Dsqoop%3AW%3DERP_new%2Dcopy%3AA%3Dsqoop%2Dd90b-1527168351857-1-0-SUCCEEDED-root.users.hue-1527168280941.jhist_tmp
2018-05-24 18:55:52,647 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517-1527168259391-hue-oozie%3Alauncher%3AT%3Dsqoop%3AW%3DERP_new%2Dcopy%3AA%3Dsqoop%2Dd90b-1527168351857-1-0-SUCCEEDED-root.users.hue-1527168280941.jhist_tmp
2018-05-24 18:55:52,677 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/.staging/job_1527163745858_0517/job_1527163745858_0517_1_conf.xml to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517_conf.xml_tmp
2018-05-24 18:55:52,980 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1527163745858_0517_01_000003
2018-05-24 18:55:52,981 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:52,981 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1527163745858_0517_m_000000_1:
2018-05-24 18:55:52,982 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from SUCCESS_FINISHING_CONTAINER to SUCCEEDED
2018-05-24 18:55:52,998 INFO [ContainerLauncher #3] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_COMPLETED for container container_1527163745858_0517_01_000003 taskAttempt attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:53,034 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517_conf.xml_tmp
2018-05-24 18:55:53,060 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517.summary_tmp to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517.summary
2018-05-24 18:55:53,068 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517_conf.xml_tmp to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517_conf.xml
2018-05-24 18:55:53,077 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517-1527168259391-hue-oozie%3Alauncher%3AT%3Dsqoop%3AW%3DERP_new%2Dcopy%3AA%3Dsqoop%2Dd90b-1527168351857-1-0-SUCCEEDED-root.users.hue-1527168280941.jhist_tmp to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517-1527168259391-hue-oozie%3Alauncher%3AT%3Dsqoop%3AW%3DERP_new%2Dcopy%3AA%3Dsqoop%2Dd90b-1527168351857-1-0-SUCCEEDED-root.users.hue-1527168280941.jhist
2018-05-24 18:55:53,109 INFO [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2018-05-24 18:55:53,210 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Setting job diagnostics to
2018-05-24 18:55:53,211 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: History url is http://ip-172-31-4-192.ap-south-1.compute.internal:19888/jobhistory/job/job_1527163745858_0517
2018-05-24 18:55:53,237 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Waiting for application to be successfully unregistered.
2018-05-24 18:55:54,247 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Final Stats: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:54,249 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020 /user/hue/.staging/job_1527163745858_0517
2018-05-24 18:55:54,278 INFO [Thread-73] org.apache.hadoop.ipc.Server: Stopping server on 37627
2018-05-24 18:55:54,386 INFO [IPC Server listener on 37627] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 37627
2018-05-24 18:55:54,390 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-05-24 18:55:54,495 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
2018-05-24 18:55:54,515 INFO [Ping Checker] org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: TaskAttemptFinishingMonitor thread interrupted
2018-05-24 18:55:54,540 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Job end notification started for jobID : job_1527163745858_0517
2018-05-24 18:55:54,550 INFO [Thread-73] org.mortbay.log: Job end notification attempts left 0
2018-05-24 18:55:54,550 INFO [Thread-73] org.mortbay.log: Job end notification trying http://ip-172-31-4-192.ap-south-1.compute.internal:11000/oozie/callback?id=0000003-180524173424234-oozie-oozi-W@sqoop-d90b&status=SUCCEEDED
2018-05-24 18:55:54,612 INFO [Thread-73] org.mortbay.log: Job end notification to http://ip-172-31-4-192.ap-south-1.compute.internal:11000/oozie/callback?id=0000003-180524173424234-oozie-oozi-W@sqoop-d90b&status=SUCCEEDED succeeded
2018-05-24 18:55:54,612 INFO [Thread-73] org.mortbay.log: Job end notification succeeded for job_1527163745858_0517
2018-05-24 18:55:59,639 INFO [Thread-73] org.apache.hadoop.ipc.Server: Stopping server on 43856
2018-05-24 18:55:59,674 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-05-24 18:55:59,682 INFO [IPC Server listener on 43856] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 43856
2018-05-24 18:55:59,712 INFO [Thread-73] org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:0
Please help me fix this. I have been trying for quite some time.
05-21-2018
09:39 PM
@Harsh J: These are not Spark jobs; they are Hive and Sqoop jobs. They get killed randomly throughout the day: with the same configuration they sometimes run and sometimes don't. My current settings are:
- mapreduce.map.memory.mb: 0GB (the description says that if it is specified as 0, the amount of physical memory for the request is inferred from Map Task Maximum Heap Size and Heap to Container Size Ratio)
- mapreduce.reduce.memory.mb: 0GB
- mapreduce.job.heap.memory-mb.ratio: 1GB
- Client Java Heap Size in Bytes: 1GB
- yarn.nodemanager.resource.memory-mb: 45GB
- yarn.scheduler.increment-allocation-mb: 1GB
- yarn.scheduler.maximum-allocation-mb: 4GB
- yarn.nodemanager.resource.cpu-vcores: 12
- yarn.scheduler.maximum-allocation-vcores: 2
- Number of worker nodes: 3
The worker nodes are AWS m4.4x instances. I have tried tweaking these values, but am I doing something horribly incorrect? Please suggest.
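For reference, this is roughly how I have been overriding memory per job while experimenting. It is only a sketch: the JDBC URL, table name, and sizes below are placeholders, and the property names are the standard MapReduce ones, so please correct me if these are not the right knobs.
# Sketch only: explicit container size + heap for a Sqoop job (connection details are placeholders)
sqoop import \
  -Dmapreduce.map.memory.mb=2048 \
  -Dmapreduce.map.java.opts=-Xmx1638m \
  --connect jdbc:mysql://<host>/<db> --table <table> -m 1

# Same idea for the Hive jobs, set per session before the query:
#   SET mapreduce.map.memory.mb=2048;
#   SET mapreduce.reduce.memory.mb=2048;
#   SET mapreduce.map.java.opts=-Xmx1638m;
#   SET mapreduce.reduce.java.opts=-Xmx1638m;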
05-21-2018
07:41 PM
My cluster memory is 147GB and I get this error even when the cluster has not used its entire memory. I can see there is memory free, and yet my jobs get killed with this error. There is no error in the logs, and I don't see anything from the dmesg command or in /var/log/messages. It also happens randomly, on any of the nodes. Please suggest. I have been trying to get in touch with Cloudera sales support, but no luck so far, and it's urgent.
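For completeness, these are the checks I run on a node right after a container gets killed. The NodeManager log path is assumed from my CDH layout and may differ elsewhere, and the application ID is a placeholder.
# OS-level: was it the kernel OOM killer?
dmesg | grep -iE 'kill|out of memory'
sudo grep -i 'killed process' /var/log/messages

# YARN-level: aggregated container logs and the NodeManager's own log
yarn logs -applicationId <application_id> | less
grep -i 'killing container' /var/log/hadoop-yarn/*NODEMANAGER*.log*   # path assumed from my CDH install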
Labels:
- Apache YARN
05-19-2018
12:17 AM
DatanodeRegistration(172.31.4.192, datanodeUuid=5d7a5533-df53-454e-bfb3-2dfcdbfb7b1b, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=cluster2;nsid=725965767;c=0):Got exception while serving BP-1423177047-172.31.4.192-1492091038346:blk_1118810958_45073263 to /172.31.10.74:44406
org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replica not found for BP-1423177047-172.31.4.192-1492091038346:blk_1118810958_45073263
at org.apache.hadoop.hdfs.server.datanode.BlockSender.getReplica(BlockSender.java:466)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:241)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:537)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:148)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:745)
Although my datanodes are working fine, I see this error in Diagnostics. Could this be a problem?
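In case it helps narrow this down, these are the HDFS-side checks I can run to see whether that block is genuinely missing or just stale on one datanode; the /user/hue path below is only an example, and fsck prints the block IDs per file.
# Cluster-wide block health (missing / corrupt / under-replicated counts)
hdfs dfsadmin -report
hdfs fsck / -list-corruptfileblocks

# Find which file owns the block and where its replicas live (example path)
hdfs fsck /user/hue -files -blocks -locations | grep blk_1118810958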
Labels:
- Cloudera Manager
05-18-2018
07:51 AM
All Hive jobs started to fail with this error:
2018-05-18 08:50:58,682 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1526609705906_0273_000001
2018-05-18 08:50:59,959 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-05-18 08:51:00,073 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config org.apache.hadoop.hive.ql.io.HiveFileFormatUtils$NullOutputCommitter
2018-05-18 08:51:00,075 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.hive.ql.io.HiveFileFormatUtils$NullOutputCommitter
2018-05-18 08:51:00,123 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2018-05-18 08:51:00,126 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2018-05-18 08:51:00,132 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2018-05-18 08:51:00,169 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-18 08:51:00,194 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-18 08:51:00,212 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-18 08:51:00,223 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled
2018-05-18 08:51:00,277 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2018-05-18 08:51:00,537 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2018-05-18 08:51:00,605 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2018-05-18 08:51:00,605 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2018-05-18 08:51:00,625 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1526609705906_0273 to jobTokenSecretManager
2018-05-18 08:51:00,825 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1526609705906_0273 because: not enabled; too many reduces; too much input;
2018-05-18 08:51:00,857 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1526609705906_0273 = 1155935596. Number of splits = 4
2018-05-18 08:51:00,861 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1526609705906_0273 = 18
2018-05-18 08:51:00,861 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1526609705906_0273Job Transitioned from NEW to INITED
2018-05-18 08:51:00,863 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1526609705906_0273.
2018-05-18 08:51:00,901 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100
2018-05-18 08:51:00,912 INFO [Socket Reader #1 for port 42741] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 42741
2018-05-18 08:51:00,936 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2018-05-18 08:51:00,936 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-05-18 08:51:00,937 INFO [IPC Server listener on 42741] org.apache.hadoop.ipc.Server: IPC Server listener on 42741: starting
2018-05-18 08:51:00,938 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at ip-172-31-13-118.ap-south-1.compute.internal/172.31.13.118:42741
2018-05-18 08:51:01,021 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2018-05-18 08:51:01,031 INFO [main] org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2018-05-18 08:51:01,036 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
2018-05-18 08:51:01,044 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2018-05-18 08:51:01,050 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
2018-05-18 08:51:01,050 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
2018-05-18 08:51:01,053 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
2018-05-18 08:51:01,053 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2018-05-18 08:51:01,061 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 36178
2018-05-18 08:51:01,061 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
2018-05-18 08:51:01,088 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/jars/hadoop-yarn-common-2.6.0-cdh5.10.1.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_36178_mapreduce____davbey/webapp
2018-05-18 08:51:01,355 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:36178
2018-05-18 08:51:01,355 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 36178
2018-05-18 08:51:01,697 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2018-05-18 08:51:01,701 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: JOB_CREATE job_1526609705906_0273
2018-05-18 08:51:01,704 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 3000
2018-05-18 08:51:01,704 INFO [Socket Reader #1 for port 34196] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 34196
2018-05-18 08:51:01,712 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-05-18 08:51:01,712 INFO [IPC Server listener on 34196] org.apache.hadoop.ipc.Server: IPC Server listener on 34196: starting
2018-05-18 08:51:01,732 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2018-05-18 08:51:01,732 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2018-05-18 08:51:01,732 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2018-05-18 08:51:01,795 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at ip-172-31-4-192.ap-south-1.compute.internal/172.31.4.192:8030
2018-05-18 08:51:01,852 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: <memory:8192, vCores:6>
2018-05-18 08:51:01,852 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.users.hue
2018-05-18 08:51:01,855 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2018-05-18 08:51:01,855 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: The thread pool initial size is 10
2018-05-18 08:51:01,864 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1526609705906_0273Job Transitioned from INITED to SETUP
2018-05-18 08:51:01,865 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2018-05-18 08:51:01,867 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1526609705906_0273Job Transitioned from SETUP to RUNNING
2018-05-18 08:51:01,936 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1526609705906_0273, File: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/.staging/job_1526609705906_0273/job_1526609705906_0273_1.jhist
2018-05-18 08:51:01,947 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_m_000000 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:01,983 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_m_000001 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,010 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_m_000002 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,037 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_m_000003 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,038 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_r_000000 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,038 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_r_000001 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,039 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_r_000002 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,039 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_r_000003 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,040 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_r_000004 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,048 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_r_000015 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,048 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_r_000016 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,049 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_r_000017 Task Transitioned from NEW to SCHEDULED
2018-05-18 08:51:02,051 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:02,051 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:02,051 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000002_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:02,051 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000003_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:02,051 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_r_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:02,054 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_r_000014_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:02,054 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_r_000016_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:02,054 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_r_000017_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:02,056 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1>
2018-05-18 08:51:02,073 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: reduceResourceRequest:<memory:1024, vCores:1>
2018-05-18 08:51:02,448 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-18 08:51:02,854 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:18 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2018-05-18 08:51:02,892 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1526609705906_0273: ask=6 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:101888, vCores:54> knownNMs=6
2018-05-18 08:51:02,892 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=<memory:101888, vCores:54>
2018-05-18 08:51:02,892 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 4
2018-05-18 08:51:02,996 INFO [IPC Server handler 0 on 42741] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task report for MAP job_1526609705906_0273. Report-size will be 4
2018-05-18 08:51:03,051 INFO [IPC Server handler 0 on 42741] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task report for REDUCE job_1526609705906_0273. Report-size will be 18
2018-05-18 08:51:03,904 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 4
2018-05-18 08:51:03,906 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1526609705906_0273_01_000002 to attempt_1526609705906_0273_m_000001_0
2018-05-18 08:51:03,907 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1526609705906_0273_01_000003 to attempt_1526609705906_0273_m_000000_0
2018-05-18 08:51:03,908 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1526609705906_0273_01_000004 to attempt_1526609705906_0273_m_000002_0
2018-05-18 08:51:03,908 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1526609705906_0273_01_000005 to attempt_1526609705906_0273_m_000003_0
2018-05-18 08:51:03,908 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=<memory:97792, vCores:50>
2018-05-18 08:51:03,908 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 4
2018-05-18 08:51:03,908 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:18 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:4 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:4 ContRel:0 HostLocal:1 RackLocal:3
2018-05-18 08:51:03,974 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/.staging/job_1526609705906_0273/job.jar
2018-05-18 08:51:03,976 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /user/hue/.staging/job_1526609705906_0273/job.xml
2018-05-18 08:51:03,985 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #0 tokens and #1 secret keys for NM use for launching container
2018-05-18 08:51:03,985 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertokens_dob is 1
2018-05-18 08:51:03,985 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle token in serviceData
2018-05-18 08:51:04,022 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2018-05-18 08:51:04,026 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000001_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2018-05-18 08:51:04,029 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2018-05-18 08:51:04,029 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000000_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2018-05-18 08:51:04,030 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2018-05-18 08:51:04,030 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000002_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2018-05-18 08:51:04,031 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2018-05-18 08:51:04,032 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000003_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2018-05-18 08:51:04,034 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1526609705906_0273_01_000003 taskAttempt attempt_1526609705906_0273_m_000000_0
2018-05-18 08:51:04,034 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1526609705906_0273_01_000002 taskAttempt attempt_1526609705906_0273_m_000001_0
2018-05-18 08:51:04,034 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1526609705906_0273_01_000004 taskAttempt attempt_1526609705906_0273_m_000002_0
2018-05-18 08:51:04,039 INFO [ContainerLauncher #3] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1526609705906_0273_01_000005 taskAttempt attempt_1526609705906_0273_m_000003_0
2018-05-18 08:51:04,042 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1526609705906_0273_m_000000_0
2018-05-18 08:51:04,042 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1526609705906_0273_m_000001_0
2018-05-18 08:51:04,042 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1526609705906_0273_m_000002_0
2018-05-18 08:51:04,042 INFO [ContainerLauncher #3] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1526609705906_0273_m_000003_0
2018-05-18 08:51:04,168 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1526609705906_0273_m_000002_0 : 13562
2018-05-18 08:51:04,168 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1526609705906_0273_m_000001_0 : 13562
2018-05-18 08:51:04,170 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1526609705906_0273_m_000002_0] using containerId: [container_1526609705906_0273_01_000004 on NM: [ip-172-31-1-207.ap-south-1.compute.internal:8041]
2018-05-18 08:51:04,173 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000002_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2018-05-18 08:51:04,174 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1526609705906_0273_m_000001_0] using containerId: [container_1526609705906_0273_01_000002 on NM: [ip-172-31-1-207.ap-south-1.compute.internal:8041]
2018-05-18 08:51:04,174 INFO [ContainerLauncher #3] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1526609705906_0273_m_000003_0 : 13562
2018-05-18 08:51:04,174 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000001_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2018-05-18 08:51:04,175 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_m_000002 Task Transitioned from SCHEDULED to RUNNING
2018-05-18 08:51:04,175 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1526609705906_0273_m_000003_0] using containerId: [container_1526609705906_0273_01_000005 on NM: [ip-172-31-1-207.ap-south-1.compute.internal:8041]
2018-05-18 08:51:04,175 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000003_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2018-05-18 08:51:04,176 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_m_000001 Task Transitioned from SCHEDULED to RUNNING
2018-05-18 08:51:04,176 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_m_000003 Task Transitioned from SCHEDULED to RUNNING
2018-05-18 08:51:04,182 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1526609705906_0273_m_000000_0 : 13562
2018-05-18 08:51:04,183 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1526609705906_0273_m_000000_0] using containerId: [container_1526609705906_0273_01_000003 on NM: [ip-172-31-1-207.ap-south-1.compute.internal:8041]
2018-05-18 08:51:04,184 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2018-05-18 08:51:04,185 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1526609705906_0273_m_000000 Task Transitioned from SCHEDULED to RUNNING
2018-05-18 08:51:04,911 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1526609705906_0273: ask=6 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:97792, vCores:49> knownNMs=6
2018-05-18 08:51:04,911 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=<memory:97792, vCores:49>
2018-05-18 08:51:04,911 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 4
2018-05-18 08:51:07,918 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=<memory:62806, vCores:48>
2018-05-18 08:51:07,918 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 4
2018-05-18 08:51:08,925 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1526609705906_0273_01_000004
2018-05-18 08:51:08,926 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1526609705906_0273_01_000003
2018-05-18 08:51:08,926 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=<memory:64854, vCores:49>
2018-05-18 08:51:08,926 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1526609705906_0273_m_000002_0: Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
2018-05-18 08:51:08,926 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 4
2018-05-18 08:51:08,926 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:18 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:4 ContRel:0 HostLocal:1 RackLocal:3
2018-05-18 08:51:08,932 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000002_0 TaskAttempt Transitioned from RUNNING to FAILED
2018-05-18 08:51:08,932 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1526609705906_0273_m_000000_0: Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
2018-05-18 08:51:08,932 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000000_0 TaskAttempt Transitioned from RUNNING to FAILED
2018-05-18 08:51:08,933 INFO [ContainerLauncher #4] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_COMPLETED for container container_1526609705906_0273_01_000004 taskAttempt attempt_1526609705906_0273_m_000002_0
2018-05-18 08:51:08,943 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node ip-172-31-1-207.ap-south-1.compute.internal
2018-05-18 08:51:08,944 INFO [ContainerLauncher #5] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_COMPLETED for container container_1526609705906_0273_01_000003 taskAttempt attempt_1526609705906_0273_m_000000_0
2018-05-18 08:51:08,945 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000002_1 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:08,945 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 2 failures on node ip-172-31-1-207.ap-south-1.compute.internal
2018-05-18 08:51:08,945 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1526609705906_0273_m_000000_1 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-18 08:51:08,946 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added attempt_1526609705906_0273_m_000002_1 to list of failed maps
2018-05-18 08:51:08,946 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added attempt_1526609705906_0273_m_000000_1 to list of failed maps

No changes were made to the cluster before this; the error appeared out of nowhere. Even a cluster restart did not fix it, and it kept recurring for about 2 hours. How do I fix it? I have looked at the server logs through diagnostics but have not found anything; any pointers would be useful.

Here are the server logs from the diagnostics bar in CM:

May 18, 8:43:53.796 AM INFO org.apache.hadoop.mapreduce.Job [HiveServer2-Background-Pool: Thread-1391]: The url to track the job: http://ip-172-31-4-192.ap-south-1.compute.internal:8088/proxy/application_1526609705906_0246/
May 18, 8:43:53.797 AM INFO org.apache.hadoop.hive.ql.exec.Task [HiveServer2-Background-Pool: Thread-1391]: Starting Job = job_1526609705906_0246, Tracking URL = http://ip-172-31-4-192.ap-south-1.compute.internal:8088/proxy/application_1526609705906_0246/
May 18, 8:43:53.797 AM INFO org.apache.hadoop.hive.ql.exec.Task [HiveServer2-Background-Pool: Thread-1391]: Kill Command = /opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/lib/hadoop/bin/hadoop job -kill job_1526609705906_0246
May 18, 8:44:00.027 AM INFO org.apache.hadoop.hive.ql.exec.Task [HiveServer2-Background-Pool: Thread-1391]: Hadoop job information for Stage-2: number of mappers: 4; number of reducers: 1
May 18, 8:44:00.063 AM WARN mapreduce.Counters [HiveServer2-Background-Pool: Thread-1391]: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
May 18, 8:44:00.063 AM INFO org.apache.hadoop.hive.ql.exec.Task [HiveServer2-Background-Pool: Thread-1391]: 2018-05-18 08:44:00,063 Stage-2 map = 0%, reduce = 0%
May 18, 8:44:07.404 AM WARN mapreduce.Counters [HiveServer2-Background-Pool: Thread-1391]: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
May 18, 8:44:07.406 AM ERROR org.apache.hadoop.hive.ql.exec.Task [HiveServer2-Background-Pool: Thread-1391]: Ended Job = job_1526609705906_0246 with errors
May 18, 8:44:07.407 AM ERROR org.apache.hadoop.hive.ql.exec.Task [Thread-1631]: Error during job, obtaining debugging information...
May 18, 8:44:07.416 AM INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [HiveServer2-Background-Pool: Thread-1391]: Killed application application_1526609705906_0246
May 18, 8:44:07.418 AM ERROR org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
May 18, 8:44:07.418 AM INFO org.apache.hadoop.hive.ql.log.PerfLogger [HiveServer2-Background-Pool: Thread-1391]: </PERFLOG method=Driver.execute start=1526613040763 end=1526613247418 duration=206655 from=org.apache.hadoop.hive.ql.Driver>
May 18, 8:44:07.418 AM INFO org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: MapReduce Jobs Launched:
May 18, 8:44:07.418 AM INFO org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.09 sec HDFS Read: 238035 HDFS Write: 1380 SUCCESS
May 18, 8:44:07.418 AM INFO org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: Stage-Stage-4: Map: 1 Reduce: 1 Cumulative CPU: 4.07 sec HDFS Read: 238035 HDFS Write: 1291 SUCCESS
May 18, 8:44:07.419 AM INFO org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: Stage-Stage-5: Map: 1 Reduce: 1 Cumulative CPU: 3.82 sec HDFS Read: 238003 HDFS Write: 1209 SUCCESS
May 18, 8:44:07.419 AM INFO org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: Stage-Stage-6: Map: 1 Reduce: 1 Cumulative CPU: 3.64 sec HDFS Read: 237973 HDFS Write: 1293 SUCCESS
May 18, 8:44:07.419 AM WARN mapreduce.Counters [HiveServer2-Background-Pool: Thread-1391]: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
May 18, 8:44:07.419 AM INFO org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: Stage-Stage-2: Map: 4 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
May 18, 8:44:07.419 AM INFO org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: Total MapReduce CPU Time Spent: 15 seconds 620 msec
May 18, 8:44:07.419 AM INFO org.apache.hadoop.hive.ql.Driver [HiveServer2-Background-Pool: Thread-1391]: Completed executing command(queryId=hive_20180518084040_4308c812-08c3-41aa-b5ea-04fad833b98f); Time taken: 206.655 seconds
May 18, 8:44:07.419 AM INFO org.apache.hadoop.hive.ql.log.PerfLogger [HiveServer2-Background-Pool: Thread-1391]: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
May 18, 8:44:07.425 AM INFO org.apache.hadoop.hive.ql.log.PerfLogger [HiveServer2-Background-Pool: Thread-1391]: </PERFLOG method=releaseLocks start=1526613247419 end=1526613247425 duration=6 from=org.apache.hadoop.hive.ql.Driver>
May 18, 8:44:07.431 AM ERROR org.apache.hive.service.cli.operation.Operation [HiveServer2-Background-Pool: Thread-1391]: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:239)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
May 18, 8:44:07.433 AM INFO org.apache.hive.service.cli.thrift.ThriftCLIService [HiveServer2-Handler-Pool: Thread-575]: Session disconnected without closing properly, close it now
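For reference, exit code 137 corresponds to 128 + 9, i.e. the map JVMs received SIGKILL, which matches the "Killed by external signal" diagnostic. Typical causes are the Linux OOM killer acting on the NodeManager host, the NodeManager killing a container that exceeded its memory limit, or an external kill/preemption of the application. A first check is dmesg/journalctl on ip-172-31-1-207 for OOM-killer entries around 08:51, plus that NodeManager's log for the affected container IDs. If the maps really are running out of memory, the snippet below is a minimal sketch of per-session settings one might raise; the values are illustrative assumptions, not a recommendation for this cluster, and must stay within the YARN container limits.

-- Hedged sketch: raise map/reduce container size and heap for one Hive session
-- (property names are standard MR2 settings; the values below are illustrative only)
SET mapreduce.map.memory.mb=2048;
SET mapreduce.map.java.opts=-Xmx1638m;
SET mapreduce.reduce.memory.mb=4096;
SET mapreduce.reduce.java.opts=-Xmx3276m;

If the host-level OOM killer turns out to be involved, the fix is at the node level (total memory committed to YARN containers versus instance RAM) rather than per query.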
Labels:
- Apache Hive
- MapReduce
05-17-2018
11:45 PM
Hello, thank you for your response. I have set the timeout to the recommended value as directed, but I still get:

ERROR : Failed to monitor Job[ 8] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.TimeoutException
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:153)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getState(RemoteSparkJobStatus.java:82)
at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:80)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:109)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:150)
... 23 more
ERROR : Failed to monitor Job[ 8] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.TimeoutException
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:153)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getState(RemoteSparkJobStatus.java:82)
at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:80)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:109)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:150)
... 23 more
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
INFO : Completed executing command(queryId=hive_20180518120808_12c3175f-865c-47d4-9748-29853259b4aa); Time taken: 120.892 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask (state=08S01,code=1)
Closing: 0: jdbc:hive2://ip-172-31-4-192.ap-south-1.compute.internal:10000/default
Intercepting System.exit(2)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.Hive2Main], exit code [2]
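One hedged reading of this trace: the TimeoutException is raised in RemoteSparkJobStatus.getSparkJobInfo, i.e. HiveServer2 timed out while polling an already-submitted Spark job for its status. In Hive on Spark that path is governed by the job-monitor/RPC-future timeouts rather than the initial client-connect timeout, so if only the connect timeout was raised earlier it may be worth raising these as well. Property names and defaults can differ across CDH/Hive versions, so treat the sketch below as an assumption to verify against your version's documentation.

-- Hedged example; confirm these properties exist in your Hive release before using them
SET hive.spark.job.monitor.timeout=300s;    -- how long Hive waits when polling the Spark job's state
SET hive.spark.client.future.timeout=300s;  -- RPC future timeout used when fetching job info from the driver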
05-06-2018
11:26 AM
I have my Hive jobs scheduled through Oozie and I randomly get:

ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:64)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:99)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
at com.google.common.base.Throwables.propagate(Throwables.java:156)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:124)
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:96)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:68)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
... 22 more
Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:108)
... 27 more
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
at org.apache.hive.spark.client.rpc.RpcServer$2.run(RpcServer.java:141)
at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:64)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:99)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
at com.google.common.base.Throwables.propagate(Throwables.java:156)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:124)
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:96)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:68)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
... 22 more
Caused by: java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:108)
... 27 more
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
at org.apache.hive.spark.client.rpc.RpcServer$2.run(RpcServer.java:141)
at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
INFO : Completed executing command(queryId=hive_20180506233232_2a9bc92f-d830-435b-91df-eafd2236a9fc); Time taken: 90.051 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask (state=08S01,code=1)
Closing: 0: jdbc:hive2://ip-172-31-4-192.ap-south-1.compute.internal:10000/default
Intercepting System.exit(2)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.Hive2Main], exit code [2]

What can I do to fix this? It happens randomly, even when the workload is light.

Cluster config:
- Total vCores: 60
- Total memory: 205 GB
- 3 worker nodes are m4.4xlarge EC2 instances and 2 are m4.2xlarge EC2 instances, all running YARN (NodeManager)
- Container memory is set to 8 GB in the YARN config
- spark.executor.cores=1 (since multiple users will be running queries, often several at the same time)
- Spark Java heap size = 5 GB
- Memory overhead = 2 GB

Is there something I can do to improve the situation?
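A hedged note on this particular failure: "Timed out waiting for client connection" comes from the RPC server inside HiveServer2 giving up while waiting for the remote Spark driver (launched via spark-submit on YARN) to connect back. That tends to happen when YARN cannot allocate the driver's ApplicationMaster quickly (busy queue, little headroom) or the driver starts slowly, so giving the handshake more time is a common mitigation. The property names below are the usual Hive on Spark ones, but the values are illustrative and should be checked against your CDH version.

-- Hedged sketch: allow the remote Spark driver more time to start and register back with HiveServer2
SET hive.spark.client.server.connect.timeout=300000ms;  -- how long HS2 waits for the driver's remote connection
SET hive.spark.client.connect.timeout=30000ms;          -- how long the driver waits when connecting to HS2's RPC server

If the failures cluster around busy periods, it is also worth checking whether the YARN queue actually had capacity to start the Spark ApplicationMaster at those times.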
Labels:
- Apache Hive
- Apache Spark
- Apache YARN
07-06-2017
02:50 AM
I have increased the heap size; it was set to the default of 256 MB, which I guess was causing the problem. I will report back on whether it keeps working alright 🙂 Thanks a lot for your response, it helped.