12-05-2013 03:38 PM - edited 12-06-2013 11:26 AM
I am experiencing the following issue.
2013-09-17 11:45:28,338 INFO [main] distributedshell.Client (Client.java:monitorApplication(600)) - Got application report from ASM for, appId=11, clientToAMToken=null, appDiagnostics=Application application_1379338026167_0011 failed 2 times due to AM Container for appattempt_1379338026167_0011_000002 exited with exitCode: 1 due to: Exception from container-launch:
The code that produces this error is a Java MapReduce job that bulk loads data into HBase.
I have tried the recommended addition of environment variables for my local user, but I am still facing issues.
Has anyone else experienced this?
12-06-2013 02:32 PM
The next step with an error like this is to look at the logs for the container that's failing. If you're running a MapReduce job (i.e. not distributed shell), you should be able to find these in the JobHistoryServer. It will display a list of jobs and you can go into a job and navigate to the application master logs. If you experience errors when trying to access these logs, make sure that yarn.log-aggregation-enable is set to true.
12-09-2013 09:22 AM - edited 12-09-2013 10:35 AM
I enabled YARN log aggregation and was able to access the log for the failed job. The first error I noticed in the log is the following:
2013-12-06 15:04:00,007 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1385402645979_0033_01_000080
2013-12-06 15:04:00,007 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete event for unknown container id container_1385402645979_0033_01_000080
2013-12-06 15:04:00,007 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=0
Could this be an issue related to host configuration?
The other clue I may have found is in the exit code:
2013-12-06 15:04:01,015 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1385402645979_0033_m_000035_0: Container killed by the ApplicationMaster. Container killed on request. Exit code is 143
The following link attributes this exit code to a memory allocation issue: https://github.com/ThinkinGim/atopos/issues/28
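As a side note on reading that exit code: 143 follows the common convention of 128 plus a signal number, and 128 + 15 corresponds to SIGTERM. The code alone does not say who sent the signal or why, so it is consistent with the "Container killed by the ApplicationMaster" message as well as with a resource-limit kill. A minimal sketch of the arithmetic (the class and method names here are illustrative, not part of Hadoop):

```java
import java.util.OptionalInt;

final class ExitCodeDecoder {

    /** Returns the signal number if the exit code follows the 128 + N convention. */
    static OptionalInt signalNumber(int exitCode) {
        // Codes 129..159 conventionally mean "terminated by signal (code - 128)".
        if (exitCode > 128 && exitCode < 160) {
            return OptionalInt.of(exitCode - 128);
        }
        return OptionalInt.empty();
    }

    /** Builds a short human-readable description of an exit code. */
    static String describe(int exitCode) {
        OptionalInt sig = signalNumber(exitCode);
        if (!sig.isPresent()) {
            return "exit code " + exitCode + " (not a signal exit)";
        }
        int n = sig.getAsInt();
        String name;
        switch (n) {
            case 9:  name = "SIGKILL"; break;
            case 15: name = "SIGTERM"; break;
            default: name = "signal " + n;
        }
        return "exit code " + exitCode + " = 128 + " + n + " (" + name + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(143)); // the code from the log above
        System.out.println(describe(1));   // the AM container's exit code from the first post
    }
}
```

Under this reading, exit code 1 from the first post is an ordinary failure of the process itself rather than a kill by signal, which is why the container's own log is the better place to look for the root cause.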
I am going over the resource allocation configuration of my cluster now.
12-23-2013 11:37 AM
Thanks for your support. The problem turned out to be an environment issue. I needed to tell my application where the HBase configuration is located, using the following snippet:
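import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml")); // load the cluster's HBase settings explicitly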
This allowed the program to determine the correct settings for connecting to HBase.
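For anyone hitting the same problem, it may help to confirm that the file is readable by the user running the job and actually contains the connection settings before adding it as a resource. Below is a rough sketch using only the JDK's XML parser, so it runs without Hadoop on the classpath. The class name `SiteXmlReader` is made up for illustration, and checking `hbase.zookeeper.quorum` is an assumption about which setting matters on your cluster.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class SiteXmlReader {

    /** Reads the name/value pairs from a Hadoop-style *-site.xml file. */
    static Map<String, String> readProperties(File siteXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(siteXml);
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            NodeList names = property.getElementsByTagName("name");
            NodeList values = property.getElementsByTagName("value");
            // Skip malformed entries that lack either element.
            if (names.getLength() > 0 && values.getLength() > 0) {
                props.put(names.item(0).getTextContent().trim(),
                          values.item(0).getTextContent().trim());
            }
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the path from the post above; pass another path as the first argument.
        File file = new File(args.length > 0 ? args[0] : "/etc/hbase/conf/hbase-site.xml");
        if (!file.canRead()) {
            System.err.println("Cannot read " + file);
            return;
        }
        Map<String, String> props = readProperties(file);
        System.out.println("hbase.zookeeper.quorum = " + props.get("hbase.zookeeper.quorum"));
    }
}
```

If the property prints as null, the job would likely fall back to default connection settings even with the file added, so the file's contents are worth checking alongside its location.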