
YARN Exception from Container Launch

New Contributor

I am experiencing the following issue.

 

http://hortonworks.com/community/forums/topic/unable-to-run-distributed-shell-on-yarn/

 

2013-09-17 11:45:28,338 INFO [main] distributedshell.Client (Client.java:monitorApplication(600)) - Got application report from ASM for, appId=11, clientToAMToken=null, appDiagnostics=Application application_1379338026167_0011 failed 2 times due to AM Container for appattempt_1379338026167_0011_000002 exited with exitCode: 1 due to: Exception from container-launch:

 

The code that produces this error is a Java MapReduce job that bulk loads data into HBase.
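
For context, the driver is set up roughly like this. It is only a simplified sketch of my actual code: BulkLoadDriver, BulkLoadMapper, my_table, and the argument handling are placeholders, and the mapper itself (which emits ImmutableBytesWritable/Put pairs) is omitted.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();  // plain Hadoop config; no HBase settings added here
        Job job = Job.getInstance(conf, "hbase-bulk-load");
        job.setJarByClass(BulkLoadDriver.class);
        job.setMapperClass(BulkLoadMapper.class);  // placeholder mapper, emits (ImmutableBytesWritable, Put)
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);

        // configureIncrementalLoad wires in the partitioner/reducer needed to write HFiles for the table
        HTable table = new HTable(conf, "my_table");
        HFileOutputFormat.configureIncrementalLoad(job, table);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}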

 

I have tried the recommended addition of environment variables for my local user, but am still facing the same issue.

 

Has anyone else experienced this?

1 ACCEPTED SOLUTION

New Contributor

Hi all,

 

Thanks for your support. The issue ended up being an environment problem: I needed to pass the location of the HBase configuration to my application, via the following snippet:

 

// Load the cluster's HBase settings (ZooKeeper quorum, etc.) from hbase-site.xml
Configuration conf = new Configuration();
conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));

 

This allowed the program to determine the correct settings for connecting to HBase.
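
A related note, in case it helps someone else: assuming hbase-site.xml is already on the application's classpath, the same effect can be had by letting the HBase helper load it, e.g.:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// create() loads hbase-default.xml and hbase-site.xml from the classpath, if present
Configuration conf = HBaseConfiguration.create();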

 

Thanks!


3 REPLIES

Cloudera Employee

Hi sjarvie,

 

The next step with an error like this is to look at the logs for the container that's failing.  If you're running a MapReduce job (i.e. not distributed shell), you should be able to find these in the JobHistoryServer.  It will display a list of jobs and you can go into a job and navigate to the application master logs.  If you experience errors when trying to access these logs, make sure that yarn.log-aggregation-enable is set to true.
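
If it isn't, the property goes in yarn-site.xml on the NodeManager hosts, something like the snippet below (restart the NodeManagers afterwards). With aggregation on you can also pull the logs from the command line with "yarn logs -applicationId <application id>".

<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>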

 

-Sandy

New Contributor

Hi Sandy,

 

I enabled YARN log aggregation and was able to access the logs for the failed job. The first error I noticed in the log is the following:

 

2013-12-06 15:04:00,007 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1385402645979_0033_01_000080
2013-12-06 15:04:00,007 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete event for unknown container id container_1385402645979_0033_01_000080
2013-12-06 15:04:00,007 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=0

 

Could this be an issue related to host configuration?

 

The other clue I may have found is in the exit code:

 

2013-12-06 15:04:01,015 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1385402645979_0033_m_000035_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143

The following link attributes this exit code to a memory allocation issue (143 is 128 + 15, i.e. the container was terminated with SIGTERM rather than exiting on its own): https://github.com/ThinkinGim/atopos/issues/28 

 

 

I am going over the resource allocation configuration of my cluster now.
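
For reference, these are the settings I am double-checking; the values below are just illustrative placeholders for my cluster, not recommendations:

<!-- yarn-site.xml: total memory the NodeManager offers, and the largest container the scheduler will grant -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>

<!-- mapred-site.xml: container size each map task requests, and the JVM heap inside it -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1536</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1228m</value>
</property>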

 

Thanks
