Explorer
Posts: 64
Registered: ‎08-07-2017

Smaller jobs are also failing in the Cloudera cluster

Hi,

 

When I try to run a job from the command line, the job fails. Below are the error messages I can see in the logs.

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Container id: container_1509961177849_0155_01_000014
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1

 

Is this due to a JVM heap memory issue? Even a small job is not working.

 

Please help me.

 

Thanks,

Priya

 

 

Posts: 176
Topics: 8
Kudos: 21
Solutions: 19
Registered: ‎07-16-2015

Re: Smaller jobs are also failing in the Cloudera cluster

This error (exit code 143) usually means that the container was killed because it tried to use more memory than configured:

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

 

A misconfiguration of YARN could lead to this.

Check your memory configuration for containers and for tasks (map and reduce).
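
For reference, exit code 143 follows the common convention of 128 plus a signal number: 143 - 128 = 15, which is SIGTERM. In other words, the container was asked to terminate rather than crashing on its own. The exit code 1 that appears in the same log is an ordinary error status and is not explained by the kill. The small Python sketch below only illustrates that arithmetic; the function name `describe_exit_code` is made up for this example and is not part of Hadoop or YARN.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit code using the 128 + signal-number convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal, e.g. 15 -> SIGTERM.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error status {code}"


# The two exit codes that appear in the logs above.
print(describe_exit_code(143))
print(describe_exit_code(1))
```
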

 

regards,

Mathieu

Posts: 394
Topics: 11
Kudos: 60
Solutions: 35
Registered: ‎09-02-2016

Re: Smaller jobs are also failing in the Cloudera cluster

@cdhhadoop

 

I've already provided some clarification for a similar issue at this link:

 

http://community.cloudera.com/t5/Hadoop-101-Training-Quickstart/Map-and-Reduce-Error-Java-heap-space...

 

Sharing the same info again here.

 

Those who are using Hadoop 2.x, please use the parameters below instead:

 

mapreduce.map.java.opts=-Xmx4g         # Note: 4 GB

mapreduce.reduce.java.opts=-Xmx4g     # Note: 4 GB

 

Also, when you set java.opts, you need to note two important points:

1. It depends on memory.mb, so always try to set java.opts to at most 80% of memory.mb

2. Use the "-Xmx4g" format for java.opts, but a plain numerical value (in MB) for memory.mb

 

mapreduce.map.memory.mb = 5120        # Note: 5 GB

mapreduce.reduce.memory.mb = 5120    # Note: 5 GB
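
As a rough illustration of the 80% rule of thumb above, the following Python sketch derives a -Xmx value from a memory.mb value. The helper name `heap_opts_for` and the `heap_percent` parameter are hypothetical, introduced only for this example; the 80% figure is the guideline from this post, not a value enforced by Hadoop.

```python
def heap_opts_for(memory_mb: int, heap_percent: int = 80) -> str:
    """Return a -Xmx option sized to a percentage of the container memory (MB)."""
    if memory_mb <= 0:
        raise ValueError("memory_mb must be positive")
    # Integer arithmetic keeps the result a whole number of megabytes.
    heap_mb = memory_mb * heap_percent // 100
    return f"-Xmx{heap_mb}m"


# 5120 MB (5 GB) containers give a 4096 MB heap, i.e. the -Xmx4g used above.
for memory_mb in (1024, 5120):
    print(memory_mb, heap_opts_for(memory_mb))
```

With a 1024 MB container the same rule gives a heap of about 819 MB.
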

 

Finally, some organizations will not allow you to alter mapred-site.xml directly or via CM. Also, this kind of setup is needed only to handle very big tables, so it is not recommended to alter the cluster-wide configuration for just a few tables. Instead, you can apply these settings temporarily by following the steps below:

 

1. From HDFS:

HDFS> export HIVE_OPTS="-hiveconf mapreduce.map.memory.mb=5120 -hiveconf mapreduce.reduce.memory.mb=5120 -hiveconf mapreduce.map.java.opts=-Xmx4g -hiveconf mapreduce.reduce.java.opts=-Xmx4g"

2. From Hive:

hive> set mapreduce.map.memory.mb=5120;

hive> set mapreduce.reduce.memory.mb=5120;

hive> set mapreduce.map.java.opts=-Xmx4g;

hive> set mapreduce.reduce.java.opts=-Xmx4g;

 

Note: HIVE_OPTS applies only to Hive. If you need a similar setup for Hadoop, use HADOOP_OPTS.
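
If the same four settings have to be applied in both ways, they can be kept in one place and rendered into either form. The Python sketch below is only an illustration under that assumption; the names `SETTINGS`, `as_hive_opts` and `as_set_statements` are made up here, and the script just prints text to paste into a shell or a Hive session.

```python
# The per-session overrides from the steps above, kept in one place.
SETTINGS = {
    "mapreduce.map.memory.mb": "5120",
    "mapreduce.reduce.memory.mb": "5120",
    "mapreduce.map.java.opts": "-Xmx4g",
    "mapreduce.reduce.java.opts": "-Xmx4g",
}


def as_hive_opts(settings: dict) -> str:
    """Render the settings as a value for the HIVE_OPTS environment variable."""
    return " ".join(f"-hiveconf {key}={value}" for key, value in settings.items())


def as_set_statements(settings: dict) -> list:
    """Render the settings as 'set' statements for a Hive session."""
    return [f"set {key}={value};" for key, value in settings.items()]


print(f'export HIVE_OPTS="{as_hive_opts(SETTINGS)}"')
for statement in as_set_statements(SETTINGS):
    print(statement)
```
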

Explorer
Posts: 64
Registered: ‎08-07-2017

Re: Smaller jobs are also failing in the Cloudera cluster

@saranvisa.

Thanks for the reply. We are using Hadoop 2.6, and mapreduce.map.memory.mb and mapreduce.reduce.memory.mb are each set to 1 GB.
Also, mapreduce.map.java.opts and mapreduce.reduce.java.opts are each around 800 MB.

In the logs I can see the error below as well.

Exit code: 1
Stack trace: ExitCodeException exitCode=1:
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1

Can you please help me?

Thanks,
Priya