- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
physical memory limits
- Labels:
-
Apache YARN
-
Cloudera Manager
Created 02-16-2016 07:47 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi:
I am running one job from RStudio and y get this error:
16/02/16 13:01:30 INFO mapreduce.Job: map 100% reduce 24% 16/02/16 13:12:22 INFO mapreduce.Job: map 100% reduce 100% 16/02/16 13:12:22 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_0, Status : FAILED Container [pid=18361,containerID=container_e24_1455198426748_0476_01_000499] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.9 GB of 14.7 GB virtual memory used. Killing container. Dump of the process-tree for container_e24_1455198426748_0476_01_000499 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 18377 18361 18361 18361 (java) 3320 733 8102256640 609337 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_0 26388279067123 |- 19618 18377 18361 18361 (R) 96691 1583 5403787264 1249728 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd |- 19629 19618 18361 18361 (cat) 0 0 103407616 166 cat |- 18361 18359 18361 18361 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_0 26388279067123 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/stderr |- 19627 19618 18361 18361 (cat) 1 48 103407616 174 cat Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 16/02/16 13:12:23 INFO mapreduce.Job: map 100% reduce 0% 16/02/16 13:12:34 INFO mapreduce.Job: map 100% reduce 15% 16/02/16 13:12:37 INFO mapreduce.Job: map 100% reduce 21% 16/02/16 13:12:40 INFO mapreduce.Job: map 100% reduce 24% 16/02/16 13:28:26 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_1, Status : FAILED Container [pid=21694,containerID=container_e24_1455198426748_0476_01_001310] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.6 GB of 14.7 GB virtual memory used. Killing container. Dump of the process-tree for container_e24_1455198426748_0476_01_001310 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 21694 21692 21694 21694 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_1 26388279067934 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/stderr |- 21781 21704 21694 21694 (R) 93564 1394 5118803968 1185913 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd |- 21807 21781 21694 21694 (cat) 0 43 103407616 173 cat |- 21704 21694 21694 21694 (java) 2526 787 8089718784 664117 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_1 26388279067934 |- 21810 21781 21694 21694 (cat) 0 0 103407616 166 cat Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 16/02/16 13:28:27 INFO mapreduce.Job: map 100% reduce 0% 16/02/16 13:28:38 INFO mapreduce.Job: map 100% reduce 16% 16/02/16 13:28:41 INFO mapreduce.Job: map 100% reduce 20% 16/02/16 13:28:44 INFO mapreduce.Job: map 100% reduce 24% 16/02/16 13:46:02 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_2, Status : FAILED Container [pid=23643,containerID=container_e24_1455198426748_0476_01_001311] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.8 GB of 14.7 GB virtual memory used. Killing container. Dump of the process-tree for container_e24_1455198426748_0476_01_001311 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 23737 23729 23643 23643 (cat) 0 44 103407616 174 cat |- 23738 23729 23643 23643 (cat) 0 0 103407616 166 cat |- 23729 23653 23643 23643 (R) 101777 1652 5376724992 1248882 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd |- 23653 23643 23643 23643 (java) 2328 784 8079331328 617129 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_2 26388279067935 |- 23643 23641 23643 23643 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_2 26388279067935 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/stderr Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143 16/02/16 13:46:03 INFO mapreduce.Job: map 100% reduce 0% 16/02/16 13:46:15 INFO mapreduce.Job: map 100% reduce 17% 16/02/16 13:46:18 INFO mapreduce.Job: map 100% reduce 22% 16/02/16 13:46:21 INFO mapreduce.Job: map 100% reduce 24% 16/02/16 13:59:00 INFO mapreduce.Job: map 100% reduce 100% 16/02/16 13:59:00 INFO mapreduce.Job: Job job_1455198426748_0476 failed with state FAILED due to: Task failed task_1455198426748_0476_r_000000 Job failed as tasks failed. failedMaps:0 failedReduces:1 16/02/16 13:59:00 INFO mapreduce.Job: Counters: 39 File System Counters FILE: Number of bytes read=0 FILE: Number of bytes written=2064381938 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=13416462815 HDFS: Number of bytes written=0 HDFS: Number of read operations=321 HDFS: Number of large read operations=0 HDFS: Number of write operations=0 Job Counters Failed reduce tasks=4 Launched map tasks=107 Launched reduce tasks=4 Data-local map tasks=107 Total time spent by all maps in occupied slots (ms)=37720330 Total time spent by all reduces in occupied slots (ms)=7956034 Total time spent by all map tasks (ms)=18860165 Total time spent by all reduce tasks (ms)=3978017 Total vcore-seconds taken by all map tasks=18860165 Total vcore-seconds taken by all reduce tasks=3978017 Total megabyte-seconds taken by all map tasks=77251235840 Total megabyte-seconds taken by all reduce tasks=28514425856 Map-Reduce Framework Map input records=99256589 Map output records=321 Map output bytes=2050220619 Map output materialized bytes=2050222738 Input split bytes=12519 Combine input records=321 Combine output records=321 Spilled Records=321 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=151580 CPU time spent (ms)=4098800 Physical memory (bytes) snapshot=256365596672 Virtual memory (bytes) snapshot=538256474112 Total committed heap usage (bytes)=286838489088 File Input Format Counters Bytes Read=13416450296 rmr reduce calls=107 16/02/16 13:59:00 ERROR streaming.StreamJob: Job not successful!
I think its for the memory but i thinks its for the R program, because the same job from pig worked well, any suggestion??
Thanks
Created 02-17-2016 05:00 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Roberto Sancho did you look at my answer in your other question about adding a Combiner class and enabling compression from map stage as well as from reduce output.
mapreduce.map.output.compress mapreduce.map.output.compress.codec
Created 02-16-2016 07:49 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You are correct. I have seen this behavior and I had to play with mapreduce.task.io.sort.mb setting
You may have to test it few times with different values.
Created 02-16-2016 07:55 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi:
i cant put more than 2047 mb, and stil doesnt work.....
Created 02-16-2016 08:04 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Roberto Sancho reduce the value ..Don't increase it
Created 02-16-2016 08:06 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Roberto Sancho I had to reduce it to 64mb from 1024 to make it work. Link
Created 02-16-2016 08:10 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ok let me try and ill tell which parameter i used 🙂
Many thanks again.
Created 02-16-2016 08:35 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi:
Still in 500 mb doesnt work, but i dont understand why I need to down this memory, please could you explain me in detail??
Many thanks
Created 02-16-2016 08:46 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Roberto Sancho Try different values. I know its frustrating.
Created 02-16-2016 08:55 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Roberto Sancho Here is more hint for you http://stackoverflow.com/questions/29001702/why-yarn-java-heap-space-memory-error
Created 02-17-2016 04:28 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Roberto Sancho mapreduce.task.io.sort.mb
I recommend to open a support case if this is critical. FYI: I had the same issue and I ended up running several jobs with different values for the above entry