
Physical memory limits

Master Collaborator

Hi:

I am running a job from RStudio and I get this error:

16/02/16 13:01:30 INFO mapreduce.Job:  map 100% reduce 24%
16/02/16 13:12:22 INFO mapreduce.Job:  map 100% reduce 100%
16/02/16 13:12:22 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_0, Status : FAILED
Container [pid=18361,containerID=container_e24_1455198426748_0476_01_000499] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.9 GB of 14.7 GB virtual memory used. Killing container.
Dump of the process-tree for container_e24_1455198426748_0476_01_000499 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 18377 18361 18361 18361 (java) 3320 733 8102256640 609337 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_0 26388279067123 
	|- 19618 18377 18361 18361 (R) 96691 1583 5403787264 1249728 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd 
	|- 19629 19618 18361 18361 (cat) 0 0 103407616 166 cat 
	|- 18361 18359 18361 18361 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_0 26388279067123 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/stderr  
	|- 19627 19618 18361 18361 (cat) 1 48 103407616 174 cat 


Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143


16/02/16 13:12:23 INFO mapreduce.Job:  map 100% reduce 0%
16/02/16 13:12:34 INFO mapreduce.Job:  map 100% reduce 15%
16/02/16 13:12:37 INFO mapreduce.Job:  map 100% reduce 21%
16/02/16 13:12:40 INFO mapreduce.Job:  map 100% reduce 24%
16/02/16 13:28:26 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_1, Status : FAILED
Container [pid=21694,containerID=container_e24_1455198426748_0476_01_001310] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.6 GB of 14.7 GB virtual memory used. Killing container.
Dump of the process-tree for container_e24_1455198426748_0476_01_001310 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 21694 21692 21694 21694 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_1 26388279067934 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/stderr  
	|- 21781 21704 21694 21694 (R) 93564 1394 5118803968 1185913 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd 
	|- 21807 21781 21694 21694 (cat) 0 43 103407616 173 cat 
	|- 21704 21694 21694 21694 (java) 2526 787 8089718784 664117 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_1 26388279067934 
	|- 21810 21781 21694 21694 (cat) 0 0 103407616 166 cat 


Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143


16/02/16 13:28:27 INFO mapreduce.Job:  map 100% reduce 0%
16/02/16 13:28:38 INFO mapreduce.Job:  map 100% reduce 16%
16/02/16 13:28:41 INFO mapreduce.Job:  map 100% reduce 20%
16/02/16 13:28:44 INFO mapreduce.Job:  map 100% reduce 24%
16/02/16 13:46:02 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_2, Status : FAILED
Container [pid=23643,containerID=container_e24_1455198426748_0476_01_001311] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.8 GB of 14.7 GB virtual memory used. Killing container.
Dump of the process-tree for container_e24_1455198426748_0476_01_001311 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 23737 23729 23643 23643 (cat) 0 44 103407616 174 cat 
	|- 23738 23729 23643 23643 (cat) 0 0 103407616 166 cat 
	|- 23729 23653 23643 23643 (R) 101777 1652 5376724992 1248882 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd 
	|- 23653 23643 23643 23643 (java) 2328 784 8079331328 617129 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_2 26388279067935 
	|- 23643 23641 23643 23643 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_2 26388279067935 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/stderr  


Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143


16/02/16 13:46:03 INFO mapreduce.Job:  map 100% reduce 0%
16/02/16 13:46:15 INFO mapreduce.Job:  map 100% reduce 17%
16/02/16 13:46:18 INFO mapreduce.Job:  map 100% reduce 22%
16/02/16 13:46:21 INFO mapreduce.Job:  map 100% reduce 24%
16/02/16 13:59:00 INFO mapreduce.Job:  map 100% reduce 100%
16/02/16 13:59:00 INFO mapreduce.Job: Job job_1455198426748_0476 failed with state FAILED due to: Task failed task_1455198426748_0476_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1


16/02/16 13:59:00 INFO mapreduce.Job: Counters: 39
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=2064381938
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=13416462815
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=321
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters 
		Failed reduce tasks=4
		Launched map tasks=107
		Launched reduce tasks=4
		Data-local map tasks=107
		Total time spent by all maps in occupied slots (ms)=37720330
		Total time spent by all reduces in occupied slots (ms)=7956034
		Total time spent by all map tasks (ms)=18860165
		Total time spent by all reduce tasks (ms)=3978017
		Total vcore-seconds taken by all map tasks=18860165
		Total vcore-seconds taken by all reduce tasks=3978017
		Total megabyte-seconds taken by all map tasks=77251235840
		Total megabyte-seconds taken by all reduce tasks=28514425856
	Map-Reduce Framework
		Map input records=99256589
		Map output records=321
		Map output bytes=2050220619
		Map output materialized bytes=2050222738
		Input split bytes=12519
		Combine input records=321
		Combine output records=321
		Spilled Records=321
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=151580
		CPU time spent (ms)=4098800
		Physical memory (bytes) snapshot=256365596672
		Virtual memory (bytes) snapshot=538256474112
		Total committed heap usage (bytes)=286838489088
	File Input Format Counters 
		Bytes Read=13416450296
	rmr
		reduce calls=107
16/02/16 13:59:00 ERROR streaming.StreamJob: Job not successful!

I think it's a memory issue, but I suspect it comes from the R program, because the same job worked fine from Pig. Any suggestions?
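
For what it's worth, the process-tree dump above seems consistent with this reading. RSS there is reported in pages (presumably 4 KiB each on this platform), so a rough sketch of the arithmetic:

r_rss    <- 1249728 * 4096 / 2^30   # R child process RSS: ~4.8 GiB
java_rss <-  609337 * 4096 / 2^30   # YarnChild JVM RSS:   ~2.3 GiB
r_rss + java_rss                    # ~7.1 GiB -- matches "7.1 GB of 7 GB"

So the R reducer alone holds roughly two thirds of the container's physical memory.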

Thanks

1 ACCEPTED SOLUTION

Master Mentor

@Roberto Sancho did you look at my answer to your other question about adding a Combiner class and enabling compression for the map output as well as for the reduce output?

mapreduce.map.output.compress
mapreduce.map.output.compress.codec
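
A minimal sketch of how both suggestions could be applied from rmr2 (the two property names above are standard Hadoop settings; SnappyCodec, the input path, and the my.map/my.reduce functions are placeholders, not from this thread):

library(rmr2)

# Compress intermediate map output via generic -D properties:
rmr.options(backend.parameters = list(
  hadoop = list(
    D = "mapreduce.map.output.compress=true",
    D = "mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec")))

# rmr2 can reuse the reduce function as a combiner via combine = TRUE,
# which is only safe when the reduce is commutative and associative:
out <- mapreduce(input   = "/some/input",   # placeholder HDFS path
                 map     = my.map,          # placeholder map function
                 reduce  = my.reduce,       # placeholder reduce function
                 combine = TRUE)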


20 REPLIES

Master Collaborator

Hi:

After that, it still doesn't work: all the mappers finished correctly, but the reducer stops at 67%:

packageJobJar: [] [/usr/hdp/2.3.2.0-2950/hadoop-mapreduce/hadoop-streaming-2.7.1.2.3.2.0-2950.jar] /tmp/streamjob2854666121307172018.jar tmpDir=null
16/02/17 09:57:07 INFO impl.TimelineClientImpl: Timeline service address: http://lnxbig06.cajarural.gcr:8188/ws/v1/timeline/
16/02/17 09:57:07 INFO client.RMProxy: Connecting to ResourceManager at lnxbig05.cajarural.gcr/10.1.246.19:8050
16/02/17 09:57:08 INFO impl.TimelineClientImpl: Timeline service address: http://lnxbig06.cajarural.gcr:8188/ws/v1/timeline/
16/02/17 09:57:08 INFO client.RMProxy: Connecting to ResourceManager at lnxbig05.cajarural.gcr/10.1.246.19:8050
16/02/17 09:57:08 INFO mapred.FileInputFormat: Total input paths to process : 14
16/02/17 09:57:08 INFO net.NetworkTopology: Adding a new node: /default-rack/10.1.246.17:50010
16/02/17 09:57:08 INFO net.NetworkTopology: Adding a new node: /default-rack/10.1.246.18:50010
16/02/17 09:57:08 INFO net.NetworkTopology: Adding a new node: /default-rack/10.1.246.16:50010
16/02/17 09:57:08 INFO net.NetworkTopology: Adding a new node: /default-rack/10.1.246.20:50010
16/02/17 09:57:09 INFO mapreduce.JobSubmitter: number of splits:107
16/02/17 09:57:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1455692856660_0019
16/02/17 09:57:09 INFO impl.YarnClientImpl: Submitted application application_1455692856660_0019
16/02/17 09:57:09 INFO mapreduce.Job: The url to track the job: http://lnxbig05.cajarural.gcr:8088/proxy/application_1455692856660_0019/
16/02/17 09:57:09 INFO mapreduce.Job: Running job: job_1455692856660_0019
16/02/17 09:57:15 INFO mapreduce.Job: Job job_1455692856660_0019 running in uber mode : false
16/02/17 09:57:15 INFO mapreduce.Job:  map 0% reduce 0%
16/02/17 09:57:39 INFO mapreduce.Job:  map 1% reduce 0%
16/02/17 09:57:40 INFO mapreduce.Job:  map 2% reduce 0%
16/02/17 09:57:41 INFO mapreduce.Job:  map 3% reduce 0%
16/02/17 09:57:42 INFO mapreduce.Job:  map 5% reduce 0%
16/02/17 09:57:43 INFO mapreduce.Job:  map 6% reduce 0%
16/02/17 09:57:44 INFO mapreduce.Job:  map 8% reduce 0%
16/02/17 09:57:45 INFO mapreduce.Job:  map 13% reduce 0%
16/02/17 09:57:46 INFO mapreduce.Job:  map 15% reduce 0%
16/02/17 09:57:47 INFO mapreduce.Job:  map 17% reduce 0%
16/02/17 09:57:48 INFO mapreduce.Job:  map 20% reduce 0%
16/02/17 09:57:49 INFO mapreduce.Job:  map 22% reduce 0%
16/02/17 09:57:50 INFO mapreduce.Job:  map 24% reduce 0%
16/02/17 09:57:51 INFO mapreduce.Job:  map 27% reduce 0%
16/02/17 09:57:52 INFO mapreduce.Job:  map 29% reduce 0%
16/02/17 09:57:53 INFO mapreduce.Job:  map 31% reduce 0%
16/02/17 09:57:54 INFO mapreduce.Job:  map 34% reduce 0%
16/02/17 09:57:55 INFO mapreduce.Job:  map 35% reduce 0%
16/02/17 09:57:56 INFO mapreduce.Job:  map 37% reduce 0%
16/02/17 09:57:57 INFO mapreduce.Job:  map 41% reduce 0%
16/02/17 09:57:58 INFO mapreduce.Job:  map 44% reduce 0%
16/02/17 09:57:59 INFO mapreduce.Job:  map 45% reduce 0%
16/02/17 09:58:00 INFO mapreduce.Job:  map 48% reduce 0%
16/02/17 09:58:01 INFO mapreduce.Job:  map 51% reduce 0%
16/02/17 09:58:03 INFO mapreduce.Job:  map 54% reduce 0%
16/02/17 09:58:04 INFO mapreduce.Job:  map 56% reduce 0%
16/02/17 09:58:06 INFO mapreduce.Job:  map 57% reduce 0%
16/02/17 09:58:07 INFO mapreduce.Job:  map 58% reduce 0%
16/02/17 10:01:16 INFO mapreduce.Job:  map 59% reduce 0%
16/02/17 10:01:21 INFO mapreduce.Job:  map 60% reduce 0%
16/02/17 10:01:24 INFO mapreduce.Job:  map 61% reduce 0%
16/02/17 10:01:28 INFO mapreduce.Job:  map 62% reduce 0%
16/02/17 10:01:31 INFO mapreduce.Job:  map 63% reduce 0%
16/02/17 10:01:32 INFO mapreduce.Job:  map 64% reduce 0%
16/02/17 10:01:33 INFO mapreduce.Job:  map 65% reduce 0%
16/02/17 10:01:35 INFO mapreduce.Job:  map 66% reduce 0%
16/02/17 10:01:37 INFO mapreduce.Job:  map 67% reduce 0%
16/02/17 10:01:38 INFO mapreduce.Job:  map 68% reduce 0%
16/02/17 10:01:40 INFO mapreduce.Job:  map 69% reduce 0%
16/02/17 10:01:42 INFO mapreduce.Job:  map 70% reduce 8%
16/02/17 10:01:43 INFO mapreduce.Job:  map 71% reduce 8%
16/02/17 10:01:44 INFO mapreduce.Job:  map 73% reduce 8%
16/02/17 10:01:46 INFO mapreduce.Job:  map 73% reduce 9%
16/02/17 10:01:47 INFO mapreduce.Job:  map 74% reduce 9%
16/02/17 10:01:48 INFO mapreduce.Job:  map 75% reduce 9%
16/02/17 10:01:49 INFO mapreduce.Job:  map 76% reduce 10%
16/02/17 10:01:50 INFO mapreduce.Job:  map 77% reduce 10%
16/02/17 10:01:51 INFO mapreduce.Job:  map 78% reduce 10%
16/02/17 10:01:52 INFO mapreduce.Job:  map 80% reduce 12%
16/02/17 10:01:53 INFO mapreduce.Job:  map 82% reduce 12%
16/02/17 10:01:55 INFO mapreduce.Job:  map 84% reduce 15%
16/02/17 10:01:56 INFO mapreduce.Job:  map 86% reduce 15%
16/02/17 10:01:57 INFO mapreduce.Job:  map 88% reduce 15%
16/02/17 10:01:58 INFO mapreduce.Job:  map 90% reduce 21%
16/02/17 10:01:59 INFO mapreduce.Job:  map 92% reduce 21%
16/02/17 10:02:00 INFO mapreduce.Job:  map 93% reduce 21%
16/02/17 10:02:01 INFO mapreduce.Job:  map 93% reduce 25%
16/02/17 10:02:02 INFO mapreduce.Job:  map 94% reduce 25%
16/02/17 10:02:03 INFO mapreduce.Job:  map 96% reduce 25%
16/02/17 10:02:04 INFO mapreduce.Job:  map 96% reduce 29%
16/02/17 10:02:06 INFO mapreduce.Job:  map 97% reduce 29%
16/02/17 10:02:07 INFO mapreduce.Job:  map 97% reduce 30%
16/02/17 10:02:19 INFO mapreduce.Job:  map 97% reduce 31%
16/02/17 10:02:25 INFO mapreduce.Job:  map 98% reduce 31%
16/02/17 10:02:37 INFO mapreduce.Job:  map 98% reduce 32%
16/02/17 10:02:38 INFO mapreduce.Job:  map 99% reduce 32%
16/02/17 10:02:52 INFO mapreduce.Job:  map 99% reduce 33%
16/02/17 10:02:56 INFO mapreduce.Job:  map 100% reduce 33%
16/02/17 10:03:01 INFO mapreduce.Job:  map 100% reduce 67%

Master Collaborator

I also get this error:

16/02/17 00:09:02 INFO mapreduce.Job:  map 100% reduce 0%
16/02/17 00:09:13 INFO mapreduce.Job:  map 100% reduce 67%
16/02/17 00:18:53 INFO mapreduce.Job: Task Id : attempt_1455662313758_0003_r_000000_2, Status : FAILED
Container [pid=24748,containerID=container_e34_1455662313758_0003_01_000111] is running beyond physical memory limits. Current usage: 3.5 GB of 3.5 GB physical memory used; 10.7 GB of 14 GB virtual memory used. Killing container.
Dump of the process-tree for container_e34_1455662313758_0003_01_000111 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 24748 24745 24748 24748 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455662313758_0003/container_e34_1455662313758_0003_01_000111/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455662313758_0003/container_e34_1455662313758_0003_01_000111 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.20 52488 attempt_1455662313758_0003_r_000000_2 37383395344495 1>/hadoop/yarn/log/application_1455662313758_0003/container_e34_1455662313758_0003_01_000111/stdout 2>/hadoop/yarn/log/application_1455662313758_0003/container_e34_1455662313758_0003_01_000111/stderr  
	|- 24847 24835 24748 24748 (cat) 0 22 103407616 173 cat 
	|- 24758 24748 24748 24748 (java) 1124 305 8068534272 227250 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455662313758_0003/container_e34_1455662313758_0003_01_000111/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455662313758_0003/container_e34_1455662313758_0003_01_000111 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.20 52488 attempt_1455662313758_0003_r_000000_2 37383395344495 
	|- 24835 24758 24748 24748 (R) 57497 677 3131715584 700751 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-reduce5178597eff5e 
	|- 24851 24835 24748 24748 (cat) 0 0 103407616 165 cat 
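
Note that this dump still shows -Xmx5734m while the container is now capped at 3.5 GB, so the JVM heap setting apparently was not lowered together with the container size. A sketch of keeping the two aligned (the ~80% ratio is a common rule of thumb, and the values are illustrative only):

rmr.options(backend.parameters = list(
  hadoop = list(
    D = "mapreduce.reduce.memory.mb=3584",          # container size in MB (3.5 GB)
    D = "mapreduce.reduce.java.opts=-Xmx2867m")))   # heap ~80% of the container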

Master Collaborator
Also, how can I set this parameter?

16/02/17 09:57:09 INFO mapreduce.JobSubmitter: number of splits:107

Thanks a lot.


Master Mentor

@Roberto Sancho What parameter do you want to set?

Master Mentor
@Roberto Sancho

You do not control the number of splits; the API handles that. Your number of splits is determined by the number of blocks in your dataset and the block size.
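
As a rough sanity check against the job counters above, and assuming the default 128 MB HDFS block size:

total_bytes <- 13416450296          # "Bytes Read" from the job counters
block_bytes <- 128 * 1024^2
ceiling(total_bytes / block_bytes)  # ~100 splits from whole blocks

Splitting is done per file, so each of the 14 input files contributes its own final (partial-block) split, which plausibly accounts for the reported "number of splits:107".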

Master Collaborator

Hi:

After the execution I saw this graphic, which I think looks normal? I don't understand why the error appears minutes after this point:

16/02/17 10:02:52 INFO mapreduce.Job:  map 99% reduce 33%
16/02/17 10:02:56 INFO mapreduce.Job:  map 100% reduce 33%
16/02/17 10:03:01 INFO mapreduce.Job:  map 100% reduce 67%

2213-error.png

Master Collaborator

Hi:

I changed some parameters and now the job finished after 32 minutes, but I still don't know why the reducer takes such a long time from 96% to 100%. Look:

16/02/17 17:46:32 INFO mapreduce.Job: Running job: job_1455727501370_0001
16/02/17 17:46:39 INFO mapreduce.Job: Job job_1455727501370_0001 running in uber mode : false
16/02/17 17:46:39 INFO mapreduce.Job:  map 0% reduce 0%
.
16/02/17 17:53:29 INFO mapreduce.Job:  map 100% reduce 92%
16/02/17 17:53:31 INFO mapreduce.Job:  map 100% reduce 93%
16/02/17 17:53:46 INFO mapreduce.Job:  map 100% reduce 96%
"and now after 30 minute will finifhed"

The parameters I changed are:

mapreduce.job.reduce.slowstart.completedmaps=0.8
mapreduce.reduce.shuffle.parallelcopies
mapreduce.reduce.shuffle.input.buffer.percent
mapreduce.reduce.shuffle.merge.percent


  
From RStudio:

rmr.options(backend.parameters = list(
  hadoop = list(D = "mapreduce.map.memory.mb=4096",
                D = "mapreduce.job.reduces=7",
                D = "mapreduce.reduce.memory.mb=5120")))

Are there any more parameters that could help me?

Thanks


Master Collaborator

Hi, yes, I used the compressed map output (I forgot to mention it), but I still didn't use a Combiner class. I'll try it and let you know.

Many many thanks.