05-28-2017 07:50 PM
I got the error below.
It seems we need to set the physical memory limit, but I can't find the configuration for it.
Where can I set the physical memory?
By the way, I was just running some simple insert queries in Hive,
and this error does not happen every time, so I want to understand the reason.
Diagnostic Messages for this Task:
Container [pid=51661,containerID=container_e50_1493005386967_25486_01_000243] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_e50_1493005386967_25486_01_000243 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 51661 51659 51661 51661 (bash) 0 0 108662784 305 /bin/bash -c /usr/java/jdk1.8.0_101/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN.............
05-31-2017 10:29 AM - edited 05-31-2017 10:30 AM
The setting mapreduce.map.memory.mb will set the physical memory size of the container running the mapper (mapreduce.reduce.memory.mb will do the same for the reducer container).
Be sure that you adjust the heap value as well. In newer versions of YARN/MRv2, the setting mapreduce.job.heap.memory-mb.ratio can be used to have it auto-adjust. The default is 0.8, so 80% of the container size will be allocated as the heap. Otherwise, adjust it manually using the mapreduce.map.java.opts.max.heap and mapreduce.reduce.java.opts.max.heap settings.
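To make the 80% rule concrete, here is a small sketch of the arithmetic in Python. The function name heap_mb is made up for illustration, and rounding down to a whole megabyte is my assumption; the framework's own rounding may differ by a megabyte.

```python
def heap_mb(container_mb: int, ratio: float = 0.8) -> int:
    """Return the JVM heap size in MB for a container of container_mb.

    ratio mirrors the 0.8 default described for
    mapreduce.job.heap.memory-mb.ratio. Rounding down is an assumption
    made here so the heap never exceeds the intended share.
    """
    if container_mb <= 0:
        raise ValueError("container_mb must be positive")
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return int(container_mb * ratio)


if __name__ == "__main__":
    # Show the heap and the matching -Xmx flag for a few container sizes.
    for container_mb in (1024, 2048, 4096):
        heap = heap_mb(container_mb)
        print(f"container={container_mb} MB -> heap={heap} MB (-Xmx{heap}m)")
```

The remaining 20% or so is headroom for non-heap memory (thread stacks, metaspace, native buffers). A heap set too close to the container size is one way a task can end up "running beyond physical memory limits".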
BTW, I believe that 1 GB is the default, and it is quite low. I recommend reading the link below. It provides a good understanding of the YARN and MR memory settings, how they relate, and how to set some baseline settings based on the cluster node size (disk, memory, and cores).
12-21-2018 12:55 AM
I can't see the relationship between yarn.scheduler.minimum-allocation-mb and the error that is reported.
According to the Hive documentation, yarn.scheduler.minimum-allocation-mb is the "container memory minimum". But in this case the container is running out of memory, so it would make more sense to increase the "maximum-allocation" instead.
Anyway, as was already answered, increasing "mapreduce.map.memory.mb" and "mapreduce.reduce.memory.mb" should work, as those parameters control how much memory is available to the MapReduce tasks that Hive runs.
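For anyone who wants to try this per query rather than cluster-wide, below is a rough Python sketch that prints Hive session SET statements for the two parameters, plus matching heap options. The helper name hive_memory_settings, the 2048 MB example value, and the 0.8 heap ratio are assumptions for illustration, not recommended values.

```python
def hive_memory_settings(map_mb: int, reduce_mb: int, ratio: float = 0.8) -> list[str]:
    """Build Hive session SET statements for MapReduce container memory.

    The -Xmx values are derived as ratio * container size, rounded down.
    That ratio is an assumption here; adjust it for your workload.
    """
    if map_mb <= 0 or reduce_mb <= 0:
        raise ValueError("container sizes must be positive")
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return [
        f"SET mapreduce.map.memory.mb={map_mb};",
        f"SET mapreduce.map.java.opts=-Xmx{int(map_mb * ratio)}m;",
        f"SET mapreduce.reduce.memory.mb={reduce_mb};",
        f"SET mapreduce.reduce.java.opts=-Xmx{int(reduce_mb * ratio)}m;",
    ]


if __name__ == "__main__":
    # Example: double the 1 GB containers from the error message.
    for line in hive_memory_settings(2048, 2048):
        print(line)
```

Paste the printed lines at the top of the Hive session before the insert query. Note that setting mapreduce.map.java.opts this way replaces the whole option string, so any other JVM flags your cluster passes by default would need to be added back.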