Hi All,
We have a total of 3.5 TB of RAM, and we are facing a memory utilization problem.
Problem description:
We have 5 HQL scripts (each with multiple subqueries) running in parallel.
One of the queries is occupying 3.3 TB of memory, while the rest sit idle in the queue and take a long time to complete.
We need a recommendation or tuning so that each query gets an equal share of memory.
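For example, we are wondering whether routing each HQL script to its own YARN queue would spread the memory out, along these lines (a sketch only; the queue names are placeholders and the queues themselves would need to be defined in the YARN scheduler configuration):

-- In script 1:
SET mapreduce.job.queuename=etl_q1;
-- In script 2:
SET mapreduce.job.queuename=etl_q2;
-- ... and so on for the remaining scripts.

Is that the right direction, or is there a better approach?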
Parameters set in our queries:
SET hive.exec.compress.output=true;
SET hive.exec.compress.intermediate=true;
SET mapred.output.compress=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET io.seqfile.compression.type=BLOCK;
SET io.sort.mb=500;
SET dfs.block.size=536870912;
SET io.file.buffer.size=131072;
SET mapred.compress.map.output=true;
SET mapred.output.compression.type=BLOCK;
SET hive.auto.convert.join=true;
SET mapreduce.map.memory.mb=12288;
SET mapreduce.map.java.opts=-Xmx9831m;
SET mapreduce.reduce.java.opts=-Xmx8192m;
SET mapreduce.reduce.memory.mb=10240;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.allow-drop-table=true;
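We have also considered capping how many containers a single query can hold at once, for example (assuming a Hadoop version that supports these job limits; the numbers below are only placeholders):

SET mapreduce.job.running.map.limit=200;
SET mapreduce.job.running.reduce.limit=50;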
A screenshot (memory.jpg) is attached for reference as well.
Quick help is appreciated!