YARN recommended configuration

Rising Star

Hi. I have a question regarding the hdp-conf-utils script and Ambari recommendations. I installed 8 NodeManagers on my cluster. Node hardware spec: 4 cores, 15 GB RAM, 4 disks.

I ran the hdp-conf-utils.py script and got the following output:

Using cores=4 memory=15GB disks=4 hbase=False
 Profile: cores=4 memory=14336MB reserved=1GB usableMem=14GB disks=4
 Num Container=8
 Container Ram=1792MB
 Used Ram=14GB
 Unused Ram=1GB
 ***** mapred-site.xml *****
 mapreduce.map.memory.mb=1792
 mapreduce.map.java.opts=-Xmx1280m
 mapreduce.reduce.memory.mb=3584
 mapreduce.reduce.java.opts=-Xmx2560m
 mapreduce.task.io.sort.mb=640
 ***** yarn-site.xml *****
 yarn.scheduler.minimum-allocation-mb=1792
 yarn.scheduler.maximum-allocation-mb=14336
 yarn.nodemanager.resource.memory-mb=14336
 yarn.app.mapreduce.am.resource.mb=1792
 yarn.app.mapreduce.am.command-opts=-Xmx1280m
 ***** tez-site.xml *****
 tez.am.resource.memory.mb=3584
 tez.am.java.opts=-Xmx2560m
 ***** hive-site.xml *****
 hive.tez.container.size=1792
 hive.tez.java.opts=-Xmx1280m
 hive.auto.convert.join.noconditionaltask.size=402653000
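
The figures in this output can be cross-checked against each other. The sketch below only copies numbers from the output above and verifies the relationships visible in them (container count times container size, the reduce-to-map ratio, and the heap-to-container ratio). It does not reproduce the script's internal formulas, and the variable and function names are illustrative, not taken from hdp-conf-utils.py.

```python
# Values copied from the hdp-conf-utils output above (all in MB).
usable_mem_mb = 14336        # yarn.nodemanager.resource.memory-mb
num_containers = 8           # "Num Container"
container_ram_mb = 1792      # "Container Ram"
map_memory_mb = 1792         # mapreduce.map.memory.mb
reduce_memory_mb = 3584      # mapreduce.reduce.memory.mb
map_heap_mb = 1280           # -Xmx in mapreduce.map.java.opts
reduce_heap_mb = 2560        # -Xmx in mapreduce.reduce.java.opts


def summarize():
    """Return the ratios that can be read off the reported numbers."""
    return {
        # Usable memory split evenly across the reported container count.
        "ram_per_container_mb": usable_mem_mb // num_containers,
        # Reduce containers are sized relative to map containers.
        "reduce_to_map_ratio": reduce_memory_mb / map_memory_mb,
        # JVM heap as a fraction of the container it runs in.
        "map_heap_fraction": round(map_heap_mb / map_memory_mb, 2),
        "reduce_heap_fraction": round(reduce_heap_mb / reduce_memory_mb, 2),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

In these numbers, a reduce container is twice the size of a map container, and each JVM heap is a bit over 70% of its container, which leaves room for non-heap memory inside the container limit.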

I wanted to apply these recommendations to YARN, but Ambari recommends something else:

yarn.nodemanager.resource.memory-mb=5120
yarn.scheduler.minimum-allocation-mb=512
yarn.scheduler.maximum-allocation-mb=5120

Can anyone explain why Ambari and hdp-conf-utils recommend different values? I would be really grateful.
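
To make the size of the gap concrete, here is a minimal sketch that compares the two sets of values from the question. It assumes 15 GB means 15360 MB per node and uses the 8 NodeManagers mentioned above. The "containers per node" figure is only an upper bound for minimum-sized containers and ignores CPU (vcore) limits. The dictionary keys and function name are illustrative.

```python
NODE_MANAGERS = 8            # from the question
NODE_RAM_MB = 15 * 1024      # assumption: 15 GB per node = 15360 MB

# yarn-site.xml values quoted in the question (all in MB).
profiles = {
    "hdp-conf-utils": {"node_mb": 14336, "min_mb": 1792, "max_mb": 14336},
    "ambari": {"node_mb": 5120, "min_mb": 512, "max_mb": 5120},
}


def capacity(profile, nodes=NODE_MANAGERS, node_ram_mb=NODE_RAM_MB):
    """Derive simple capacity figures from one set of YARN memory settings."""
    return {
        # Upper bound: how many minimum-sized containers fit on one node.
        "containers_per_node": profile["node_mb"] // profile["min_mb"],
        # Total memory YARN may hand out across the cluster.
        "cluster_mb": profile["node_mb"] * nodes,
        # Share of physical RAM that is given to YARN on each node.
        "ram_share": round(profile["node_mb"] / node_ram_mb, 2),
    }


if __name__ == "__main__":
    for name, profile in profiles.items():
        print(name, capacity(profile))
```

Under these assumptions, the hdp-conf-utils values give YARN about 93% of each node's RAM (114688 MB across the cluster), while the Ambari values give it about a third (40960 MB), but in smaller 512 MB allocation steps.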

1 ACCEPTED SOLUTION

Super Guru

@Mateusz Grabowski

Those recommendations are based on SmartSense analysis of your cluster. SmartSense uses machine learning and data analysis from hundreds of clusters, and suggests optimizations based on what it has seen on those clusters in the past. If you are sure about your settings, know your workload, and know what you are doing, then you should go with your own settings. Here is a short article that explains how SmartSense tunes a cluster and optimizes its hardware for best use.

https://hortonworks.com/blog/case-study-2x-hadoop-performance-with-hortonworks-smartsense-webinar-re...

Rising Star
@mqureshi

But the problem is that I don't have a subscription, so I don't have access to SmartSense. What should I do in that case?

Super Guru

@Mateusz Grabowski

Check the list of installed components; I am sure SmartSense is still installed. Hortonworks will not be collecting data, but SmartSense is still doing its job.