Created 07-15-2016 09:36 AM
for the NameNode, for DataNodes, and for YARN/Spark? Or are the defaults provided by Ambari suitable for production use?
Created 07-15-2016 10:00 AM
Tuning Java heap size depends entirely on your use case. Are you seeing any performance issues with your current heap configs?
Here is the Hortonworks recommendation for the NameNode: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_installing_manually_book/content/ref-809...
Created 07-17-2016 06:03 PM
Hi @Kartik Vashishta, I can answer for the HDFS services.
The NameNode heap size depends on the total number of file system objects that you have (files/blocks). The exact heap tuning recommendations are documented in the HDP manual install section (same link that @Sandeep Nemuri provided in another answer). I recommend checking that the Ambari configured values are in line with these recommendations since misconfigured heap settings affect NameNode performance significantly. Also the heap size requirements change with time as cluster usage grows.
The DataNode heap size requirement depends on the total number of blocks on each DataNode. The default 1 GB heap is insufficient for larger-capacity DataNodes; we now recommend a 4 GB heap for DataNodes, as Benjamin suggested.
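As a minimal sketch, outside of Ambari that 4 GB heap would typically be set in hadoop-env.sh (Ambari exposes the same knob as the DataNode maximum Java heap size setting; the exact values here are illustrative, not a one-size-fits-all recommendation):

```shell
# hadoop-env.sh fragment (illustrative values).
# Setting -Xms equal to -Xmx avoids heap resizing pauses on a long-running daemon.
export HADOOP_DATANODE_OPTS="-Xms4096m -Xmx4096m ${HADOOP_DATANODE_OPTS}"
```

If you manage the cluster with Ambari, change the value in the Ambari UI instead so it is not overwritten on the next config push.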
Ensuring you have GC logging enabled for your services is a good idea. There is an HCC article on NameNode heap tuning that goes into a lot more detail on related topics.
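For reference, GC logging on the NameNode can be enabled with the standard JDK 7/8 flags (current for HDP 2.x JVMs) appended to HADOOP_NAMENODE_OPTS; the log path below is an assumption, adjust it to your layout:

```shell
# hadoop-env.sh fragment (illustrative; /var/log/hadoop/hdfs is an assumed path).
# Timestamping the file name keeps one GC log per daemon restart.
export HADOOP_NAMENODE_OPTS="-verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps \
  -Xloggc:/var/log/hadoop/hdfs/gc.log-$(date +%Y%m%d%H%M) \
  ${HADOOP_NAMENODE_OPTS}"
```

The same flags work for the DataNode via HADOOP_DATANODE_OPTS.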
Created 07-16-2019 03:24 PM
@Arpit is right that you should do an actual calculation for the NameNode heap and keep it up to date as your data grows. I know this thread is about DataNodes, but since the NameNode came up multiple times, I just want to point out that Cloudera recommends 1 GB of heap per million files plus blocks as a good starting point. Once you reach many millions of files and blocks you can often lower that ratio, but start there.
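That rule of thumb is easy to script. A minimal sketch, assuming a made-up object count (substitute the real files-plus-blocks total from the NameNode web UI or fsck output):

```shell
# Illustrative numbers: e.g. 20M files + 30M blocks = 50M objects total.
objects=50000000

# Starting point per the recommendation above: ~1 GB of NameNode heap
# per million file system objects, rounded up to a whole GB.
heap_gb=$(( (objects + 999999) / 1000000 ))

echo "Suggested NameNode heap starting point: ${heap_gb} GB"
```

Re-run the calculation periodically; as the thread notes, the heap requirement grows with cluster usage.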