How can we check how many NameNode handlers are in use in a Cloudera cluster?
- Labels: Apache Hadoop
Created on 10-23-2019 08:28 AM - last edited on 10-23-2019 09:38 AM by cjervis
We have set the handler count to 140 in our environment. Is there a chart or any other way to see how many NameNode handlers are actually in use?
Created 10-23-2019 09:58 PM
The handler count is the number of threads responsible for processing requests on the Hadoop RPC server.
Best practice is to set the RPC handler count property dfs.namenode.handler.count to 20*log2(number of DataNodes), with an upper limit of 200.
If you have 12 DataNodes, the formula gives 20*log2(12) ≈ 72, so a value around 80 is reasonable (dfs.namenode.handler.count = 80); at most, you can increase it to 100.
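As a quick check of the arithmetic, the rule of thumb above can be evaluated directly. This is only a sketch: the function name and its parameters are made up for illustration, and rounding to the nearest integer is an assumption, since the rule itself does not say how to round.

```python
import math


def recommended_handler_count(num_datanodes: int,
                              factor: int = 20,
                              upper_limit: int = 200) -> int:
    """Return factor * log2(num_datanodes), rounded and capped at upper_limit.

    Hypothetical helper: it only mirrors the rule of thumb quoted in this
    thread and is not part of Hadoop or Cloudera Manager.
    """
    if num_datanodes < 1:
        raise ValueError("num_datanodes must be at least 1")
    raw = factor * math.log2(num_datanodes)
    return min(upper_limit, round(raw))


if __name__ == "__main__":
    for n in (12, 16, 2048):
        print(n, "DataNodes ->", recommended_handler_count(n))
    # 12 DataNodes -> 72
    # 16 DataNodes -> 80
    # 2048 DataNodes -> 200 (capped)
```

For 12 DataNodes this evaluates to about 72, and 16 DataNodes gives exactly 80.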
If you have specified 140, the NameNode starts 140 handler threads, and all of them are available to process requests.
You can verify this by taking a thread dump of the NameNode process and checking the total number of handlers and what they are doing.
You can take a thread dump by running $ kill -3 <namenode_pid>
This writes a thread dump of the NameNode process to the .out file under /var/log/hadoop/hdfs (the NameNode log directory).
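Once the dump is in the .out file, counting the handlers can be scripted. The sketch below assumes the handler threads are named like "IPC Server handler 12 on 8020" and that the dump uses the usual HotSpot layout (a quoted thread name followed by a java.lang.Thread.State line); check both against your own dump, since thread naming can differ between Hadoop versions. The sample text is made up purely to show the expected shape.

```python
import re
from collections import Counter

# Assumption: RPC handler threads are named "IPC Server handler <n> on <port>".
HANDLER_RE = re.compile(r'^"(IPC Server handler \d+ on [^"]*)"')
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+(\w+)")


def count_handler_states(dump_text: str) -> Counter:
    """Count handler threads in a thread dump, grouped by thread state."""
    states = Counter()
    in_handler = False
    for line in dump_text.splitlines():
        if line.startswith('"'):
            # A quoted name starts a new thread entry.
            if in_handler:
                states["UNKNOWN"] += 1  # previous handler had no state line
            in_handler = bool(HANDLER_RE.match(line))
        elif in_handler:
            match = STATE_RE.search(line)
            if match:
                states[match.group(1)] += 1
                in_handler = False
    if in_handler:
        states["UNKNOWN"] += 1
    return states


# Illustrative sample only, not output from a real cluster.
SAMPLE = '''"IPC Server handler 0 on 8020" #60 daemon prio=5 tid=0x1 nid=0x2 waiting on condition
   java.lang.Thread.State: WAITING (parking)
"IPC Server handler 1 on 8020" #61 daemon prio=5 tid=0x4 nid=0x5 runnable
   java.lang.Thread.State: RUNNABLE
"main" #1 prio=5 tid=0x7 nid=0x8 runnable
   java.lang.Thread.State: RUNNABLE
'''

if __name__ == "__main__":
    counts = count_handler_states(SAMPLE)
    print("total handlers:", sum(counts.values()))
    for state, number in sorted(counts.items()):
        print(state, number)
```

To use it on a real dump, read the .out file into a string and pass it to count_handler_states. The total should match the configured handler count, and the per-state split gives a rough picture of how many handlers are busy versus waiting at the moment of the dump.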
Another method is to check the Grafana dashboard (if you are using Ambari) and look at the RPC queue metrics for the NameNode.
Created 10-31-2019 07:05 AM
Thanks for your response!
1. According to the Cloudera documentation, the formula for the NameNode handler count is:
dfs.namenode.service.handler.count and dfs.namenode.handler.count - For each NameNode, set to ln(number of DataNodes in this HDFS service) * 20.
So for your example of 12 DataNodes, it should be ln(12)*20 ≈ 50.
But you recommend the formula 20*log2(12). Can you cross-check and let me know which one is correct?
2. I have the dump, but can you walk me through how to check how many handlers are running?
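To make the difference in point 1 concrete, the two formulas can be put side by side. This is a small sketch with hypothetical function names; it only evaluates the two expressions quoted in this thread and does not say which one a given distribution recommends.

```python
import math


def handler_count_ln(num_datanodes: int) -> float:
    """20 * ln(n), the formula quoted from the Cloudera documentation."""
    return 20 * math.log(num_datanodes)


def handler_count_log2(num_datanodes: int) -> float:
    """20 * log2(n), the formula given in the earlier reply."""
    return 20 * math.log2(num_datanodes)


if __name__ == "__main__":
    for n in (12, 50, 200):
        print(f"{n:>4} DataNodes: ln -> {handler_count_ln(n):.1f}, "
              f"log2 -> {handler_count_log2(n):.1f}")
```

Because ln(n) = ln(2) * log2(n), the ln-based value is always about 69% of the log2-based one, which is why 12 DataNodes gives roughly 50 with one formula and roughly 72 with the other.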
