How to calculate the heap size for the DataNode?
- Labels: Apache Ambari, Apache Hadoop
Created 07-04-2017 06:31 PM
There are a lot of articles on NameNode heap calculation, but none on the DataNode.
1. How do you calculate the DataNode heap size?
2. How do you calculate the size of each object in the DataNode heap?
3. What does the metadata in the DataNode heap contain? It cannot be the same as the NameNode's (the DataNode does not hold replication details, etc.), and it should include metadata for the checksums it stores. So what does the DataNode's metadata look like, and how does it differ from the NameNode's?
Created 07-06-2017 01:18 PM
Great question, and unfortunately I don't think there is a well-agreed-upon formula or calculator out there, as "it depends" is so often the rule. One consideration is that the DataNode doesn't really know about the directory structure; it just stores (and copies, deletes, etc.) blocks as directed by the NameNode (often indirectly, since clients write the actual blocks). Additionally, the block-level checksums are stored on disk alongside the block files themselves, not kept in the heap.
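As a rough illustration only: since the DataNode tracks an in-memory entry per block replica it hosts, its heap demand scales mainly with replica count rather than with namespace size. The 1 GB-per-million-replicas ratio and the floor value in this sketch are assumptions for illustration, not an official Hadoop formula; tune against observed GC behavior.

```python
import math

def estimate_datanode_heap_gb(block_replicas, gb_per_million=1.0, floor_gb=1):
    """Rule-of-thumb DataNode heap estimate.

    Assumption: roughly 1 GB of heap per million block replicas on the
    node, with a minimum floor. This is a starting point for tuning,
    not a Hadoop-defined formula.
    """
    estimate = math.ceil(block_replicas / 1_000_000 * gb_per_million)
    return max(floor_gb, estimate)

# A node holding ~2.5 million replicas gets a 3 GB heap under this heuristic.
print(estimate_datanode_heap_gb(2_500_000))
# A lightly loaded node falls back to the floor.
print(estimate_datanode_heap_gb(100_000))
```

The replica count per node can be read from the NameNode web UI or `hdfs dfsadmin -report`; the resulting value would go into the DataNode's heap setting (e.g. via Ambari's DataNode maximum Java heap size).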
It looks like there's some good info in the following HCC questions that might be of help to you:
https://community.hortonworks.com/questions/64677/datanode-heapsize-computation.html
https://community.hortonworks.com/questions/45381/do-i-need-to-tune-java-heap-size.html
https://community.hortonworks.com/questions/78981/data-node-heap-size-warning.html
Good luck and happy Hadooping!
