Member since: 09-03-2014
Posts: 14
Kudos Received: 0
Solution: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5327 | 09-08-2014 06:21 AM |
05-06-2015 09:59 PM
Hi, The "clean" way is to decommission the node, which makes sure nothing is written to it anymore and no block is at risk. However, if all you want is to always keep at least 2 replicas of each block, then increasing the replication to 3, waiting for all the blocks to be fully replicated, and then just stopping the datanode before moving it will always leave you with the minimum of 2 replicas you are looking for. This will be way faster, but not as clean as doing the decommission. Also, if you move the datanode fast enough, most of the blocks on it will simply be re-used when it is reconnected. JM
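To make the replica counting above concrete, here is a minimal sketch in Python. It is a toy model, not HDFS code: the block IDs, the datanode names (`dn1` to `dn4`), and the helper `min_live_replicas` are all hypothetical and only illustrate why going to replication 3 first leaves at least 2 replicas when one datanode is stopped.

```python
# Toy model (not HDFS code): each block maps to the set of datanodes
# that hold one of its replicas. All names below are made up.

def min_live_replicas(block_locations, stopped_node):
    """Smallest replica count left on any block once stopped_node is offline."""
    return min(len(nodes - {stopped_node}) for nodes in block_locations.values())

# Hypothetical cluster, every block fully replicated at factor 3.
blocks_at_3 = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn1", "dn3", "dn4"},
    "blk_3": {"dn2", "dn3", "dn4"},
}

# The same blocks if the replication factor had stayed at 2.
blocks_at_2 = {
    "blk_1": {"dn1", "dn2"},
    "blk_2": {"dn1", "dn3"},
    "blk_3": {"dn2", "dn4"},
}

print(min_live_replicas(blocks_at_3, "dn1"))  # 2
print(min_live_replicas(blocks_at_2, "dn1"))  # 1
```

The model only holds once every block has really reached 3 replicas, which is why waiting for replication to finish before stopping the datanode matters.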
12-19-2014 11:34 AM
Hi Mugi, The docs I linked offer some guidance for things like max connections, which can help determine the sizing of your DB hardware. Usually the DB load from CM and monitoring is not extreme, but for 20+ node clusters we recommend moving all monitoring roles and databases onto a different host from the CM server to spread the load. How many can be shared on a host depends on cluster size. I'm frankly not an expert in this area, but others may be able to chime in. You can also of course get official support for your cluster and get reliable access to this kind of expertise (</shameless plug>). Thanks, Darren
09-08-2014 06:21 AM
Hi Gautam, Thanks for the quick response. As I told you before, we have a problem with the HDFS block count: we currently have 58 million HDFS blocks with 500 TB of storage occupied, which is not ideal. The NN host has about 64 GB of memory and we set 32 GB for the NN heap. Last time, we changed dfs.namenode.checkpoint.txns from the default 44K to 1000K at the same time as we changed dfs.image.compress, because we thought that checkpointing too often was what made the namenode service go bad, as CM reports by email. About your question, "JT pause only began when you turned on compression of fsimage?": we are not sure, because the MapReduce pause has never lasted as long as it does now, up to 10 minutes, so we did not notice whether it happened before. Is increasing the NN heap memory the only option, or are there other Hadoop parameters we can tune to reduce the load on HDFS and bring MapReduce back to normal during checkpoints? Regards, i9um0
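As a rough sanity check on the numbers in this post, the arithmetic can be sketched in Python. The block count, storage, and heap size come from the post; the rule of thumb of about 1 GB of NameNode heap per million blocks is an assumption brought in for illustration, not something stated in the thread, and it is also unknown here whether the 500 TB figure includes replicas.

```python
# Figures quoted in the post.
blocks = 58_000_000          # HDFS blocks
storage_tb = 500             # storage occupied, in TB (may or may not include replicas)
heap_gb = 32                 # configured NameNode heap

# ASSUMPTION (not from the post): ~1 GB of NameNode heap per million blocks.
GB_PER_MILLION_BLOCKS = 1.0

# Heap the rule of thumb would suggest, and how far the current heap falls short.
suggested_heap_gb = blocks / 1_000_000 * GB_PER_MILLION_BLOCKS
shortfall_gb = suggested_heap_gb - heap_gb

# Average storage per block in MB, using decimal units (1 TB = 1,000,000 MB).
avg_mb_per_block = storage_tb * 1_000_000 / blocks

print(f"suggested heap: {suggested_heap_gb:.0f} GB, shortfall: {shortfall_gb:.0f} GB")
print(f"average storage per block: {avg_mb_per_block:.1f} MB")
```

Under that assumption the 32 GB heap looks undersized for 58 million blocks, and the average of under 10 MB per block is well below typical HDFS block sizes of 64 MB or 128 MB, which points at many small files as a likely contributor to the block count.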