<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: What are the best practices and recommendations for adding more datanodes to the large clusters in production? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-and-recommendations-for-adding/m-p/94462#M7754</link>
    <description>&lt;P&gt;The HDFS Balancer can run in the background, and the bandwidth it consumes is configurable. On a large cluster it can run continuously, and running it after adding new nodes is a must for keeping the system healthy. Note that on large clusters a single convergence run can take a full day or more; that shouldn't scare you away, just let it run.&lt;/P&gt;&lt;P&gt;Also, some customers have reported a more stable experience when adding nodes in small batches of a few at a time rather than, for example, adding a full rack at once.&lt;/P&gt;</description>
    <pubDate>Tue, 29 Sep 2015 20:49:37 GMT</pubDate>
    <dc:creator>andrewg</dc:creator>
    <dc:date>2015-09-29T20:49:37Z</dc:date>
    <item>
      <title>What are the best practices and recommendations for adding more datanodes to the large clusters in production?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-and-recommendations-for-adding/m-p/94460#M7752</link>
      <description />
      <pubDate>Tue, 29 Sep 2015 08:51:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-and-recommendations-for-adding/m-p/94460#M7752</guid>
      <dc:creator>pardeep_kumar</dc:creator>
      <dc:date>2015-09-29T08:51:01Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices and recommendations for adding more datanodes to the large clusters in production?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-and-recommendations-for-adding/m-p/94461#M7753</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/139/pardeepkumar.html" nodeid="139"&gt;@pardeep.kumar@hortonworks.com&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Listing some of the ones I am aware of.&lt;/P&gt;&lt;P&gt;1). You can add the nodes either with Ambari Blueprints (https://cwiki.apache.org/confluence/display/AMBARI/Blueprints#Blueprints-AddingHoststoanExistingCluster) or through Ambari itself. Using a Blueprint is much easier.&lt;/P&gt;&lt;P&gt;2). After adding the data nodes, run the HDFS Balancer during a quiet period.&lt;/P&gt;&lt;P&gt;3). Adjust dfs.namenode.handler.count to ln(number of DNs) * 20.&lt;/P&gt;&lt;P&gt;4). Adjust dfs.namenode.service.handler.count to ln(number of DNs) * 20.&lt;/P&gt;&lt;P&gt;Here ln is the natural logarithm.&lt;/P&gt;&lt;P&gt;Others can add to or correct these recommendations.&lt;/P&gt;</description>
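The ln(number of DNs) * 20 rule of thumb from items 3 and 4 can be worked through as a minimal sketch. The function name, the rounding, and the floor of 10 are illustrative assumptions and not part of the original recommendation.

```python
import math


def suggested_handler_count(num_datanodes, factor=20, minimum=10):
    """Return ln(num_datanodes) * factor, rounded to the nearest integer.

    The floor `minimum` is an assumption added for this sketch so that very
    small clusters do not end up with a near-zero handler count.
    math.log raises ValueError for a non-positive node count.
    """
    return max(minimum, round(math.log(num_datanodes) * factor))


# Example: a cluster growing from 200 to 1000 DataNodes.
print(suggested_handler_count(200))   # 106
print(suggested_handler_count(1000))  # 138
```

Under this sketch the same value would be applied to both dfs.namenode.handler.count and the service handler setting, since the post gives the same formula for each.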
      <pubDate>Tue, 29 Sep 2015 16:25:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-and-recommendations-for-adding/m-p/94461#M7753</guid>
      <dc:creator>Jagatheeshr</dc:creator>
      <dc:date>2015-09-29T16:25:55Z</dc:date>
    </item>
    <item>
      <title>Re: What are the best practices and recommendations for adding more datanodes to the large clusters in production?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-and-recommendations-for-adding/m-p/94462#M7754</link>
      <description>&lt;P&gt;The HDFS Balancer can run in the background, and the bandwidth it consumes is configurable. On a large cluster it can run continuously, and running it after adding new nodes is a must for keeping the system healthy. Note that on large clusters a single convergence run can take a full day or more; that shouldn't scare you away, just let it run.&lt;/P&gt;&lt;P&gt;Also, some customers have reported a more stable experience when adding nodes in small batches of a few at a time rather than, for example, adding a full rack at once.&lt;/P&gt;</description>
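As a rough sketch of the background-balancing advice above, the helper below only builds the two command lines involved: one to cap the bandwidth each DataNode may spend on balancing, and one to start the Balancer. The helper name and the example values (100 MB/s, 10 percent threshold) are assumptions chosen for illustration; suitable values depend on the cluster.

```python
def balancer_commands(bandwidth_bytes_per_sec, threshold_percent=10):
    """Build (but do not run) the HDFS commands for a throttled balancer run.

    Both numeric values are caller-supplied assumptions, not recommendations
    from the original post.
    """
    # Cap the per-DataNode bandwidth used for block moves (bytes per second).
    set_bandwidth = [
        "hdfs", "dfsadmin", "-setBalancerBandwidth", str(bandwidth_bytes_per_sec),
    ]
    # Start the balancer; it stops once every DataNode's utilization is
    # within threshold_percent of the cluster average.
    run_balancer = ["hdfs", "balancer", "-threshold", str(threshold_percent)]
    return [set_bandwidth, run_balancer]


# Example: 100 MB/s cap, 10 percent threshold.
for cmd in balancer_commands(100 * 1024 * 1024, 10):
    print(" ".join(cmd))
```

The commands are returned as argument lists so they could be handed to subprocess.run on a host with the HDFS client installed; running them is left out of the sketch on purpose.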
      <pubDate>Tue, 29 Sep 2015 20:49:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-are-the-best-practices-and-recommendations-for-adding/m-p/94462#M7754</guid>
      <dc:creator>andrewg</dc:creator>
      <dc:date>2015-09-29T20:49:37Z</dc:date>
    </item>
  </channel>
</rss>

