<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Hadoop Balancer in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164993#M53721</link>
    <description>&lt;P&gt;Are you asking about balancing HDFS so that data is evenly distributed across all nodes? In that case you need to run the HDFS balancer, and it will spread that node's data across the other nodes; anything else defeats the point of balancing HDFS. You can use -include -f hostsfile to restrict which nodes the balancer runs against, but you need more than one node in that list: the balancer is meant to work across multiple datanodes, not a single datanode.&lt;/P&gt;&lt;P&gt;&lt;A href="https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer" target="_blank"&gt;https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer&lt;/A&gt;&lt;/P&gt;&lt;P&gt;If your question is how to balance data across all disks on one node, you can use the disk balancer. Sadly, it's a new feature in Hadoop 3. &lt;A href="https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html" target="_blank"&gt;https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 08 Feb 2017 11:13:28 GMT</pubDate>
    <dc:creator>aervits</dc:creator>
    <dc:date>2017-02-08T11:13:28Z</dc:date>
    <item>
      <title>Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164992#M53720</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;How can we run the Hadoop balancer on a single node? In our present environment, one of the datanodes is not balanced (disk usage at 99%), while all the remaining datanodes are at 55%. We are using HDP version 2.3.4.&lt;/P&gt;&lt;P&gt;We want to balance only the single node that is at 99%.&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2017 10:31:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164992#M53720</guid>
      <dc:creator>reddyr211</dc:creator>
      <dc:date>2017-02-08T10:31:43Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164993#M53721</link>
      <description>&lt;P&gt;Are you asking about balancing HDFS so that data is evenly distributed across all nodes? In that case you need to run the HDFS balancer, and it will spread that node's data across the other nodes; anything else defeats the point of balancing HDFS. You can use -include -f hostsfile to restrict which nodes the balancer runs against, but you need more than one node in that list: the balancer is meant to work across multiple datanodes, not a single datanode.&lt;/P&gt;&lt;P&gt;&lt;A href="https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer" target="_blank"&gt;https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer&lt;/A&gt;&lt;/P&gt;&lt;P&gt;If your question is how to balance data across all disks on one node, you can use the disk balancer. Sadly, it's a new feature in Hadoop 3. &lt;A href="https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html" target="_blank"&gt;https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2017 11:13:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164993#M53721</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2017-02-08T11:13:28Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164994#M53722</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We ran the HDFS balancer on one of the nodes in our environment, but instead of running on that single node it started on the entire cluster. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;&lt;P&gt;We ran the balancer on the single node by adding "-source -f HOSTS.TXT" to the command.&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2017 13:02:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164994#M53722</guid>
      <dc:creator>reddyr211</dc:creator>
      <dc:date>2017-02-08T13:02:49Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164995#M53723</link>
      <description>&lt;P&gt;The balancer is not for a single node. It is for balancing load across the cluster. For balancing load among different disks on the same node, a new disk balancer will be available in Hadoop version 3.0, as the link Artem shared shows. There is not much you can do here, other than "don't worry, Hadoop is smart enough to know that a particular disk doesn't have any more space, and it will find another disk." &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2017 13:06:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164995#M53723</guid>
      <dc:creator>mqureshi</dc:creator>
      <dc:date>2017-02-08T13:06:28Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164996#M53724</link>
      <description>&lt;P&gt;@Vijay&lt;/P&gt;&lt;P&gt;The HDFS balancer runs across all the datanodes to distribute blocks appropriately, and Hadoop is intelligent enough to shuffle the blocks in an organized manner. So the balancer is not limited to a single node; you can simply include and exclude nodes for balancing. If you run the balancer on one node by excluding all but that one, it is just like reorganizing blocks on the same node, so your disk utilization will never go down.&lt;/P&gt;&lt;P&gt;In your case, just run the balancer across the cluster. It will reduce the disk utilization on the node that is 99% full.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2017 17:52:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164996#M53724</guid>
      <dc:creator>rajsyrus</dc:creator>
      <dc:date>2017-02-08T17:52:58Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164997#M53725</link>
      <description>&lt;P&gt;You should also use -include -f with more than one host, not just that single datanode. I thought I was clear on that.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2017 20:34:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164997#M53725</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2017-02-08T20:34:31Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164998#M53726</link>
      <description>&lt;P&gt;Thank you guys &lt;A rel="user" href="https://community.cloudera.com/users/10969/mqureshi.html" nodeid="10969"&gt;@mqureshi&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/3729/rajsyrus.html" nodeid="3729"&gt;@Rajendra M&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/393/aervits.html" nodeid="393"&gt;@Artem Ervits&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2017 00:29:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164998#M53726</guid>
      <dc:creator>reddyr211</dc:creator>
      <dc:date>2017-02-09T00:29:10Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164999#M53727</link>
      <description>&lt;P&gt;Absolutely, happy to help.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2017 00:53:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/164999#M53727</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2017-02-09T00:53:05Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/165000#M53728</link>
      <description>&lt;P&gt;You're welcome. Thank you.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Feb 2017 15:02:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/165000#M53728</guid>
      <dc:creator>rajsyrus</dc:creator>
      <dc:date>2017-02-13T15:02:45Z</dc:date>
    </item>
    <item>
      <title>Re: Hadoop Balancer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/282260#M53729</link>
      <description>&lt;P&gt;To run the balancer on only one datanode:&lt;/P&gt;&lt;P&gt;hdfs balancer -include -f &amp;lt;#datanode name can be specified&amp;gt;&lt;/P&gt;&lt;P&gt;This would balance the data load on that particular DN.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Nov 2019 18:09:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hadoop-Balancer/m-p/282260#M53729</guid>
      <dc:creator>SreeBalaje</dc:creator>
      <dc:date>2019-11-06T18:09:26Z</dc:date>
    </item>
  </channel>
</rss>

