<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Rack awareness during HDFS replication in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Rack-awareness-during-HDFS-replication/m-p/300529#M220278</link>
    <description>&lt;P&gt;I was afraid of that. Yes, I am using distcp for the migration. Thanks very much for your reply nevertheless. The bandwidth option would be a last resort, but it will probably have to do.&lt;/P&gt;</description>
    <pubDate>Wed, 29 Jul 2020 10:20:03 GMT</pubDate>
    <dc:creator>classypie</dc:creator>
    <dc:date>2020-07-29T10:20:03Z</dc:date>
    <item>
      <title>Rack awareness during HDFS replication</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Rack-awareness-during-HDFS-replication/m-p/300007#M219988</link>
      <description>&lt;P&gt;We are building a new Cloudera cluster and replicating the HDFS data from an existing cluster. The existing cluster spans two sites, and rack awareness is configured accordingly, with a default replication factor of 3.&lt;/P&gt;&lt;P&gt;If we build the new cluster at one of these two sites, is it possible to ensure that HDFS replicates from the same physical location and not from the other site? Background: we don't want to cause heavy network load between the two sites when all the data is already available locally.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jul 2020 11:23:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Rack-awareness-during-HDFS-replication/m-p/300007#M219988</guid>
      <dc:creator>classypie</dc:creator>
      <dc:date>2020-07-20T11:23:25Z</dc:date>
    </item>
    <item>
      <title>Re: Rack awareness during HDFS replication</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Rack-awareness-during-HDFS-replication/m-p/300026#M219998</link>
      <description>&lt;P&gt;Are you using &lt;EM&gt;distcp&lt;/EM&gt; for the migration? If your requirement is to reduce the load on the network and you are OK with the migration taking longer, the &lt;A href="https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html" target="_self"&gt;&lt;SPAN&gt;-bandwidth&lt;/SPAN&gt; option in distcp&lt;/A&gt; can help. It sets the maximum bandwidth each map operation can use, in MB/s, so you would first need to estimate the number of map operations that will run. Otherwise, I'm not aware of any rack-aware HDFS migration approach.&lt;/P&gt;</description>
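      <!--
      The estimate mentioned in the reply above can be sketched as follows. This is a minimal illustration, not a tested migration recipe: it assumes you have already chosen an aggregate cap for the inter-site link and a maximum map count (passed to distcp with -m), and it divides the cap evenly across maps to obtain the per-map -bandwidth value in MB/s. The function name, the NameNode host names, and the numbers are hypothetical.

      ```python
      import math
      import shlex


      def distcp_command(total_cap_mb_s, max_maps, src, dst):
          """Return (per-map MB/s, command string) for a capped distcp run.

          Assumption: aggregate throughput is at most max_maps * per-map
          bandwidth, so dividing the cap by max_maps keeps the job under it.
          """
          if max_maps < 1:
              raise ValueError("max_maps must be at least 1")
          # Round down to a whole number of MB/s so the cap is never exceeded.
          per_map = math.floor(total_cap_mb_s / max_maps)
          if per_map < 1:
              raise ValueError("cap too small for this many maps; lower max_maps")
          args = ["hadoop", "distcp",
                  "-m", str(max_maps),          # maximum simultaneous copies
                  "-bandwidth", str(per_map),   # MB/s allowed per map
                  src, dst]
          return per_map, " ".join(shlex.quote(a) for a in args)


      # Hypothetical example: cap the whole copy at about 200 MB/s with 20 maps.
      per_map, cmd = distcp_command(200, 20,
                                    "hdfs://old-nn:8020/data",
                                    "hdfs://new-nn:8020/data")
      print(per_map)
      print(cmd)
      ```

      Because -m is only an upper bound on the number of maps, the job may use fewer, in which case the real aggregate rate is lower than the cap. This throttles the copy; it does not control which replica or site the blocks are read from.
      -->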
      <pubDate>Mon, 20 Jul 2020 16:28:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Rack-awareness-during-HDFS-replication/m-p/300026#M219998</guid>
      <dc:creator>aakulov</dc:creator>
      <dc:date>2020-07-20T16:28:09Z</dc:date>
    </item>
    <item>
      <title>Re: Rack awareness during HDFS replication</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Rack-awareness-during-HDFS-replication/m-p/300529#M220278</link>
      <description>&lt;P&gt;I was afraid of that. Yes, I am using distcp for the migration. Thanks very much for your reply nevertheless. The bandwidth option would be a last resort, but it will probably have to do.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jul 2020 10:20:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Rack-awareness-during-HDFS-replication/m-p/300529#M220278</guid>
      <dc:creator>classypie</dc:creator>
      <dc:date>2020-07-29T10:20:03Z</dc:date>
    </item>
  </channel>
</rss>

