<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to properly migrate data (HDFS+HBase) from an existing cluster to a new one in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215471#M177381</link>
    <description>&lt;P&gt;Hey &lt;A rel="user" href="https://community.cloudera.com/users/62367/jcastrilli.html" nodeid="62367"&gt;@Juan Castrilli&lt;/A&gt;, let the community know how you proceeded &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 12 Mar 2018 12:51:46 GMT</pubDate>
    <dc:creator>sanketplus</dc:creator>
    <dc:date>2018-03-12T12:51:46Z</dc:date>
    <item>
      <title>How to properly migrate data (HDFS+HBase) from an existing cluster to a new one</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215468#M177378</link>
      <description>&lt;P&gt;Hi, folks.&lt;BR /&gt;We have a Hadoop cluster with ~1.5 PB of data (i.e. ~1500 TB), running on bare metal with CDH 5.7 and without Cloudera Manager. We're planning to decommission the cluster and set up a new one from scratch (bare metal as well, not cloud), probably switching to Hortonworks (HDP) this time. We're also moving the whole datacenter where it's currently located, so the new cluster will be in a different location. The idea is to keep all the data (all 1.5 PB of it is relevant, so unfortunately we can't get rid of anything). Just to clarify, we're talking about HDFS data as well as HBase databases/tables.&lt;BR /&gt;&lt;BR /&gt;That being said, my question is:&lt;BR /&gt;Assuming we have our brand-new cluster set up and ready to ingest the data, what would be the best method to migrate all 1.5 PB to it? Needless to say, we need to keep downtime to a minimum while doing all this.&lt;BR /&gt;&lt;BR /&gt;Below are our current cluster's resources:&lt;BR /&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;2 NameNodes in HA --&amp;gt; 2.80GHz 6-core / 24GB RAM&lt;/LI&gt;&lt;/UL&gt;&lt;UL&gt;&lt;LI&gt;49 DataNodes:&lt;UL&gt;&lt;LI&gt;5 of them --&amp;gt; 2.4GHz 6-core / 72GB RAM&lt;/LI&gt;&lt;LI&gt;38 of them --&amp;gt; 2.3GHz 16-core / 128GB RAM&lt;/LI&gt;&lt;LI&gt;6 of them --&amp;gt; 2.4GHz 32-core / 128GB RAM&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;BR /&gt;Thanks in advance!&lt;/P&gt;</description>
      <pubDate>Thu, 08 Feb 2018 23:36:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215468#M177378</guid>
      <dc:creator>jcastrilli1</dc:creator>
      <dc:date>2018-02-08T23:36:00Z</dc:date>
    </item>
    <item>
      <title>Re: How to properly migrate data (HDFS+HBase) from an existing cluster to a new one</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215469#M177379</link>
      <description>&lt;P&gt;I don't want to oversimplify this process, or imply that Hortonworks Professional Services doesn't do these conversions with customers all the time (there is often more at play than simply moving the data, such as testing apps before &amp;amp; after the migration). That said, you can use DistCp, &lt;A href="https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html"&gt;https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html&lt;/A&gt;, as your tool to move the HDFS data from your original cluster to your new one.&lt;/P&gt;&lt;P&gt;For the HBase data, I'd look to its Snapshots feature, &lt;A href="http://hbase.apache.org/book.html#ops.snapshots" target="_blank"&gt;http://hbase.apache.org/book.html#ops.snapshots&lt;/A&gt;, including its ability to export a snapshot to another cluster, as a solid approach.&lt;/P&gt;&lt;P&gt;Good luck and happy Hadooping!&lt;/P&gt;</description>
      <pubDate>Wed, 14 Feb 2018 06:38:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215469#M177379</guid>
      <dc:creator>LesterMartin</dc:creator>
      <dc:date>2018-02-14T06:38:32Z</dc:date>
    </item>
    <item>
      <title>Re: How to properly migrate data (HDFS+HBase) from an existing cluster to a new one</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215470#M177380</link>
      <description>&lt;P&gt;Hi, Lester.&lt;/P&gt;&lt;P&gt;Thanks for your response. I didn't know about &lt;STRONG&gt;HBase's snapshot&lt;/STRONG&gt; feature. I'll dig into it.&lt;BR /&gt;Regarding &lt;STRONG&gt;distcp&lt;/STRONG&gt;, I was also thinking about using it. I'm not sure how long it will take to copy all the data, but I'll definitely look into it as well.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Best regards,&lt;EM&gt;&lt;BR /&gt;Juan&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Feb 2018 23:01:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215470#M177380</guid>
      <dc:creator>jcastrilli1</dc:creator>
      <dc:date>2018-02-14T23:01:03Z</dc:date>
    </item>
    <item>
      <title>Re: How to properly migrate data (HDFS+HBase) from an existing cluster to a new one</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215471#M177381</link>
      <description>&lt;P&gt;Hey &lt;A rel="user" href="https://community.cloudera.com/users/62367/jcastrilli.html" nodeid="62367"&gt;@Juan Castrilli&lt;/A&gt;, let the community know how you proceeded &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Mar 2018 12:51:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-properly-migrate-data-HDFS-HBase-from-an-existing/m-p/215471#M177381</guid>
      <dc:creator>sanketplus</dc:creator>
      <dc:date>2018-03-12T12:51:46Z</dc:date>
    </item>
  </channel>
</rss>

