<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283073#M210396</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/59349"&gt;@mike_bronson7&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What I understand from your scenario: in the second from-scratch installation your master nodes ( i.e. the active/standby NameNodes ) are freshly installed, and you are only adding DataNodes that carry pre-existing data from the old cluster, right?&lt;/P&gt;&lt;P&gt;In that case it is not possible to bring the cluster up with the data on the restored disks,&lt;/P&gt;&lt;P&gt;because the fresh NameNode has no metadata about the blocks sitting in block storage on the DataNode disks.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you have a support contract with Cloudera, you can approach them for a DR scenario in which they help you add the existing data on the DataNodes back into the cluster ( it is not certain that 100% of it can be recovered/added back ).&lt;/P&gt;&lt;P&gt;The same applies to Kafka.&lt;/P&gt;</description>
    <pubDate>Fri, 15 Nov 2019 10:06:59 GMT</pubDate>
    <dc:creator>sagarshimpi</dc:creator>
    <dc:date>2019-11-15T10:06:59Z</dc:date>
    <item>
      <title>how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283068#M210391</link>
      <description>&lt;P&gt;hi all&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I want to ask an important question.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let's say we have the following:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;an HDP cluster with:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3 master machines ( active/standby NameNode, active/standby ResourceManager )&lt;/P&gt;
&lt;P&gt;3 DataNode machines&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;- each DataNode machine has 4 disks for HDFS ( not including the OS disk )&lt;/P&gt;
&lt;P&gt;3 Kafka machines&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;- each Kafka machine has one 10 TB disk ( not including the OS disk )&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now we want to reinstall the whole cluster from scratch, including HDP and Ambari,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;but keep the data on the DataNode machines and the Kafka topic data, as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;we unmount the disks on the DataNode machines and the Kafka machines&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;example&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;on a DataNode machine ( note - /etc/fstab is already configured )&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;umount /grid/data1&lt;/P&gt;
&lt;P&gt;umount /grid/data2&lt;/P&gt;
&lt;P&gt;.&lt;/P&gt;
&lt;P&gt;.&lt;/P&gt;
&lt;P&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
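&lt;P&gt;The unmount step can be scripted; a minimal sketch, assuming the four data mounts are /grid/data1 .. /grid/data4 as in the example above ( shown as a dry run that only prints the commands ):&lt;/P&gt;

```shell
# Print the umount command for each HDFS data disk (dry run).
# Remove the 'echo' to actually unmount; run as root.
unmount_data_disks() {
  for i in 1 2 3 4; do
    echo "umount /grid/data$i"
  done
}
unmount_data_disks
```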
&lt;P&gt;In the second, from-scratch installation we install the whole cluster ( via blueprint ), but without the DataNode HDFS disks and the Kafka topic disks ( a scratch installation means a fresh new Linux OS ).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;After the installation we mount all the DataNode disks and the Kafka disks ( where all the topics are stored ).&lt;/P&gt;
&lt;P&gt;example&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;on a DataNode machine ( note - /etc/fstab is already configured )&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;mount /grid/data1&lt;/P&gt;
&lt;P&gt;mount /grid/data2&lt;/P&gt;
&lt;P&gt;.&lt;/P&gt;
&lt;P&gt;.&lt;/P&gt;
&lt;P&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
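&lt;P&gt;Since /etc/fstab already lists the data disks, the remount can be scripted the same way; a sketch mirroring the example above ( dry run ):&lt;/P&gt;

```shell
# Print the mount command for each HDFS data disk (dry run).
# Remove the 'echo' to actually mount. Because the entries are already
# in /etc/fstab, 'mount -a' would also remount everything in one go.
mount_data_disks() {
  for i in 1 2 3 4; do
    echo "mount /grid/data$i"
  done
}
mount_data_disks
```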
&lt;P&gt;To complete the picture, we need to restart HDFS, YARN and Kafka.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
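&lt;P&gt;One way to script that restart is through Ambari's REST API; this is only a sketch, where the Ambari host, port 8080, cluster name "mycluster" and admin credentials are all assumptions ( dry run that prints the calls ):&lt;/P&gt;

```shell
# Print one curl call per service asking Ambari to start it (dry run).
# PUT-ting ServiceInfo state=STARTED (re)starts a stopped service; for a
# full restart you would first PUT state=INSTALLED (stop), then STARTED.
print_restart_calls() {
  AMBARI="http://ambari-host:8080/api/v1/clusters/mycluster"
  for SVC in HDFS YARN KAFKA; do
    echo "curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{\"Body\":{\"ServiceInfo\":{\"state\":\"STARTED\"}}}' $AMBARI/services/$SVC"
  done
}
print_restart_calls
```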
&lt;P&gt;So, could this scenario work?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 18 Nov 2019 06:47:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283068#M210391</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2019-11-18T06:47:01Z</dc:date>
    </item>
    <item>
      <title>Re: how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283073#M210396</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/59349"&gt;@mike_bronson7&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What I understand from your scenario: in the second from-scratch installation your master nodes ( i.e. the active/standby NameNodes ) are freshly installed, and you are only adding DataNodes that carry pre-existing data from the old cluster, right?&lt;/P&gt;&lt;P&gt;In that case it is not possible to bring the cluster up with the data on the restored disks,&lt;/P&gt;&lt;P&gt;because the fresh NameNode has no metadata about the blocks sitting in block storage on the DataNode disks.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you have a support contract with Cloudera, you can approach them for a DR scenario in which they help you add the existing data on the DataNodes back into the cluster ( it is not certain that 100% of it can be recovered/added back ).&lt;/P&gt;&lt;P&gt;The same applies to Kafka.&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 10:06:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283073#M210396</guid>
      <dc:creator>sagarshimpi</dc:creator>
      <dc:date>2019-11-15T10:06:59Z</dc:date>
    </item>
    <item>
      <title>Re: how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283077#M210400</link>
      <description>&lt;P&gt;OK.&lt;/P&gt;&lt;P&gt;Can you summarize all the options for recovering the NameNode ( including out-of-the-box options )?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 10:40:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283077#M210400</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2019-11-15T10:40:17Z</dc:date>
    </item>
    <item>
      <title>Re: how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283080#M210403</link>
      <description>&lt;P&gt;1. If you can back up the metadata from the original cluster ( where the DataNodes existed at first ) and copy that metadata to the new cluster, that is the best option.&lt;/P&gt;&lt;P&gt;2. If you cannot go with point 1, you can probably try the&amp;nbsp;&lt;SPAN&gt;"hadoop namenode -recover" option.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The links below might be useful:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;A href="https://blog.cloudera.com/understanding-hdfs-recovery-processes-part-1/" target="_blank" rel="noopener"&gt;https://blog.cloudera.com/understanding-hdfs-recovery-processes-part-1/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://clouderatemp.wpengine.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/" target="_blank" rel="noopener"&gt;https://clouderatemp.wpengine.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
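The two options above can be sketched as commands; the destination host below is hypothetical, and everything is printed rather than executed ( dry run ):

```shell
# Option 1: copy the NameNode metadata directory from the old active
# NameNode to the new one (destination host below is an assumption).
# Option 2: edit-log recovery on the new NameNode, run with the
# NameNode process stopped. Remove the 'echo's to execute for real.
print_recovery_steps() {
  SRC="/hadoop/hdfs/namenode/current"
  DEST="new-namenode.example.com:/hadoop/hdfs/namenode/"
  echo "scp -r $SRC $DEST"
  echo "hdfs namenode -recover"
}
print_recovery_steps
```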
      <pubDate>Fri, 15 Nov 2019 11:13:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283080#M210403</guid>
      <dc:creator>sagarshimpi</dc:creator>
      <dc:date>2019-11-15T11:13:00Z</dc:date>
    </item>
    <item>
      <title>Re: how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283086#M210409</link>
      <description>&lt;P&gt;About option one:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I guess you don't mean backing up the metadata by copying it with scp or rsync.&lt;/P&gt;&lt;P&gt;Maybe you mean there is a backup tool for this, like Barman for PostgreSQL?&lt;/P&gt;&lt;P&gt;Do you know of a tool for this option?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;On each NameNode we have the following folders:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;/hadoop/hdfs/namenode/current ( where the fsimage exists )&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;/hadoop/hdfs/journal/hdfsha/current/&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Do you mean to back up only these folders, say, every day?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 12:46:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283086#M210409</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2019-11-15T12:46:40Z</dc:date>
    </item>
    <item>
      <title>Re: how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283088#M210411</link>
      <description>&lt;UL&gt;&lt;LI&gt;By backup I mean copying only the NameNode current directory.&lt;/LI&gt;&lt;LI&gt;First turn safe mode on, then run saveNamespace. Once both commands have executed, take a backup of the NameNode current directory from the active node.&lt;/LI&gt;&lt;LI&gt;You can copy it to the destination/new cluster with any command ( like scp ) or tool; scp will be the simplest option.&lt;/LI&gt;&lt;/UL&gt;</description>
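The steps listed above as a command sequence; the backup path /backup/nn-current.tar.gz is hypothetical, and the commands are printed rather than executed ( run as the hdfs user on the active NameNode ):

```shell
# Dry run of the metadata backup steps: enter safe mode (block writes),
# flush the edit log into a fresh fsimage, archive the current dir,
# then leave safe mode. Remove the 'echo's to execute for real.
print_backup_steps() {
  echo "hdfs dfsadmin -safemode enter"
  echo "hdfs dfsadmin -saveNamespace"
  echo "tar czf /backup/nn-current.tar.gz /hadoop/hdfs/namenode/current"
  echo "hdfs dfsadmin -safemode leave"
}
print_backup_steps
```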
      <pubDate>Fri, 15 Nov 2019 12:53:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283088#M210411</guid>
      <dc:creator>sagarshimpi</dc:creator>
      <dc:date>2019-11-15T12:53:34Z</dc:date>
    </item>
    <item>
      <title>Re: how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283092#M210420</link>
      <description>&lt;P&gt;Since we have both of these current folders:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;/hadoop/hdfs/namenode/current ( where the fsimage exists )&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;/hadoop/hdfs/journal/hdfsha/current/&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;do you mean to back up both of them?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Second:&amp;nbsp;&lt;SPAN&gt;how old can the backup be,&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;for example one week or more?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 13:26:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283092#M210420</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2019-11-15T13:26:36Z</dc:date>
    </item>
    <item>
      <title>Re: how to recover HDP cluster by installing HDP from scratch and save data-node disks and kafka disks data</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283102#M210428</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/59349"&gt;@mike_bronson7&lt;/a&gt;&amp;nbsp;you just need to back up&amp;nbsp;&lt;SPAN&gt;/hadoop/hdfs/namenode/current from the active NameNode.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Also, if you take the backup one week before the activity and your first cluster keeps serving client requests, you will lose the data written after the backup.&lt;/P&gt;&lt;P&gt;So the best approach is to run saveNamespace and take the backup at the time of the activity, with clients frozen so they are not accessing the cluster.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2019 14:10:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-recover-HDP-cluster-by-installing-HDP-from-scratch/m-p/283102#M210428</guid>
      <dc:creator>sagarshimpi</dc:creator>
      <dc:date>2019-11-15T14:10:30Z</dc:date>
    </item>
  </channel>
</rss>

