<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: DR with Falcon: handling changing data; distcp validation; snapshotting in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131148#M27197</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/397/ppruski.html" nodeid="397"&gt;@Piotr Pruski&lt;/A&gt; &lt;/P&gt;&lt;P&gt;I think we have answers here for your 1 or 2 questions. &lt;A href="https://community.hortonworks.com/questions/29645/hdfs-replication-for-dr.html#comment-29776"&gt;link&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 04 May 2016 22:42:05 GMT</pubDate>
    <dc:creator>divakarreddy_a</dc:creator>
    <dc:date>2016-05-04T22:42:05Z</dc:date>
    <item>
      <title>DR with Falcon: handling changing data; distcp validation; snapshotting</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131147#M27196</link>
      <description>&lt;DIV&gt;Looking for best practices around the DR replication options with Falcon (or Oozie + distcp).&lt;/DIV&gt;&lt;OL&gt;&lt;LI&gt;Using either feed-based replication or the mirror recipe in Falcon (both of which leverage distcp, to my understanding), how does it handle the situation where clients are still writing, moving, or deleting in the source cluster?&lt;P&gt;The distcp documentation states that if another client is still writing to a source file, the copy will likely fail.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;Does Falcon provide any data validation mechanism to confirm that the transfer with distcp was successful?&lt;/LI&gt;&lt;LI&gt;What additional benefit would snapshotting have here? (And does Falcon do this?)&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Wed, 04 May 2016 22:35:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131147#M27196</guid>
      <dc:creator>ppruski</dc:creator>
      <dc:date>2016-05-04T22:35:03Z</dc:date>
    </item>
    <item>
      <title>Re: DR with Falcon: handling changing data; distcp validation; snapshotting</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131148#M27197</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/397/ppruski.html" nodeid="397"&gt;@Piotr Pruski&lt;/A&gt; &lt;/P&gt;&lt;P&gt;I think we have answers here for your 1 or 2 questions. &lt;A href="https://community.hortonworks.com/questions/29645/hdfs-replication-for-dr.html#comment-29776"&gt;link&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 04 May 2016 22:42:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131148#M27197</guid>
      <dc:creator>divakarreddy_a</dc:creator>
      <dc:date>2016-05-04T22:42:05Z</dc:date>
    </item>
    <item>
      <title>Re: DR with Falcon: handling changing data; distcp validation; snapshotting</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131149#M27198</link>
      <description>&lt;P&gt;@piotr pruski&lt;/P&gt;&lt;P&gt;Nice question. Would help a lot of people in the community.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Mon, 09 May 2016 12:39:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131149#M27198</guid>
      <dc:creator>rbiswas1</dc:creator>
      <dc:date>2016-05-09T12:39:21Z</dc:date>
    </item>
    <item>
      <title>Re: DR with Falcon: handling changing data; distcp validation; snapshotting</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131150#M27199</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/397/ppruski.html" nodeid="397"&gt;@Piotr Pruski&lt;/A&gt;:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;As you mentioned, Falcon piggybacks on DistCp under the hood to achieve replication. If another client is still writing to a source file, the copy will likely fail.&lt;/LI&gt;&lt;LI&gt;If the DistCp job fails, the Falcon replication job fails too, and the status API/command can be used to get the final status of the replication job. The same applies on success. Also, &lt;A href="https://issues.apache.org/jira/browse/FALCON-1313"&gt;FALCON-1313&lt;/A&gt; added support for email-based notification of job status for feeds and mirror recipes.&lt;/LI&gt;&lt;LI&gt;Replication using snapshots is not yet supported in Falcon; the feature is being added with &lt;A href="https://issues.apache.org/jira/browse/FALCON-1861"&gt;FALCON-1861&lt;/A&gt;. The additional benefit is performance. It leverages HDFS snapshots, which are very cheap to create (the cost is O(1), excluding inode lookup time).
Once a snapshot exists, it is very efficient to find the modifications relative to it and copy over only these
modifications for disaster recovery (DR). This makes snapshot-based replication cost-effective.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 May 2016 01:42:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131150#M27199</guid>
      <dc:creator>sramesh</dc:creator>
      <dc:date>2016-05-10T01:42:20Z</dc:date>
    </item>
    <item>
      <title>Re: DR with Falcon: handling changing data; distcp validation; snapshotting</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131151#M27200</link>
      <description>&lt;P&gt;To add to Sowmya's response:&lt;/P&gt;&lt;P&gt;If a Falcon Mirror process fails, Falcon will attempt the copy again, so a file that was only momentarily open will be captured on the retry.&lt;/P&gt;</description>
      <pubDate>Thu, 12 May 2016 01:17:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131151#M27200</guid>
      <dc:creator>cnormile</dc:creator>
      <dc:date>2016-05-12T01:17:34Z</dc:date>
    </item>
    <item>
      <title>Re: DR with Falcon: handling changing data; distcp validation; snapshotting</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131152#M27201</link>
      <description>&lt;P&gt;What about support for the -overwrite and -update flags in HDFS Mirror (Falcon)?&lt;/P&gt;</description>
      <pubDate>Sat, 19 Nov 2016 02:53:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DR-with-Falcon-handling-changing-data-distcp-validation/m-p/131152#M27201</guid>
      <dc:creator>ambud_sharma1</dc:creator>
      <dc:date>2016-11-19T02:53:37Z</dc:date>
    </item>
  </channel>
</rss>

