<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Falcon HDFS mirror distcp options in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132017#M94701</link>
    <description>&lt;P&gt;Alternatively, what are the limitations of out-of-stack version support for Falcon? The snapshot-based replication in Falcon 0.10 provides the functionality I'm ultimately looking for, but I'm currently running on HDP 2.3 / 2.4.&lt;/P&gt;</description>
    <pubDate>Fri, 02 Sep 2016 06:16:45 GMT</pubDate>
    <dc:creator>kdunn926</dc:creator>
    <dc:date>2016-09-02T06:16:45Z</dc:date>
    <item>
      <title>Falcon HDFS mirror distcp options</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132014#M94698</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;
	I'm working with Falcon's built-in HDFS mirroring and would like to enable two DistCp options in the workflow XML: the -atomic and -strategy flags. Below is my Oozie workflow with these two options commented out, since passing them as arguments was unsuccessful. Is there a way to pass them in using a -D option, or would the FeedReplicator class need to be modified to support them?&lt;/P&gt;&lt;PRE&gt;&amp;lt;workflow-app xmlns='uri:oozie:workflow:0.3' name='falcon-dr-fs-workflow'&amp;gt;
    &amp;lt;start to='dr-replication'/&amp;gt;
    &amp;lt;!-- Replication action --&amp;gt;
    &amp;lt;action name="dr-replication"&amp;gt;
        &amp;lt;java&amp;gt;
            &amp;lt;job-tracker&amp;gt;${jobTracker}&amp;lt;/job-tracker&amp;gt;
            &amp;lt;name-node&amp;gt;${nameNode}&amp;lt;/name-node&amp;gt;
            &amp;lt;configuration&amp;gt;
                &amp;lt;property&amp;gt; &amp;lt;!-- hadoop 2 parameter --&amp;gt;
                    &amp;lt;name&amp;gt;oozie.launcher.mapreduce.job.user.classpath.first&amp;lt;/name&amp;gt;
                    &amp;lt;value&amp;gt;true&amp;lt;/value&amp;gt;
                &amp;lt;/property&amp;gt;
                &amp;lt;property&amp;gt;
                    &amp;lt;name&amp;gt;mapred.job.queue.name&amp;lt;/name&amp;gt;
                    &amp;lt;value&amp;gt;${queueName}&amp;lt;/value&amp;gt;
                &amp;lt;/property&amp;gt;
                &amp;lt;property&amp;gt;
                    &amp;lt;name&amp;gt;oozie.launcher.mapred.job.priority&amp;lt;/name&amp;gt;
                    &amp;lt;value&amp;gt;${jobPriority}&amp;lt;/value&amp;gt;
                &amp;lt;/property&amp;gt;
                &amp;lt;property&amp;gt;
                    &amp;lt;name&amp;gt;oozie.use.system.libpath&amp;lt;/name&amp;gt;
                    &amp;lt;value&amp;gt;true&amp;lt;/value&amp;gt;
                &amp;lt;/property&amp;gt;
                &amp;lt;property&amp;gt;
                    &amp;lt;name&amp;gt;oozie.action.sharelib.for.java&amp;lt;/name&amp;gt;
                    &amp;lt;value&amp;gt;distcp&amp;lt;/value&amp;gt;
                &amp;lt;/property&amp;gt;
                &amp;lt;property&amp;gt;
                    &amp;lt;name&amp;gt;oozie.launcher.oozie.libpath&amp;lt;/name&amp;gt;
                    &amp;lt;value&amp;gt;${wf:conf("falcon.libpath")}&amp;lt;/value&amp;gt;
                &amp;lt;/property&amp;gt;
                &amp;lt;property&amp;gt;
                    &amp;lt;name&amp;gt;oozie.launcher.mapreduce.job.hdfs-servers&amp;lt;/name&amp;gt;
                    &amp;lt;value&amp;gt;${drSourceClusterFS},${drTargetClusterFS}&amp;lt;/value&amp;gt;
                &amp;lt;/property&amp;gt;
            &amp;lt;/configuration&amp;gt;
            &amp;lt;main-class&amp;gt;org.apache.falcon.replication.FeedReplicator&amp;lt;/main-class&amp;gt;
            &amp;lt;arg&amp;gt;-Dmapred.job.queue.name=${queueName}&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-Dmapred.job.priority=${jobPriority}&amp;lt;/arg&amp;gt;

            &amp;lt;!-- &amp;lt;arg&amp;gt;-atomic&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-strategy&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;dynamic&amp;lt;/arg&amp;gt; --&amp;gt;

            &amp;lt;arg&amp;gt;-maxMaps&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;${distcpMaxMaps}&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-mapBandwidth&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;${distcpMapBandwidth}&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-sourcePaths&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;${drSourceDir}&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-targetPath&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;${drTargetClusterFS}${drTargetDir}&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-falconFeedStorageType&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;FILESYSTEM&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-availabilityFlag&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;${availabilityFlag == 'NA' ? "NA" : availabilityFlag}&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-counterLogDir&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;${logDir}/job-${nominalTime}/${srcClusterName == 'NA' ? '' : srcClusterName}&amp;lt;/arg&amp;gt;
        &amp;lt;/java&amp;gt;
        &amp;lt;ok to="end"/&amp;gt;
        &amp;lt;error to="fail"/&amp;gt;
    &amp;lt;/action&amp;gt;
    &amp;lt;kill name="fail"&amp;gt;
        &amp;lt;message&amp;gt;
            Workflow action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]
        &amp;lt;/message&amp;gt;
    &amp;lt;/kill&amp;gt;
    &amp;lt;end name="end"/&amp;gt;
&amp;lt;/workflow-app&amp;gt;
&lt;/PRE&gt;</description>
      <pubDate>Fri, 02 Sep 2016 01:19:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132014#M94698</guid>
      <dc:creator>kdunn926</dc:creator>
      <dc:date>2016-09-02T01:19:05Z</dc:date>
    </item>
    <item>
      <title>Re: Falcon HDFS mirror distcp options</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132015#M94699</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/12246/kdunn926.html" nodeid="12246"&gt;@Kyle Dunn&lt;/A&gt;: Falcon doesn't support those DistCp options, and yes, supporting them would require a code change.&lt;/P&gt;</description>
      <pubDate>Fri, 02 Sep 2016 01:55:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132015#M94699</guid>
      <dc:creator>sramesh</dc:creator>
      <dc:date>2016-09-02T01:55:50Z</dc:date>
    </item>
    <item>
      <title>Re: Falcon HDFS mirror distcp options</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132016#M94700</link>
      <description>&lt;P&gt;Could you point to something in the existing FeedReplicator code that this change would resemble?&lt;/P&gt;</description>
      <pubDate>Fri, 02 Sep 2016 01:55:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132016#M94700</guid>
      <dc:creator>kdunn926</dc:creator>
      <dc:date>2016-09-02T01:55:51Z</dc:date>
    </item>
    <item>
      <title>Re: Falcon HDFS mirror distcp options</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132017#M94701</link>
      <description>&lt;P&gt;Alternatively, what are the limitations of out-of-stack version support for Falcon? The snapshot-based replication in Falcon 0.10 provides the functionality I'm ultimately looking for, but I'm currently running on HDP 2.3 / 2.4.&lt;/P&gt;</description>
      <pubDate>Fri, 02 Sep 2016 06:16:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132017#M94701</guid>
      <dc:creator>kdunn926</dc:creator>
      <dc:date>2016-09-02T06:16:45Z</dc:date>
    </item>
    <item>
      <title>Re: Falcon HDFS mirror distcp options</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132018#M94702</link>
      <description>&lt;P&gt;What limitations are we talking about here? Sorry, I don't understand your question.&lt;/P&gt;&lt;P&gt;If you are asking about the DistCp options supported in HDFS mirroring, the following options are currently supported:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;maxMaps&lt;/LI&gt;&lt;LI&gt;mapBandwidth&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;The following additional options can be enabled with the workaround described below:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;overwrite&lt;/LI&gt;&lt;LI&gt;ignoreErrors&lt;/LI&gt;&lt;LI&gt;skipChecksum&lt;/LI&gt;&lt;LI&gt;removeDeletedFiles&lt;/LI&gt;&lt;LI&gt;preserveBlockSize&lt;/LI&gt;&lt;LI&gt;preserveReplicationNumber&lt;/LI&gt;&lt;LI&gt;preservePermission&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Modify the workflow file hdfs-replication-workflow.xml by adding the following content after the distcpMapBandwidth argument:&lt;/P&gt;&lt;PRE&gt;&amp;lt;arg&amp;gt;-overwrite&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;${overwrite}&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;-ignoreErrors&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;${ignoreErrors}&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;-skipChecksum&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;${skipChecksum}&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;-removeDeletedFiles&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;${removeDeletedFiles}&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;-preserveBlockSize&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;${preserveBlockSize}&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;-preserveReplicationNumber&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;${preserveReplicationNumber}&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;-preservePermission&amp;lt;/arg&amp;gt;
&amp;lt;arg&amp;gt;${preservePermission}&amp;lt;/arg&amp;gt;
&lt;/PRE&gt;&lt;P&gt;Set the following options in hdfs-replication.properties:&lt;/P&gt;&lt;PRE&gt;overwrite=false
ignoreErrors=false
skipChecksum=false
removeDeletedFiles=true
preserveBlockSize=true
preserveReplicationNumber=true
preservePermission=true
&lt;/PRE&gt;&lt;P&gt;These work out of the box because FeedReplicator already supports them, so no code change is required. Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 02 Sep 2016 12:17:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132018#M94702</guid>
      <dc:creator>sramesh</dc:creator>
      <dc:date>2016-09-02T12:17:33Z</dc:date>
    </item>
    <item>
      <title>Re: Falcon HDFS mirror distcp options</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132019#M94703</link>
      <description>&lt;PRE&gt;Hi!

After changing the preserveBlockSize &amp;amp; skipChecksum parameters on the target site, I do not see any change in the XML file for the task (after recreating the task):

[hdfs@target ~]$ hdfs dfs -ls /apps/falcon/extensions/hdfs-mirroring/retargets/runtime/hdfs-mirroring-workflow.xml
-rwxr-xr-x   2 hdfs users       4943 2017-11-13 22:39 /apps/falcon/extensions/hdfs-mirroring/retargets/runtime/hdfs-mirroring-workflow.xml        &amp;lt;&amp;lt;&amp;lt; I changed this file (on the target side)
[hdfs@target ~]$
[hdfs@target ~]$ grep -i preserveBlockSize hdfs-mirroring-workflow.xml
            &amp;lt;arg&amp;gt;-preserveBlockSize&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;${preserveBlockSize}&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-preserveBlockSize&amp;lt;/arg&amp;gt; &amp;lt;arg&amp;gt;true&amp;lt;/arg&amp;gt;
[hdfs@target ~]$
[hdfs@target ~]$
[hdfs@target ~]$ grep -i skipChecksum hdfs-mirroring-workflow.xml
            &amp;lt;arg&amp;gt;-skipChecksum&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;${skipChecksum}&amp;lt;/arg&amp;gt;
            &amp;lt;arg&amp;gt;-skipChecksum&amp;lt;/arg&amp;gt; &amp;lt;arg&amp;gt;true&amp;lt;/arg&amp;gt;
[hdfs@target ~]$

Please help me.

Where can I find the hdfs-replication.properties file?&lt;/PRE&gt;</description>
      <pubDate>Tue, 14 Nov 2017 15:57:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Falcon-HDFS-mirror-distcp-options/m-p/132019#M94703</guid>
      <dc:creator>Kyivstar</dc:creator>
      <dc:date>2017-11-14T15:57:42Z</dc:date>
    </item>
  </channel>
</rss>

