<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Discp in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Discp/m-p/286341#M212386</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/72637"&gt;@kiranpune&lt;/a&gt;&lt;/P&gt;&lt;P&gt;DistCp (distributed copy) is a tool for large inter- and intra-cluster copying. It uses MapReduce for its distribution, error handling and recovery, and reporting. It expands a list of files and directories into the input to map tasks, each of which copies a partition of the files specified in the source list. That is the basic description.&lt;/P&gt;&lt;P&gt;You can change this behaviour with command-line options when running DistCp; see the &lt;A href="https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html" target="_blank" rel="noopener"&gt;official DistCp documentation&lt;/A&gt;. Below are a few options for your different use cases.&lt;/P&gt;&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;OPTIONS&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;-append&lt;/STRONG&gt;: Incremental copy of a file with the same name but a different length&lt;BR /&gt;&lt;STRONG&gt;-update&lt;/STRONG&gt;: Overwrite if source and destination differ in size, block size, or checksum&lt;BR /&gt;&lt;STRONG&gt;-overwrite&lt;/STRONG&gt;: Overwrite destination&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;-delete&lt;/STRONG&gt;: Delete the files existing in the &lt;STRONG&gt;destination&lt;/STRONG&gt; but not in the &lt;STRONG&gt;source&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;As far as I know, DistCp does not keep a record of earlier runs. With -update it compares source and destination on each run and copies only the files that are missing or differ, so I think you can schedule or script a daily copy with -update.&lt;/P&gt;</description>
    <pubDate>Wed, 25 Dec 2019 17:44:52 GMT</pubDate>
    <dc:creator>Shelton</dc:creator>
    <dc:date>2019-12-25T17:44:52Z</dc:date>
    <item>
      <title>Discp</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Discp/m-p/286334#M212382</link>
      <description>&lt;P&gt;If you use the distcp command to transfer data from one cluster to another on a regular basis, and only the new data should be copied each day, how does distcp keep track of what has already been copied?&lt;/P&gt;</description>
      <pubDate>Thu, 26 Dec 2019 14:35:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Discp/m-p/286334#M212382</guid>
      <dc:creator>kiranpune</dc:creator>
      <dc:date>2019-12-26T14:35:25Z</dc:date>
    </item>
    <item>
      <title>Re: Discp</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Discp/m-p/286341#M212386</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/72637"&gt;@kiranpune&lt;/a&gt;&lt;/P&gt;&lt;P&gt;DistCp (distributed copy) is a tool for large inter- and intra-cluster copying. It uses MapReduce for its distribution, error handling and recovery, and reporting. It expands a list of files and directories into the input to map tasks, each of which copies a partition of the files specified in the source list. That is the basic description.&lt;/P&gt;&lt;P&gt;You can change this behaviour with command-line options when running DistCp; see the &lt;A href="https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html" target="_blank" rel="noopener"&gt;official DistCp documentation&lt;/A&gt;. Below are a few options for your different use cases.&lt;/P&gt;&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;OPTIONS&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;-append&lt;/STRONG&gt;: Incremental copy of a file with the same name but a different length&lt;BR /&gt;&lt;STRONG&gt;-update&lt;/STRONG&gt;: Overwrite if source and destination differ in size, block size, or checksum&lt;BR /&gt;&lt;STRONG&gt;-overwrite&lt;/STRONG&gt;: Overwrite destination&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;-delete&lt;/STRONG&gt;: Delete the files existing in the &lt;STRONG&gt;destination&lt;/STRONG&gt; but not in the &lt;STRONG&gt;source&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;As far as I know, DistCp does not keep a record of earlier runs. With -update it compares source and destination on each run and copies only the files that are missing or differ, so I think you can schedule or script a daily copy with -update.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Dec 2019 17:44:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Discp/m-p/286341#M212386</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2019-12-25T17:44:52Z</dc:date>
    </item>
  </channel>
</rss>

