<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Distcp vs hdfs cp in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185060#M147167</link>
    <description>&lt;P&gt;All,&lt;/P&gt;&lt;P&gt;Just adding this for reference: if the source cluster is Kerberos-enabled while the target is not, the command to run is&lt;/P&gt;&lt;P&gt;hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true webhdfs://source-ip webhdfs://target-ip&lt;/P&gt;&lt;P&gt;The -D ipc.client.fallback-to-simple-auth-allowed=true option sets this property for this one job, overriding the value from the client configuration (the property normally lives in core-site.xml, not hive-site.xml).&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Rishit Shah&lt;/P&gt;</description>
    <pubDate>Wed, 10 Jan 2018 17:19:50 GMT</pubDate>
    <dc:creator>rishit606</dc:creator>
    <dc:date>2018-01-10T17:19:50Z</dc:date>
    <item>
      <title>Distcp vs hdfs cp</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185057#M147164</link>
      <description>&lt;P&gt;Hello All,&lt;/P&gt;&lt;P&gt;I have a requirement to copy files from one HDFS directory to another within the same cluster, using Oozie.&lt;/P&gt;&lt;P&gt;This can be done with either the Oozie distcp action or the Oozie shell action.&lt;/P&gt;&lt;P&gt;Which is the better way to copy files using Oozie?&lt;/P&gt;&lt;P&gt;I guess this is much the same as asking hdfs dfs -cp vs distcp?&lt;/P&gt;&lt;P&gt;Thanks and Best Regards,&lt;/P&gt;&lt;P&gt;Gagan&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 01:01:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185057#M147164</guid>
      <dc:creator>coolgags</dc:creator>
      <dc:date>2018-01-10T01:01:31Z</dc:date>
    </item>
    <item>
      <title>Re: Distcp vs hdfs cp</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185058#M147165</link>
      <description>&lt;P&gt; &lt;A rel="user" href="https://community.cloudera.com/users/44383/coolgags.html" nodeid="44383"&gt;@Gagandeep Singh Chawla&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;The right choice depends on your use case.&lt;/P&gt;&lt;P&gt;1) The main disadvantage of fs -cp is that all data has to pass through the machine you issue the command on, so the copy time grows with the amount of data. DistCp is distributed, as its name implies, so it has no bottleneck of this kind.&lt;/P&gt;&lt;P&gt;2) distcp runs a MapReduce job behind the scenes, whereas the cp command just invokes the FileSystem copy for every file.&lt;/P&gt;&lt;P&gt;3) If other jobs are already running, distcp may be slow to start or finish, depending on the memory and other resources those jobs consume. In that case cp may be the better choice.&lt;/P&gt;&lt;P&gt;4) Also, distcp works between two clusters.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Aditya&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 01:22:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185058#M147165</guid>
      <dc:creator>asirna</dc:creator>
      <dc:date>2018-01-10T01:22:09Z</dc:date>
    </item>
    <item>
      <title>Re: Distcp vs hdfs cp</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185059#M147166</link>
      <description>&lt;P&gt;This matches what I found in my own research, so I will go with distcp for my use case.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 16:35:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185059#M147166</guid>
      <dc:creator>coolgags</dc:creator>
      <dc:date>2018-01-10T16:35:07Z</dc:date>
    </item>
    <item>
      <title>Re: Distcp vs hdfs cp</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185060#M147167</link>
      <description>&lt;P&gt;All,&lt;/P&gt;&lt;P&gt;Just adding this for reference: if the source cluster is Kerberos-enabled while the target is not, the command to run is&lt;/P&gt;&lt;P&gt;hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true webhdfs://source-ip webhdfs://target-ip&lt;/P&gt;&lt;P&gt;The -D ipc.client.fallback-to-simple-auth-allowed=true option sets this property for this one job, overriding the value from the client configuration (the property normally lives in core-site.xml, not hive-site.xml).&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Rishit Shah&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jan 2018 17:19:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Distcp-vs-hdfs-cp/m-p/185060#M147167</guid>
      <dc:creator>rishit606</dc:creator>
      <dc:date>2018-01-10T17:19:50Z</dc:date>
    </item>
  </channel>
</rss>

