<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: does hdfs dfs -put verifies that the transfer went OK in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40110#M26056</link>
    <description>The HDFS client reads your input and sends it over the network as packets of data (64k-128k at a time), each accompanied by its checksums. The DataNodes (DNs) involved in the write verify these checksums continually as they receive the packets, before writing them to disk. This protects you from network corruption, so what is written to HDFS matches precisely what the client intended to send.</description>
    <pubDate>Sun, 24 Apr 2016 16:57:16 GMT</pubDate>
    <dc:creator>Harsh J</dc:creator>
    <dc:date>2016-04-24T16:57:16Z</dc:date>
    <item>
      <title>does hdfs dfs -put verifies that the transfer went OK?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40104#M26053</link>
      <description>&lt;P&gt;Hello everybody!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sometimes my computer crashes while data is being transferred to Hadoop via the command&lt;/P&gt;&lt;PRE&gt;hdfs dfs -put myfile myfolder&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;My question is: does HDFS verify that the transfer went OK&lt;/STRONG&gt;?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For instance, by automatically comparing the size of the data on the local drive to the amount of data received?&lt;/P&gt;&lt;P&gt;I am asking this because I am transferring very large files (200 GB) and the transfers take hours. It is not easy for me to check whether the files in HDFS are complete or only partial copies of my local files (due to an interrupted transfer).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Many thanks!&lt;/P&gt;</description>
      <pubDate>Sun, 24 Apr 2016 16:18:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40104#M26053</guid>
      <dc:creator>olaf</dc:creator>
      <dc:date>2016-04-24T16:18:09Z</dc:date>
    </item>
    <item>
      <title>Re: does hdfs dfs -put verifies that the transfer went OK</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40106#M26054</link>
      <description>The -put/-copyFromLocal commands follow a rename-upon-complete approach. While a file is uploading, it is named "filename._COPYING_"; once it is closed, it is renamed to "filename". Any file that still carries the ._COPYING_ suffix was therefore not copied completely.&lt;BR /&gt;&lt;BR /&gt;This behaviour is active by default; if it is undesirable, it can be switched off with the -d flag.&lt;BR /&gt;&lt;BR /&gt;X-Ref:&lt;BR /&gt;&lt;A href="https://github.com/cloudera/hadoop-common/blob/cdh5.7.0-release/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java#L380-L402" target="_blank"&gt;https://github.com/cloudera/hadoop-common/blob/cdh5.7.0-release/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java#L380-L402&lt;/A&gt;</description>
      <pubDate>Sun, 24 Apr 2016 16:44:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40106#M26054</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2016-04-24T16:44:16Z</dc:date>
    </item>
    <item>
      <title>Re: does hdfs dfs -put verifies that the transfer went OK</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40107#M26055</link>
      <description>Excellent, very helpful, thanks! Just a follow-up on the verification: it seems to me that, in addition to that, HDFS compares the checksums of the local and HDFS copies to assert that the upload is finished. Is that correct?</description>
      <pubDate>Sun, 24 Apr 2016 16:54:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40107#M26055</guid>
      <dc:creator>olaf</dc:creator>
      <dc:date>2016-04-24T16:54:27Z</dc:date>
    </item>
    <item>
      <title>Re: does hdfs dfs -put verifies that the transfer went OK</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40110#M26056</link>
      <description>The HDFS client reads your input and sends it over the network as packets of data (64k-128k at a time), each accompanied by its checksums. The DataNodes (DNs) involved in the write verify these checksums continually as they receive the packets, before writing them to disk. This protects you from network corruption, so what is written to HDFS matches precisely what the client intended to send.</description>
      <pubDate>Sun, 24 Apr 2016 16:57:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/does-hdfs-dfs-put-verifies-that-the-transfer-went-OK/m-p/40110#M26056</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2016-04-24T16:57:16Z</dc:date>
    </item>
  </channel>
</rss>

