<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: HDFS Snapshot - Size in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Snapshot-Size/m-p/138701#M35337</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/395/mkataria.html" nodeid="395"&gt;@mkataria&lt;/A&gt; &lt;/P&gt;&lt;P&gt;With HDFS snapshots, no data is copied up front when a new snapshot is created. A snapshot is simply a point-in-time reference to the directory's existing files and blocks, so when you first take one, your HDFS storage usage stays the same. It is only when you modify or delete the data afterwards that extra space is used, because the blocks the snapshot references must be retained. This follows the Copy on Write (COW) concept.&lt;/P&gt;&lt;P&gt;Please take a look at the JIRA below. It contains the discussion that led to the design and is quite informative.&lt;/P&gt;&lt;P&gt;&lt;A href="https://issues.apache.org/jira/browse/HDFS-2802"&gt;https://issues.apache.org/jira/browse/HDFS-2802&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 21 Jul 2016 03:50:41 GMT</pubDate>
    <dc:creator>egarelnabi</dc:creator>
    <dc:date>2016-07-21T03:50:41Z</dc:date>
    <item>
      <title>HDFS Snapshot - Size</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Snapshot-Size/m-p/138700#M35336</link>
      <description>&lt;P&gt;Hello experts,&lt;/P&gt;&lt;P&gt;I'm trying to understand the total size / block size used by an HDFS snapshot.&lt;/P&gt;&lt;P&gt;I have a directory, /user/x/data, and an hdfs ls tells me it holds 1.1 TB.&lt;/P&gt;&lt;P&gt;If I take a snapshot of /user/x/data, will the snapshot consume the same amount of space, and how much block space does it use?&lt;/P&gt;&lt;P&gt;The output of hdfs dfsadmin -report was 19.6 TB before the snapshot, and after taking the snapshot it was still the same.&lt;/P&gt;&lt;P&gt;If a snapshot takes the same space as the source, why doesn't the report change?&lt;/P&gt;&lt;P&gt;Thanks
Mayank&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:30:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Snapshot-Size/m-p/138700#M35336</guid>
      <dc:creator>mkataria</dc:creator>
      <dc:date>2022-09-16T10:30:51Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Snapshot - Size</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Snapshot-Size/m-p/138701#M35337</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/395/mkataria.html" nodeid="395"&gt;@mkataria&lt;/A&gt; &lt;/P&gt;&lt;P&gt;With HDFS snapshots, no data is copied up front when a new snapshot is created. A snapshot is simply a point-in-time reference to the directory's existing files and blocks, so when you first take one, your HDFS storage usage stays the same. It is only when you modify or delete the data afterwards that extra space is used, because the blocks the snapshot references must be retained. This follows the Copy on Write (COW) concept.&lt;/P&gt;&lt;P&gt;Please take a look at the JIRA below. It contains the discussion that led to the design and is quite informative.&lt;/P&gt;&lt;P&gt;&lt;A href="https://issues.apache.org/jira/browse/HDFS-2802"&gt;https://issues.apache.org/jira/browse/HDFS-2802&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jul 2016 03:50:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Snapshot-Size/m-p/138701#M35337</guid>
      <dc:creator>egarelnabi</dc:creator>
      <dc:date>2016-07-21T03:50:41Z</dc:date>
    </item>
  </channel>
</rss>

