<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Knowing size of Hive tables in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Knowing-size-of-Hive-tables/m-p/301405#M220644</link>
    <description>&lt;P&gt;Hello all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to see the size of each Hive table across multiple databases. There are around 3000 tables, so checking them one by one is impractical. How can I do it in one go?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Manu&lt;/P&gt;</description>
    <pubDate>Thu, 13 Aug 2020 05:58:06 GMT</pubDate>
    <dc:creator>ManuN</dc:creator>
    <dc:date>2020-08-13T05:58:06Z</dc:date>
    <item>
      <title>Knowing size of Hive tables</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Knowing-size-of-Hive-tables/m-p/301405#M220644</link>
      <description>&lt;P&gt;Hello all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to see the size of each Hive table across multiple databases. There are around 3000 tables, so checking them one by one is impractical. How can I do it in one go?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Manu&lt;/P&gt;</description>
      <pubDate>Thu, 13 Aug 2020 05:58:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Knowing-size-of-Hive-tables/m-p/301405#M220644</guid>
      <dc:creator>ManuN</dc:creator>
      <dc:date>2020-08-13T05:58:06Z</dc:date>
    </item>
    <item>
      <title>Re: Knowing size of Hive tables</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Knowing-size-of-Hive-tables/m-p/301433#M220667</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80473"&gt;@ManuN&lt;/a&gt;&amp;nbsp;Whichever way you go about this task, you will have to run a command against each table to get its size. &amp;nbsp;With a large number of tables, this should be done by a script, program, or process.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The common method is to query the table properties in Hive:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;-- gives all properties
show tblproperties yourTableName
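
-- Assumption (not from the original post): the size properties come from table
-- statistics, so they can be missing or stale. Refreshing the statistics first
-- may help; partitioned tables may need a partition spec on some Hive versions.
analyze table yourTableName compute statistics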

-- show just the raw data size
show tblproperties yourTableName("rawDataSize")&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Or, for the most accurate figure, check the size of the table's location in HDFS:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;hdfs dfs -du -s -h /path/to/table&lt;/SPAN&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There are also ways to get this data directly from the Hive Metastore, assuming the table is an internal (managed) Hive table.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In the past I have done this with a basic bash/shell script. &amp;nbsp;I have also done something similar in NiFi, and I prefer that approach because it requires no coding.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further comments on this topic, please post them here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Steven&amp;nbsp;@ DFHZ&lt;/P&gt;</description>
      <pubDate>Thu, 13 Aug 2020 12:11:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Knowing-size-of-Hive-tables/m-p/301433#M220667</guid>
      <dc:creator>stevenmatison</dc:creator>
      <dc:date>2020-08-13T12:11:04Z</dc:date>
    </item>
  </channel>
</rss>

