Knowing size of Hive tables

Hello all,

 

I want to see the size of each Hive table across multiple databases. There are around 3000 tables, so checking them one by one is impractical. How can I do it in one go?

 

Regards,

Manu

1 ACCEPTED SOLUTION

@ManuN Any way you go about this task, you will have to run commands against the tables to get their sizes. With a large number of tables, this should be done with a script, program, or process.

 

A common method is to query the table properties in Hive:

 

-- gives all properties
show tblproperties yourTableName;

-- show just the raw data size
show tblproperties yourTableName("rawDataSize");
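For several thousand tables, the property lookup above can be wrapped in a loop. The following is only a minimal sketch, assuming the `hive` CLI is on the PATH and that database and table names contain no whitespace. `HIVE_CMD`, `run_hive`, and `hive_table_sizes` are hypothetical names introduced for illustration, and the property is only populated if Hive has recorded statistics for the table.

```shell
#!/usr/bin/env bash
# Sketch: print "<database>.<table><TAB><property value>" for every table.
# HIVE_CMD is a hypothetical override so a different client can be swapped in.
HIVE_CMD="${HIVE_CMD:-hive}"

# Run one HiveQL string in silent mode (-S) and print its result rows.
run_hive() {
  $HIVE_CMD -S -e "$1"
}

# Usage: hive_table_sizes [property]   (defaults to rawDataSize)
hive_table_sizes() {
  local prop="${1:-rawDataSize}"
  local db tbl size
  for db in $(run_hive 'show databases;'); do
    for tbl in $(run_hive "use ${db}; show tables;"); do
      # If the property is missing, Hive prints a message instead of a number;
      # whatever comes back on the last line is recorded as-is.
      size=$(run_hive "use ${db}; show tblproperties ${tbl}(\"${prop}\");" | tail -n 1)
      printf '%s.%s\t%s\n' "$db" "$tbl" "$size"
    done
  done
}

# Run only when asked to, e.g.:  ./hive_sizes.sh run > sizes.tsv
if [ "${1:-}" = "run" ]; then
  hive_table_sizes "${2:-rawDataSize}"
fi
```

This starts a new Hive session for every table, so expect it to be slow across 3000 tables; batching the statements into fewer sessions would be a natural next step.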

 

Or, for the most accurate figure, look at the table's location in HDFS:

 

hdfs dfs -du -s -h /path/to/table
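To cover many tables at once, the same command can be pointed at a database directory instead of a single table. A rough sketch follows, assuming each table lives in its own subdirectory of the directory you pass in; `hdfs_child_sizes` and `largest_first` are hypothetical helper names, and no warehouse path is assumed.

```shell
#!/usr/bin/env bash
# Sketch: list every table directory under one or more database directories,
# largest first. Paths are supplied by the caller; none are assumed here.

# Print "<bytes><TAB><path>" for each immediate child of the given directories.
hdfs_child_sizes() {
  local dir
  for dir in "$@"; do
    # Without -s, du prints one line per child; without -h, sizes stay in bytes.
    # Field 1 is the size and the last field is the path, which holds whether
    # du prints two or three columns.
    hdfs dfs -du "$dir" | awk '{ printf "%s\t%s\n", $1, $NF }'
  done
}

# Sort numerically on the size column, largest first.
largest_first() {
  sort -k1,1nr
}

# Example: ./hdfs_sizes.sh /path/to/database_dir
if [ "$#" -gt 0 ]; then
  hdfs_child_sizes "$@" | largest_first
fi
```

Keeping the sizes in bytes (no `-h`) makes the output easy to sort and sum; paths containing spaces would break the `awk` step.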

 

There are also ways to get this data directly from the Hive Metastore, assuming the table is an internal (managed) Hive table.
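As one possible illustration of the Metastore route, the sketch below only prints a SQL query that you could pipe into the client for your Metastore's backing database. It assumes a MySQL-style backing database and the commonly deployed schema names `DBS`, `TBLS`, and `TABLE_PARAMS`; verify these against your own Metastore version before relying on it. `metastore_size_sql` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: emit a SQL query for the metastore's backing database.
# Table and column names are assumptions based on the commonly deployed
# metastore schema; check them against your installation.

# Usage: metastore_size_sql [property]   (defaults to rawDataSize)
metastore_size_sql() {
  local key="${1:-rawDataSize}"
  cat <<SQL
SELECT d.NAME AS db_name, t.TBL_NAME AS table_name, p.PARAM_VALUE AS param_value
FROM TBLS t
JOIN DBS d ON d.DB_ID = t.DB_ID
JOIN TABLE_PARAMS p ON p.TBL_ID = t.TBL_ID
WHERE p.PARAM_KEY = '${key}'
ORDER BY d.NAME, t.TBL_NAME;
SQL
}

# Print the query so it can be piped into a database client.
metastore_size_sql "$@"
```

This returns all tables in a single query, but only those that have the property recorded at table level; partitioned tables may keep their statistics per partition, so the result can be incomplete.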

 

 

In the past I have done this with a basic bash/shell script. I have also done something similar in NiFi, which I prefer because it requires no coding.

 

 

If this answer resolves your issue or allows you to move forward, please ACCEPT this solution and close this topic. If you have further comments on this topic, please reply here or feel free to send me a private message. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

 

Thanks,


Steven @ DFHZ
