Support Questions

Find answers, ask questions, and share your expertise

Find table's size in Hive metastore (MySQL)

avatar
Contributor

I have some tables in hive and I want to find the size of each table through the metastore (MySQL).

I am trying the below but it returns paratition_params, notification_log, sds etc. My tables are stored in TABLES.TBLS but when I am running the below query with from information_schema.TABLES.TBLS it returns this:

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.TBLS

ORDER BY
      (DATA_LENGTH + INDEX_LENGTH)
    DESC
    LIMIT 0, 200' at line 5

SELECT
  TABLE_NAME AS `Table`,
  ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS `Size (MB)`
FROM
  information_schema.TABLES
ORDER BY
  (DATA_LENGTH + INDEX_LENGTH)
DESC;
1 REPLY 1

avatar
Master Collaborator

@dr If it's a managed table, you could get its size from TABLE_PARAMS table:

e.g.

SELECT a.TBL_NAME AS `TABLE`, b.PARAM_VALUE AS `SIZE` from TABLE_PARAMS b INNER JOIN TBLS a where a.TBL_ID=b.TBL_ID and b.PARAM_KEY='totalSize';

 

You could change the you need it. But, if there are external tables, or the table stats are not generated regularly, then you might not get the correct data.

 

You could get the table size using HDFS file system commands as well:

 

hdfs dfs -du -s -h <path to the table location>

 

This will give you more accurate data.