Hive tables are split into files. How can we find the size of each file from the Hive shell (with a query)?
The individual file sizes are not stored in the metastore, so there is no way to query them directly.
From within the hive shell you can execute HDFS commands such as
dfs -ls /path/to/table;
to see the individual files and their sizes.
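If you capture that listing as text, the per-file sizes can be pulled out with a small script. The following is a minimal sketch, assuming the usual eight-column listing layout (permissions, replication, owner, group, size, date, time, path); the function name parse_ls_sizes is made up for illustration and is not part of Hive or Hadoop.

```python
def parse_ls_sizes(ls_output):
    """Map each file path in a `dfs -ls` listing to its size in bytes.

    Assumes the common 8-column layout:
    permissions, replication, owner, group, size, date, time, path.
    """
    sizes = {}
    for line in ls_output.splitlines():
        # Split into at most 8 fields so a path containing spaces stays whole.
        parts = line.split(None, 7)
        if len(parts) < 8:
            continue  # e.g. the "Found N items" header line
        if parts[0].startswith("d") or not parts[4].isdigit():
            continue  # skip directories (such as partition folders)
        sizes[parts[7]] = int(parts[4])
    return sizes


if __name__ == "__main__":
    listing = (
        "Found 2 items\n"
        "-rw-r--r--   3 hive hive    1048576 2020-01-01 12:00 /path/to/table/000000_0\n"
        "-rw-r--r--   3 hive hive       2048 2020-01-01 12:01 /path/to/table/000001_0\n"
    )
    for path, size in parse_ls_sizes(listing).items():
        print(size, path)
```

For a partitioned table the files sit in subdirectories, so the listing would need to be recursive (dfs -ls -R) before parsing.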
If you're interested in the total data size of the table, you can execute:
DESCRIBE FORMATTED table_name;
and look for the table parameter named totalSize.
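To read totalSize without scanning the output by eye, something like the sketch below may help. It assumes the hive command-line client is on the PATH and that the parameter appears on a line as the token totalSize followed by a number; the helper names are hypothetical. Note that totalSize comes from table statistics, so it can be missing or stale if statistics were never gathered.

```python
import subprocess


def extract_total_size(describe_output):
    """Return the totalSize value (bytes) from DESCRIBE FORMATTED output, or None."""
    for line in describe_output.splitlines():
        tokens = line.split()
        if "totalSize" in tokens:
            idx = tokens.index("totalSize")
            # The value is expected to be the token right after the name.
            if idx + 1 < len(tokens) and tokens[idx + 1].isdigit():
                return int(tokens[idx + 1])
    return None


def describe_formatted(table_name):
    """Run DESCRIBE FORMATTED through the hive CLI (assumed to be installed)."""
    result = subprocess.run(
        ["hive", "-e", f"DESCRIBE FORMATTED {table_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "table_name" is a placeholder; replace it with a real table.
    print(extract_total_size(describe_formatted("table_name")))
```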
I want to see the size of all the tables in Hive, which reside in multiple databases. There are around 3000 tables, so it is difficult to check them one by one. How can I do it in one go?
As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question.
Access the path where these files are stored to find the size of the split files. You can run DESCRIBE EXTENDED or DESCRIBE FORMATTED on the table to find the exact path of the files.
Get the HDFS path where the Hive table's files are stored, then use hdfs dfs -du -s -h /hdfs_path to get the total size in a human-readable format.
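For many tables at once, one option is to run hdfs dfs -du (without -s and -h) on a database directory, which prints one line per child directory, and then parse the raw byte counts. The sketch below assumes each line is either "size path" or "size disk_space_consumed path", depending on the Hadoop version; the function names and the sample paths are made up for illustration.

```python
import re

# One line of `hdfs dfs -du` output: size, optional second number, then the path.
_DU_LINE = re.compile(r"^\s*(\d+)\s+(?:\d+\s+)?(.+?)\s*$")


def parse_du(du_output):
    """Map each path in `hdfs dfs -du` output to its size in bytes."""
    sizes = {}
    for line in du_output.splitlines():
        match = _DU_LINE.match(line)
        if match:
            sizes[match.group(2)] = int(match.group(1))
    return sizes


def human_readable(num_bytes):
    """Format a byte count with binary units, e.g. 1536 -> '1.5 K'."""
    units = ["B", "K", "M", "G", "T", "P"]
    value = float(num_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


if __name__ == "__main__":
    # Hypothetical sample output; real paths depend on your warehouse location.
    sample = (
        "1048576  3145728  /warehouse/db1.db/t1\n"
        "2048  6144  /warehouse/db1.db/t2\n"
    )
    # Print tables from largest to smallest.
    for path, size in sorted(parse_du(sample).items(), key=lambda kv: -kv[1]):
        print(human_readable(size), path)
```

This only covers tables stored under the directory you list; external tables located elsewhere would need their own paths, which DESCRIBE FORMATTED reports.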