Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Unable to check size of Hive Table

Highlighted

Unable to check size of Hive Table

New Contributor

I've created a table in Hive and loaded data in the same.

Now I need to check the table size of my table.

I'm using the below property to do the same:

show tblproperties yourTableName("rawDataSize")

However, I'm unable to know the table size.

Attaching the output for the same :

horton.png

4 REPLIES 4

Re: Unable to check size of Hive Table

Hi @Prathamesh H!
Instead of using show tblproperties, try to use:

hive> desc formatted salaries;
OK
# col_name            	data_type           	comment             
	 	 
gender              	string              	                    
age                 	int                 	                    
salary              	double              	                    
zip                 	int                 	                    
	 	 
# Detailed Table Information	 	 
Database:           	default             	 
Owner:              	root                	 
CreateTime:         	Wed May 16 15:23:57 UTC 2018	 
LastAccessTime:     	UNKNOWN             	 
Protect Mode:       	None                	 
Retention:          	0                   	 
Location:           	hdfs://Admin-TrainingNS/apps/hive/warehouse/salaries	 
Table Type:         	EXTERNAL_TABLE      	 
Table Parameters:	 	 
	COLUMN_STATS_ACCURATE	{\"BASIC_STATS\":\"true\"}
	EXTERNAL            	TRUE                
	numFiles            	1                   
	numRows             	50                  
	rawDataSize         	732                 
	totalSize           	781                 
	transient_lastDdlTime	1529819960          
	 	 
# Storage Information	 	 
SerDe Library:      	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	 
InputFormat:        	org.apache.hadoop.mapred.TextInputFormat	 
OutputFormat:       	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat	 
Compressed:         	No                  	 
Num Buckets:        	-1                  	 
Bucket Columns:     	[]                  	 
Sort Columns:       	[]                  	 
Storage Desc Params:	 	 
	field.delim         	,                   
	serialization.format	,                   
Time taken: 0.609 seconds, Fetched: 36 row(s)
hive> exit;
[hive@node3 ~]$ hdfs dfs -ls hdfs://Admin-TrainingNS/apps/hive/warehouse/salaries
Found 1 items
-rwxrwxrwx   3 root root        781 2018-05-16 04:53 hdfs://Admin-TrainingNS/apps/hive/warehouse/salaries/salaries.txt
PS: and it's always a good idea to run
msck repair table <table> --if your table is external
analyze table <table> compute statistics --specially if your table has a lot of inserts

More details on:
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables–ANALYZE

Hope this helps!

Re: Unable to check size of Hive Table

New Contributor

I tried doing as you mentioned. But, still I'm unable to see the raw data size. It's blank. Any other method ?

Re: Unable to check size of Hive Table

New Contributor

@Vinicius Higa Murakami My table is partitioned. I was able to find the size of each partitions by the steps mentioned above. However, i need to find the total size of partitions in one go. Is it feasible ?

Re: Unable to check size of Hive Table

Hi @Prathamesh H!
Sorry about my delay, so regarding your issue. Hmm for partitioned table, afaik, you'll have to summarize per partition, unfortunately :(
I just heard around, that it's possible to get the size on Hive 2.0. I'm not sure, i didn't test it.
One last thing, the command that i had sent to you, in this case (for partitioned table) would be:

analyze table <table> partition(col1,col2) compute statistics;
Hope this helps!