Hi, is it possible to find out the number of rows in a table from the hive metastore?
I don't want to run "select count(*) from <table>" from the hive prompt. Rather I am looking to find that count from the hive metadata.
Appreciate any insights.
Hi @n c
Hcatalog holds metadata of a table details like Schema, Index, Roles, Structure, Bucketing, Partitions keys, Columns, privileges, When the table was created, By whom it was created etc. But It doesn't contain any details about how many records are stored in each table.
As of the HDP 2.6.1 release here is a query that I use to find row counts on a specific partitioned table:
SELECT * FROM hive.PARTITION_PARAMS AS A, hive.PARTITIONS AS B WHERE A.PARAM_KEY='numRows' and A.PART_ID=B.PART_ID and A.PART_ID IN ( SELECT PART_ID FROM hive.PARTITIONS WHERE TBL_ID=(SELECT A.TBL_ID FROM hive.TBLS AS A, hive.DBS AS B WHERE A.DB_ID=B.DB_ID AND B.NAME='DATABASE_NAME' AND A.TBL_NAME='TABLE_NAME'))
This specific query gives me row counts for each partition of a specific table. You need to replace DATABASE_NAME and TABLE_NAME with the one you are interested in.
MySQL metastore but should work with others.