Support Questions

Find answers, ask questions, and share your expertise

Accessing Hive Metadata

avatar
Explorer

It is always handy to search tables and column in metadata. How do I search for tables in Hive based on th column name?
 
I know we can do via metadata. I logged into MySQL metadata but did not find a way to search.
 
I saw a way to search a column name but could not find a way to relate to table name.
select COLUMN_NAME from COLUMNS_V2 where column_name regexp 'repo' ;
I also searched on TBLS table, but there is no column name.
 
Any help?

1 ACCEPTED SOLUTION

avatar
Contributor

COLUMNS_OLD is a deprecated table where columns used to be stored. Hive might have some information there for some reason. You can use both COLUMNS_OLD or COLUMNS_V2 when searching for your column.

View solution in original post

7 REPLIES 7

avatar
Contributor

Here's the query you can use on the metastore:

 

select TBL_NAME, COLUMN_NAME, TYPE_NAME from TBLS left join COLUMNS_V2 on CD_ID = TBL_ID where COLUMN_NAME like 'column';

 

where 'column' is the column name you're looking for. 

avatar
Explorer

Seems like not all columns are indexed, how do I index the columns?

avatar
Contributor

You can use the following statment:

 

ALTER TABLE TABLE_NAME ADD INDEX (COLUMN_NAME);

 

Is the query slow on your system?

Could you paste the the output of the 'EXPLAIN SELECT ...' of your query? 

avatar
Explorer

Query comes out fast. Not all hive columns / tables are displayed. I know there is a table for a column idMfg but when I search it does not display.

 

Will creating index help here?

avatar
Contributor

Indexing won't work to connect two tables. This is used for speed performance when searching data on tables.

 

I was taking a look at the tables on the metastore, and there are tables like SKEWED_COL_NAMES, PART_COL_PRIVS, etc. Those contain the column name as well. 

 

How is the table you're looking for configured? Is it partitioned? Is it skewed? 

avatar
Explorer

This query is giving me output. what is difference between COLUMNS_OLD and COLUMNS_V2?

 

select TBL_NAME, COLUMN_NAME, TYPE_NAME from TBLS t left join COLUMNS_OLD c on c.SD_ID = t.SD_ID where COLUMN_NAME regexp 'idorder';

avatar
Contributor

COLUMNS_OLD is a deprecated table where columns used to be stored. Hive might have some information there for some reason. You can use both COLUMNS_OLD or COLUMNS_V2 when searching for your column.