Reply
rio
Explorer
Posts: 48
Registered: ‎04-18-2014
Accepted Solution

Accessing Hive Metadata

It is always handy to search tables and column in metadata. How do I search for tables in Hive based on th column name?
 
I know we can do via metadata. I logged into MySQL metadata but did not find a way to search.
 
I saw a way to search a column name but could not find a way to relate to table name.
select COLUMN_NAME from COLUMNS_V2 where column_name regexp 'repo' ;
I also searched on TBLS table, but there is no column name.
 
Any help?

Cloudera Employee
Posts: 16
Registered: ‎01-27-2015

Re: Accessing Hive Metadata

Here's the query you can use on the metastore:

 

select TBL_NAME, COLUMN_NAME, TYPE_NAME from TBLS left join COLUMNS_V2 on CD_ID = TBL_ID where COLUMN_NAME like 'column';

 

where 'column' is the column name you're looking for. 

rio
Explorer
Posts: 48
Registered: ‎04-18-2014

Re: Accessing Hive Metadata

Seems like not all columns are indexed, how do I index the columns?

Cloudera Employee
Posts: 16
Registered: ‎01-27-2015

Re: Accessing Hive Metadata

You can use the following statment:

 

ALTER TABLE TABLE_NAME ADD INDEX (COLUMN_NAME);

 

Is the query slow on your system?

Could you paste the the output of the 'EXPLAIN SELECT ...' of your query? 

rio
Explorer
Posts: 48
Registered: ‎04-18-2014

Re: Accessing Hive Metadata

Query comes out fast. Not all hive columns / tables are displayed. I know there is a table for a column idMfg but when I search it does not display.

 

Will creating index help here?

Highlighted
Cloudera Employee
Posts: 16
Registered: ‎01-27-2015

Re: Accessing Hive Metadata

Indexing won't work to connect two tables. This is used for speed performance when searching data on tables.

 

I was taking a look at the tables on the metastore, and there are tables like SKEWED_COL_NAMES, PART_COL_PRIVS, etc. Those contain the column name as well. 

 

How is the table you're looking for configured? Is it partitioned? Is it skewed? 

rio
Explorer
Posts: 48
Registered: ‎04-18-2014

Re: Accessing Hive Metadata

This query is giving me output. what is difference between COLUMNS_OLD and COLUMNS_V2?

 

select TBL_NAME, COLUMN_NAME, TYPE_NAME from TBLS t left join COLUMNS_OLD c on c.SD_ID = t.SD_ID where COLUMN_NAME regexp 'idorder';

Cloudera Employee
Posts: 16
Registered: ‎01-27-2015

Re: Accessing Hive Metadata

COLUMNS_OLD is a deprecated table where columns used to be stored. Hive might have some information there for some reason. You can use both COLUMNS_OLD or COLUMNS_V2 when searching for your column.

Announcements