Created 07-19-2016 11:16 AM
Hello,
In our current relational environment of Teradata, Oracle, SQL-Server, we often use the online SQL-based data dictionary facilities (DBC, ALL_TABLES, ALL_TAB_COLUMNS, INFORMATION_SCHEMA, etc) for use in the automation of operational tasks and to search columns for "data of interest" for study from other data marts. How can this be done in Hive? I'm aware of "show tables" and "describe", but need much more power. I see that the metastore, org.apache.hadoop.hive.metastore.api.Table has a rich set of functionality, but I have not found examples for what I want to do. I would like to write some Java that would walk all databases of interest, and generate an output of ALL tables in the databases and their columns, types, etc. Or does HCatalog or some other tool offer this type of functionality as an easier/quicker alternative? If someone could point me to some similar examples or intro material, etc, that would be appreciated very much.
Thanks!
Created 07-20-2016 03:16 AM
@Greg Polanchyck some of the functionality you are looking for is available in WebHCat api. Take a look here.
List the columns in an HCatalog table.
http://
www.myserver.com/templeton/v1/ddl/database/
:db/table/
:table/column
Name |
Description |
Required? |
Default |
---|---|---|---|
:db | The database name | Required | None |
:table | The table name | Required | None |
The standard parameters are also supported.
Name |
Description |
---|---|
columns | A list of column names and types |
database | The database name |
table | The table name |
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table/column?user.name=ctdean'
{ "columns": [ { "name": "id", "type": "bigint" }, { "name": "user", "comment": "The user name", "type": "string" }, { "name": "my_p", "type": "string" }, { "name": "my_q", "type": "string" } ], "database": "default", "table": "my_table" }
Created 07-20-2016 03:16 AM
@Greg Polanchyck some of the functionality you are looking for is available in WebHCat api. Take a look here.
List the columns in an HCatalog table.
http://
www.myserver.com/templeton/v1/ddl/database/
:db/table/
:table/column
Name |
Description |
Required? |
Default |
---|---|---|---|
:db | The database name | Required | None |
:table | The table name | Required | None |
The standard parameters are also supported.
Name |
Description |
---|---|
columns | A list of column names and types |
database | The database name |
table | The table name |
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table/column?user.name=ctdean'
{ "columns": [ { "name": "id", "type": "bigint" }, { "name": "user", "comment": "The user name", "type": "string" }, { "name": "my_p", "type": "string" }, { "name": "my_q", "type": "string" } ], "database": "default", "table": "my_table" }
Created 04-02-2020 04:46 AM
How to get entire table list in hive(cluster). Not for particular table and database.