- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Query to find the count of columns for all tables in impala
- Labels:
-
Apache Hive
-
Apache Impala
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I am trying to fetch a count of total columns for a list of individual tables/views from Impala from the same schema.
however i wanted to scan through all the tables from that schema to capture the counts of columns of all the tables in a single query ?
i have already performed a similar exercise from Oracle Exa data ,however since i a new to Impala is there a way to capture all the tables in a single query ?
Exadata query for reference
# Oracle Exadata query i used
select owner, table_name as view_name, count(*) as counts
from dba_tab_cols /*DBA_TABLES_COLUMNS*/
where (owner, table_name) in
(
select owner, view_name
from dba_views /*DBA_VIEWS*/
where 1=1
and owner='DESIRED_SCHEMA_NAME'
)
group by owner ,table_name
order by counts desc;
impala /hive
DESCRIBE schemaName.tableName;
# Impala /Hive
how to find out ? if i need to run a single query check i was following the below
for multiple tables/view how to find out the total column counts ?
Created ‎11-21-2023 06:10 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
In Hive, metadata related to tables and columns is typically stored in the 'hive' database, specifically within the 'TBLS' and 'COLUMNS_V2' tables in the 'metastore' database. It is not recommended for users to query the metadata directly. Instead, users can leverage the 'sys' database tables.
Here is a modified query that utilizes the 'hive' database tables:
USE sys;
-- Get the count of columns for all tables
SELECT
t.tbl_name AS TABLE_NAME,
COUNT(c.column_name) AS COLUMN_COUNT
FROM
tbls t
JOIN
columns_v2 c
ON
t.tbl_id = c.cd_id
GROUP BY
t.tbl_name;
Explanation:
- The 'sys.tbls' table contains information about tables, while the 'sys.columns_v2' table contains information about columns.
- We join these tables on the 'TBL_ID' and 'CD_ID' columns to retrieve information about columns for each table.
- The 'COUNT(c.COLUMN_NAME)' expression calculates the count of columns for each table.
This query provides a list of tables along with the count of columns for each table, using the 'sys' database tables."
