Support Questions

Find answers, ask questions, and share your expertise

CDS table and CD_ID in Metastore DB

avatar
New Contributor

I'm trying to read tables and columns and columns' type from Metastore DB. I checked "COLUMN_V2" table and it doesn't have TBL_ID, but it has CD_ID ! Then, I released that there is a table named CDS having one column which is "CD_ID" and the values are as the same as TBL_ID in TBLS table. I want to know if below query using CD_ID is a reliable query or not.

Select tbls.TBL_NAME, cv2.COLUMN_NAME,cv2.TYPE_NAME from TBLS tbls
JOIN COLUMNS_V2 cv2 on tbls.TBL_ID = cv2.CD_ID

 

And I would be grateful if I can know more about below concerns:

1. What is the purpose of CDS table?

2. why TBL_ID was not enough and CD_ID was needed?

3. What does "CD" stands for?

4. Does CD_ID has the same value as TBL_ID all the time? or there are scenarios that they can be different?

 

1 ACCEPTED SOLUTION

avatar
Guru

@Mas_Jamie 

 

CDS table is for Column Descriptors. CD stands for column descriptor. It is used to decouple columns from tables/partitions to reduce storage needs. tables/partitions are almost 1-1 map to these descriptors… but a table may have more than one Column Descriptor in the case of schema evolution!

View solution in original post

3 REPLIES 3

avatar
Guru

@Mas_Jamie 

 

CDS table is for Column Descriptors. CD stands for column descriptor. It is used to decouple columns from tables/partitions to reduce storage needs. tables/partitions are almost 1-1 map to these descriptors… but a table may have more than one Column Descriptor in the case of schema evolution!

avatar
New Contributor

Thank you @asish , your explanation was really helpful.

 

Is there any books/webpage/article that I can read to know more about Metastore and Hive metadata?

avatar
Guru

@Mas_Jamie apologies. Unfortunately we dont have the document 😞