
Managed & External table

Contributor

Hi, I am not able to understand the difference between managed and external tables.

I know the difference shows up when dropping the table. I don't understand what it means that both the data and the metadata are deleted for internal (managed) tables, while only the metadata is deleted for external tables. Can anyone please explain how to check this in the backend?

1 ACCEPTED SOLUTION

Super Guru
A Hive table consists of the following:

1. metadata (all table and column definitions, plus the HDFS location)
2. the actual data files stored in HDFS

If you delete a managed table, both 1 and 2 will be deleted.

However, if you delete an external table, only 1 will be deleted. That is, the table reference is removed from Hive's backend (metastore) database: show tables will no longer return the table and you can't query it any more. The underlying HDFS files will remain untouched at their HDFS path.
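A minimal sketch of how you could see this for yourself from the Hive shell. The table names and the /tmp/external_demo path are made up for illustration, and the exact behaviour of a plain CREATE TABLE can vary with your Hive version and distribution settings, so treat this as an outline rather than a guaranteed result:

```sql
-- Hypothetical table names and path, for illustration only.
CREATE TABLE managed_demo (id INT);

CREATE EXTERNAL TABLE external_demo (id INT)
LOCATION '/tmp/external_demo';

-- In the output, "Table Type" shows MANAGED_TABLE or EXTERNAL_TABLE,
-- and "Location" shows the HDFS directory that holds the data.
DESCRIBE FORMATTED managed_demo;
DESCRIBE FORMATTED external_demo;

-- Drop both tables; both disappear from SHOW TABLES.
DROP TABLE managed_demo;
DROP TABLE external_demo;

-- List the external table's directory from within the Hive shell.
-- It should still be there; the managed table's Location (noted above)
-- should be gone.
dfs -ls /tmp/external_demo;
```

You can run the same listing for the managed table using whatever Location the DESCRIBE FORMATTED output reported before the drop.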

To confirm this, find out where the backend (metastore) database is stored. If it is MySQL, simply log in and query the TBLS table (a MySQL table, not a Hive table) to check whether an entry for your table still exists:

SELECT * FROM TBLS WHERE TBL_NAME = '{your_table_name}';
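If you also want to see the table type and HDFS location side by side, something like the following join may help. This is a sketch that assumes the standard metastore schema (TBLS, DBS and SDS tables with these column names); external_demo is a hypothetical table name, and your metastore version may differ:

```sql
-- Run in the MySQL metastore database, not in Hive.
-- 'external_demo' is a placeholder; Hive stores table names in lower case.
SELECT d.NAME     AS db_name,
       t.TBL_NAME AS table_name,
       t.TBL_TYPE AS table_type,    -- MANAGED_TABLE or EXTERNAL_TABLE
       s.LOCATION AS hdfs_location
FROM TBLS t
JOIN DBS d ON t.DB_ID = d.DB_ID
JOIN SDS s ON t.SD_ID = s.SD_ID
WHERE t.TBL_NAME = 'external_demo';
```

Before the drop this should return one row per matching table; after the drop it should return no rows for either kind of table, since the metadata is removed in both cases. Only the HDFS files differ.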

Hope the above helps.
