Managed & External table
Labels: Apache Hive
Contributor
Created on ‎09-12-2017 03:34 AM - edited ‎09-16-2022 05:13 AM
Hi, I am not able to understand the difference between managed and external tables.
I know the difference shows up when dropping the table. I don't understand what it means that both the data and the metadata are deleted for internal (managed) tables, while only the metadata is deleted for external tables. Can anyone please explain how to check this in the backend?
1 ACCEPTED SOLUTION
Super Guru
Created ‎09-17-2017 11:52 PM
A Hive table consists of the following:
1. Metadata (the table and column definitions and the HDFS location), kept in Hive's backend database
2. The actual data files stored in HDFS
If you drop a managed table, both 1 and 2 are deleted.
However, if you drop an external table, only 1 is deleted: the table reference is removed from Hive's backend database, so SHOW TABLES no longer returns the table and you can no longer query it. The underlying HDFS files remain untouched at their HDFS path.
To confirm this, check where the backend database is stored. If it is MySQL, simply log in and query the TBLS table (a MySQL table, not a Hive table) to see whether your table is still registered:
SELECT * FROM TBLS WHERE TBL_NAME = "{your_table_name}";
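If it helps to see the drop behaviour in isolation, here is a minimal toy sketch in Python. It is not Hive code: an in-memory SQLite table stands in for the backend database and a temporary directory stands in for HDFS. The function names, the table names `sales_managed` and `sales_external`, and the `LOCATION` column on `TBLS` are assumptions made for illustration (as far as I know, the real metastore keeps the location in a separate table).

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def create_table(db, warehouse, name, tbl_type):
    """Register a table in the toy metastore and write one data file for it."""
    location = Path(warehouse) / name
    location.mkdir(parents=True)
    (location / "part-00000").write_text("1,alice\n2,bob\n")
    db.execute(
        "INSERT INTO TBLS (TBL_NAME, TBL_TYPE, LOCATION) VALUES (?, ?, ?)",
        (name, tbl_type, str(location)),
    )
    return location


def drop_table(db, name):
    """Drop a table: always remove metadata, remove data only if managed."""
    row = db.execute(
        "SELECT TBL_TYPE, LOCATION FROM TBLS WHERE TBL_NAME = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no such table: {name}")
    tbl_type, location = row
    # Item 1 (metadata) is deleted for both kinds of table.
    db.execute("DELETE FROM TBLS WHERE TBL_NAME = ?", (name,))
    # Item 2 (data files) is deleted only for managed tables.
    if tbl_type == "MANAGED_TABLE":
        shutil.rmtree(location)


def demo():
    """Drop one managed and one external table, then report what is left."""
    db = sqlite3.connect(":memory:")
    # Simplified stand-in for the metastore's TBLS table (assumed layout).
    db.execute(
        "CREATE TABLE TBLS (TBL_NAME TEXT PRIMARY KEY, TBL_TYPE TEXT, LOCATION TEXT)"
    )
    with tempfile.TemporaryDirectory() as warehouse:
        managed = create_table(db, warehouse, "sales_managed", "MANAGED_TABLE")
        external = create_table(db, warehouse, "sales_external", "EXTERNAL_TABLE")
        drop_table(db, "sales_managed")
        drop_table(db, "sales_external")
        remaining_rows = db.execute("SELECT COUNT(*) FROM TBLS").fetchone()[0]
        return remaining_rows, managed.exists(), external.exists()


if __name__ == "__main__":
    print(demo())
```

After both drops, neither table has a metadata row, the managed table's directory is gone, and the external table's directory is still there. On a real cluster the equivalent check would be the TBLS query above for the metadata, plus listing the table's HDFS location (for example with `hdfs dfs -ls`) for the data.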
Hope this helps.
