Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

hive managed & external path Question!

avatar
Contributor

I created a managed table called "questiuon".

However, a path called "/warehouse/tablespace/external/hive/question.db" was created on the hdfs path.

Conversely, creating an external table also creates a path called "/warehouse/tablespace/managed/hive/question.db".

What is the reason?

1 ACCEPTED SOLUTION

avatar
Master Collaborator

@novice_tester when you create a database, it takes the external warehouse directory path from hive.metastore.warehouse.external.dir, and sets that as the database LOCATION for external tables only(when created without setting location clause). Please note, there is another MANAGEDLOCATION field that is blank.  

e.g.

describe database extended question;

+-----------+----------+----------------------------------------------------+------------------+-------------+-------------+-------------+
|  db_name  | comment  |                      location                      | managedlocation  | owner_name  | owner_type  | parameters  |
+-----------+----------+----------------------------------------------------+------------------+-------------+-------------+-------------+
| question  |          | hdfs://node2.cloudera.com:8020/warehouse/tablespace/external/hive/question.db |                  | hive        | USER        |             |
+-----------+----------+----------------------------------------------------+------------------+-------------+-------------+-------------+

 A Managedlocation can be set on a database if you want to use a different location for managed tables other than hive.metastore.warehouse.dir. Otherwise it is kept blank. I hope this helps.

 

To summarize, you do not need to worry about the LOCATION of a database. It can host both managed and external tables. When you create a managed table under the same DB, it will pick the path set in hive.metastore.warehouse.dir, irrespective of the database Location, and when you create an external table without the location clause, it gets stored in the DB LOCATION path.

View solution in original post

4 REPLIES 4

avatar
Master Collaborator

@novice_tester Could you please make it a bit clearer for us? What are the DDLs(create table command) you used to create the managed table, and the external table? 

 

Creating a managed table in any location outside of 'hive.metastore.warehouse.dir' path should prompt the following error:

A managed table's location should be located within managed warehouse root directory or within its database's managedLocationUri.

/warehouse/tablespace/managed/hive/ seems like the warehouse directory for the external tables. So, I doubt creating the managed table picked this location on its own.

 

Could you also share the outputs of the following commands from beeline:

beeline> set hive.metastore.warehouse.dir;
beeline> hive.metastore.warehouse.external.dir;

An external table can be created with the LOCATION clause, and we can set any path w/ it.

Refer to this Cloudera Doc.

avatar
Contributor

DDL Statement:
CREATE DATABASE QUESTION;

There are no special set options or settings that have been touched.
Totally default!

avatar
Master Collaborator

@novice_tester when you create a database, it takes the external warehouse directory path from hive.metastore.warehouse.external.dir, and sets that as the database LOCATION for external tables only(when created without setting location clause). Please note, there is another MANAGEDLOCATION field that is blank.  

e.g.

describe database extended question;

+-----------+----------+----------------------------------------------------+------------------+-------------+-------------+-------------+
|  db_name  | comment  |                      location                      | managedlocation  | owner_name  | owner_type  | parameters  |
+-----------+----------+----------------------------------------------------+------------------+-------------+-------------+-------------+
| question  |          | hdfs://node2.cloudera.com:8020/warehouse/tablespace/external/hive/question.db |                  | hive        | USER        |             |
+-----------+----------+----------------------------------------------------+------------------+-------------+-------------+-------------+

 A Managedlocation can be set on a database if you want to use a different location for managed tables other than hive.metastore.warehouse.dir. Otherwise it is kept blank. I hope this helps.

 

To summarize, you do not need to worry about the LOCATION of a database. It can host both managed and external tables. When you create a managed table under the same DB, it will pick the path set in hive.metastore.warehouse.dir, irrespective of the database Location, and when you create an external table without the location clause, it gets stored in the DB LOCATION path.

avatar
Community Manager

@novice_tester Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.


Regards,

Diana Torres,
Community Moderator


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community: