Loading data into Hive using Spark: how does Spark identify the database name?

I went through the tutorial linked below to understand how to load data into Hive in ORC format using Spark. I understand how to create a Hive table from Spark. However, I have one question: how does Spark decide which database the table is created in? And if the same table name exists in two different Hive databases, which table does Spark insert values into?

https://hortonworks.com/tutorial/using-hive-with-orc-from-apache-spark/

1 ACCEPTED SOLUTION

In Hive, if you do not specify a database name in your query, the table name is resolved against the default database, unless you have switched databases with a USE statement. The default database is literally named 'default'.

So take the query from the tutorial you shared:

hiveContext.sql("create table yahoo_orc_table (date STRING, open_price FLOAT, high_price FLOAT, low_price FLOAT, close_price FLOAT, volume INT, adj_price FLOAT) stored as orc")

This creates yahoo_orc_table in the default database.

If you want to create it in a specific database, say 'hardikdatabase', then you must write the name as databasename.tablename, as shown below (hardikdatabase.yahoo_orc_table):

hiveContext.sql("create table hardikdatabase.yahoo_orc_table (date STRING, open_price FLOAT, high_price FLOAT, low_price FLOAT, close_price FLOAT, volume INT, adj_price FLOAT) stored as orc")

The same rule applies when you read data from Hive: you must qualify the table name with its database unless the table is in the default database.
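
To make the naming rule concrete, here is a minimal sketch in PySpark-style Python (the sql calls look the same from the Scala shell). It assumes hive_context is an existing HiveContext; the helper names (qualified_name, create_orc_table, read_table, use_database) are hypothetical and only for illustration, not part of Spark or the tutorial. The last helper shows the alternative of issuing a USE statement, after which unqualified names should resolve against that database for the rest of the session.

```python
# Sketch only: `hive_context` is assumed to be an existing HiveContext.
# Any object with a .sql(str) method works, so this runs without a cluster.
# All helper names below are hypothetical.

COLUMNS = (
    "date STRING, open_price FLOAT, high_price FLOAT, low_price FLOAT, "
    "close_price FLOAT, volume INT, adj_price FLOAT"
)


def qualified_name(table, database=None):
    """Return 'database.table', or the bare table name if no database is given."""
    return "{}.{}".format(database, table) if database else table


def create_orc_table(hive_context, table, database=None):
    """Create the ORC table; without a database the bare name is used."""
    statement = "create table {} ({}) stored as orc".format(
        qualified_name(table, database), COLUMNS
    )
    hive_context.sql(statement)
    return statement


def read_table(hive_context, table, database=None):
    """Select all rows, qualifying the table name when a database is given."""
    return hive_context.sql(
        "select * from {}".format(qualified_name(table, database))
    )


def use_database(hive_context, database):
    """Alternative: switch the session's current database with USE."""
    hive_context.sql("use {}".format(database))
```

With two tables of the same name in different databases, read_table(hive_context, "yahoo_orc_table", "hardikdatabase") targets only the copy in hardikdatabase, while omitting the database argument leaves the name unqualified, so it resolves against the current database.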

As always, if this answer helps you, please consider accepting it.

2 REPLIES

Thanks, it helped a lot to clear up my confusion.