Why does CREATE DATABASE IF NOT EXISTS throw an exception in a Spark application?

I am using Spark with Hive. My requirement is to create a Hive database if it does not already exist, and I do this in code. However, the Spark application throws ERROR RetryingHMSHandler:159 - AlreadyExistsException(message:Database abc already exists), which is unexpected.

Hive should not throw AlreadyExistsException when IF NOT EXISTS is specified.

My CREATE DATABASE statement:

CREATE DATABASE IF NOT EXISTS abc

@RAUI What version of HDP and Spark are you using? I tested the same statement on HDP 2.6.4 in Zeppelin and it works fine with Spark 2. I ran the following code more than once and it always finished without errors:

spark.sql("show databases").show 

spark.sql("CREATE DATABASE IF NOT EXISTS abc LOCATION '/user/zeppelin/abc.db'")

+------------+ 
|databaseName|
+------------+ 
|         abc|
|     default|
+------------+ 
res27: org.apache.spark.sql.DataFrame = []

Please provide the full error stack and the Spark/HDP versions you are using.

Note: If you comment on this post, please tag me using @ followed by my name.

@Felix Albani, yes, the first run is fine. Can you please try running the same command again? My Spark version is 2.3.0, running on my local Windows machine.

@RAUI I did run it more than once, and I edited my previous comment to say so. There are no errors even after several executions of the same code. I'm using Spark 2.2.0 on HDP 2.6.4. Could you provide the full error stack? Also, did you use a specific location for the database? Are you running with master yarn or local?

stack-trace.txt

@Felix Albani, I've configured a development environment on my local Windows machine. I am not specifying a location for the database, and I am running the code in local mode. Please find the exception stack trace attached.

@Felix Albani, Any updates on this?

@RAUI I haven't been able to replicate this problem. My suggestion is to find out which databases already exist and at which locations.

show databases;
describe database extended abc;

Since you are not specifying the location in your CREATE DATABASE statement, the default location may be different, hence the error. Or perhaps there is a permission issue at that location. You can try specifying the location and checking the permissions.
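The same checks can also be run from the Spark session itself. This is a minimal sketch, assuming an existing SparkSession named spark with Hive support (as in spark-shell or Zeppelin). The database name abc is taken from the question, and reading spark.sql.warehouse.dir is an extra step added here to show the default location Spark would use.

```scala
// Assumes `spark` is an existing SparkSession with Hive support enabled.

// List the databases the session can see.
spark.sql("SHOW DATABASES").show(false)

// Show the description, location and properties recorded for the database.
spark.sql("DESCRIBE DATABASE EXTENDED abc").show(false)

// Default warehouse directory used when no LOCATION is given (Spark 2.x setting).
println(spark.conf.get("spark.sql.warehouse.dir"))
```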

HTH

@Felix Albani, I still don't see your point. It should not throw an exception when IF NOT EXISTS is specified. As I understand it, IF NOT EXISTS means the statement executes silently, without throwing any exception, when the database already exists. That is exactly why we use IF NOT EXISTS.

My purpose here is to create the database if it does not exist, and otherwise leave it alone.
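The message quoted in the question has the shape of a log4j ERROR entry from RetryingHMSHandler, so it may be a line written to the log by the metastore handler rather than an exception that reaches the application. That is an assumption, not something confirmed in this thread. A minimal sketch to tell the two cases apart, assuming an existing SparkSession named spark with Hive support:

```scala
import scala.util.{Failure, Success, Try}

// Assumes `spark` is an existing SparkSession with Hive support enabled.
// The database name abc comes from the question.
val result = Try(spark.sql("CREATE DATABASE IF NOT EXISTS abc"))

result match {
  case Success(_) =>
    // Nothing propagated to the application; any ERROR line was only logged.
    println("Statement completed without throwing")
  case Failure(e) =>
    // A real failure: print the exception class and message for the report.
    println(s"Statement failed: ${e.getClass.getName}: ${e.getMessage}")
}

// Confirm whether the database is present either way (Catalog API, Spark 2.1+).
println(s"abc exists: ${spark.catalog.databaseExists("abc")}")
```

If the Success branch is taken on every run, the statement is behaving as IF NOT EXISTS promises and the ERROR line is log output only.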

I sporadically see these errors too. I think they only started happening after I upgraded from Spark 2.2 to Spark 2.4. HTH someone figure out the actual problem...
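If the message turns out to be log output only (the statement returns normally and the database is intact), one option is to raise the log threshold for the metastore handler. The sketch below assumes the log4j 1.x API that ships with Spark 2.x and an existing SparkSession named spark. The logger name is the fully qualified class name of RetryingHMSHandler. Note that this hides every ERROR-level message from that class, not only this one.

```scala
import org.apache.log4j.{Level, Logger}

// Assumes log4j 1.x on the classpath (as bundled with Spark 2.x).
// Only FATAL messages from the metastore retry handler will be logged after this.
Logger
  .getLogger("org.apache.hadoop.hive.metastore.RetryingHMSHandler")
  .setLevel(Level.FATAL)

// Assumes `spark` is an existing SparkSession with Hive support enabled.
spark.sql("CREATE DATABASE IF NOT EXISTS abc")
```

The same setting can be placed in a log4j.properties file instead of code; the log4j.properties.template shipped with Spark sets this logger to FATAL.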
