Created 05-16-2018 06:59 PM
Hi,
After installing HDP 2.6.3, I ran PySpark in the terminal, started a Spark session, and tried to create a new database (see the last line of code):
$ pyspark
> from pyspark.sql import SparkSession
> spark = SparkSession.builder.master("local").appName("test").enableHiveSupport().getOrCreate()
> spark.sql("show databases").show()
> spark.sql("create database if not exists NEW_DB")
However, PySpark threw an error showing that it was trying to create the database on the local filesystem:
AnalysisException: 'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Unable to create database path file:/home/jdoe/spark-warehouse/new_db.db, failed to create database new_db);'
I wasn't trying to create a database locally. I was trying to create a database within Hive. Is there a configuration problem with HDP 2.6.3?
Please advise. Thanks.
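For anyone reading the error above: the `file:` prefix on the path is what indicates that the session resolved its warehouse directory to the local filesystem. A minimal sketch of that check, using only the Python standard library; the helper name `warehouse_filesystem` and the HDFS path are hypothetical and are only there for contrast:

```python
from urllib.parse import urlparse


def warehouse_filesystem(path):
    """Return the URI scheme of a warehouse path, or 'unqualified' if it has none."""
    scheme = urlparse(path).scheme
    return scheme if scheme else "unqualified"


# Path copied from the AnalysisException above.
local_path = "file:/home/jdoe/spark-warehouse/new_db.db"

# Hypothetical HDFS location, for contrast only (host and directory are made up).
hdfs_path = "hdfs://namenode.example.com:8020/apps/hive/warehouse/new_db.db"

print(warehouse_filesystem(local_path))
print(warehouse_filesystem(hdfs_path))
```

A scheme of `file` means the database directory would be created on the machine running the driver, not in the cluster's storage.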
Created 05-16-2018 07:22 PM
@John Doe Could you try running in YARN client mode instead of local mode? I think this will help resolve the problem you have now.
$ pyspark --master yarn
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("test").enableHiveSupport().getOrCreate()
spark.sql("show databases").show()
spark.sql("create database if not exists NEW_DB")
Note: If you comment on this post, make sure you tag my name. And if you found that this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.
HTH
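To compare the two launch modes, it may help to print the settings that decide where `create database` writes. The following is a rough sketch, not a definitive diagnostic: `describe_session` is a hypothetical helper name, and `"<not set>"` is a placeholder default introduced here. `spark.sql.warehouse.dir` and `spark.sql.catalogImplementation` are standard Spark configuration keys.

```python
def describe_session(spark):
    """Collect the settings that influence where `create database` writes.

    `spark` is expected to be a SparkSession, or any object exposing the same
    `sparkContext.master` attribute and `conf.get(key, default)` method.
    """
    return {
        # "local" vs. "yarn", depending on how the shell was launched.
        "master": spark.sparkContext.master,
        # Directory under which new database folders are created.
        "warehouse_dir": spark.conf.get("spark.sql.warehouse.dir", "<not set>"),
        # "hive" when Hive support is enabled, otherwise "in-memory".
        "catalog": spark.conf.get("spark.sql.catalogImplementation", "<not set>"),
    }
```

In the pyspark shell, after creating the session, `print(describe_session(spark))` shows the values for the current run. If `warehouse_dir` starts with `file:`, the session is still pointing at the local filesystem.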
Created 05-18-2018 04:26 PM
@Felix Albani
I would be glad to mark the answer as helpful, but I don't know how to do that.
Created on 05-18-2018 04:38 PM - edited 08-18-2019 12:03 AM