Hi all
I'm developing an application in IntelliJ and trying to get a SparkSession that runs against our CDH cluster.
The code I'm using to create the SparkSession:
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder()
  .master("yarn")
  .config("deploy-mode", "cluster")
  .appName("someName")
  // Kerberos credentials for the YARN application
  .config("spark.yarn.keytab", "c:\\path\\to\\my.keytab")
  .config("spark.yarn.principal", "myprincipal@DOMAIN.COM")
  .config("spark.sql.catalogImplementation", "hive")
  .enableHiveSupport()
  .getOrCreate()
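As a sanity check, I print the config locations at startup before building the session (nothing Spark-specific here, just the standard environment variables):

println(s"HADOOP_CONF_DIR = ${sys.env.getOrElse("HADOOP_CONF_DIR", "<not set>")}")
println(s"YARN_CONF_DIR = ${sys.env.getOrElse("YARN_CONF_DIR", "<not set>")}")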
HADOOP_CONF_DIR and YARN_CONF_DIR are set. The application starts and is uploaded to the YARN cluster (so Kerberos authentication seems to work), but then it fails with the following error:
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.conf.Configuration cannot be cast to org.apache.hadoop.yarn.conf.YarnConfiguration
I'm not sure which Configuration instance this refers to, or why I end up with an org.apache.hadoop.conf.Configuration instead of an org.apache.hadoop.yarn.conf.YarnConfiguration.
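In case the classpath matters here, my build.sbt looks roughly like the sketch below; the version number is a placeholder and would need to match the CDH cluster. I added spark-yarn explicitly, since master("yarn") needs the YARN client classes on the driver classpath:

// build.sbt sketch; the version is a placeholder, not verified against the cluster
val sparkVersion = "2.4.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"  % sparkVersion,
  "org.apache.spark" %% "spark-hive" % sparkVersion,
  // required on the client side for master("yarn")
  "org.apache.spark" %% "spark-yarn" % sparkVersion
)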
I know I should probably switch to spark-submit at some point, but I would really like to get this running from IntelliJ during development.
Thanks