Unable to execute query from beeline using Hive on Spark: return code 30041, Failed to create Spark client

I just built a new cluster using Cloudera Manager V7.0.3 and I am testing it.  All of the services show "green" in CM, and I have run some tests on HDFS and YARN (both MR and Spark) with no problems.  I am now trying to test Hive.  I connected to Hive using the beeline prompt, and I was able to successfully create a table, but when I try to insert a row, it fails with this error:

Error while processing statement: FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session b0cb962c-b8d5-4254-9ff8-6c3faedb9d21 (state=42000,code=30041)

However, Spark itself seems to work OK. I ran a simple test using the instructions at https://docs.cloudera.com/documentation/enterprise/5-6-x/topics/spark_first.html and that worked fine. The job ran and I was able to see it in the YARN RM.

I checked the HiveServer2 log and I found these messages there:

20/04/28 23:55:23 ERROR operation.Operation: [HiveServer2-Background-Pool: Thread-110]: Error running hive query: 
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session fa9b5cc1-f7c2-4529-9c3c-3e923820901a
.
.
.

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session fa9b5cc1-f7c2-4529-9c3c-3e923820901a
.
.
.
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
.
.
.
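
For anyone digging through a longer HiveServer2 log, here is a minimal Python sketch that pulls out the "Caused by:" chain so the innermost cause is easy to spot. The function name `caused_by_chain` and the sample text are only illustrative; the sample reuses the two "Caused by:" lines quoted above and is not a complete stack trace.

```python
import re

def caused_by_chain(log_text):
    """Return (exception class, message) for every 'Caused by:' line, in order."""
    # Matches lines such as "Caused by: java.lang.Foo: some message".
    pattern = re.compile(r"^Caused by:\s*([\w.$]+)(?::\s*(.*))?$")
    chain = []
    for line in log_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            chain.append((match.group(1), match.group(2) or ""))
    return chain

# Sample built from the excerpt above; elided lines are left as "."
sample = """\
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session fa9b5cc1-f7c2-4529-9c3c-3e923820901a
.
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
"""

chain = caused_by_chain(sample)
# The last entry is the innermost cause in the trace.
root_class, root_message = chain[-1]
print(root_class, root_message)
```

In this excerpt the last entry is the NoClassDefFoundError for org/apache/spark/SparkConf, which is why the classpath looks like the place to start.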

So that made me think that the CLASSPATH that beeline is using is somehow wrong, but I don't know how to fix that. The Spark classes should be on that machine because I ran my YARN Spark test from there.
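
One way to check whether the class is present in a given directory of jars is a short Python sketch like the one below. It is only a rough check under assumptions: `jars_containing` is a made-up helper name, the directory to search is something you would have to supply yourself, and finding the jar on disk does not prove that it is on the classpath of the process that logged the error (the message above came from the HiveServer2 log, so that process's classpath is probably the relevant one).

```python
import zipfile
from pathlib import Path

def jars_containing(class_name, search_dir):
    """Return the jar files under search_dir that contain the given class."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(search_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable files and broken symlinks.
            continue
    return hits
```

For example, `jars_containing("org.apache.spark.SparkConf", some_dir)` would be run once against the directory where the Spark jars are expected and once against the directory Hive loads its jars from, where `some_dir` is a placeholder for a path on your own host. An empty list for the second directory would be consistent with the NoClassDefFoundError.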

I also installed Hive on Tez on this cluster, so I tried to use that, and that DID work.  I connected to Hive on Tez from the beeline prompt, and I was able to insert rows into the table (the one that I created earlier) and also query it.

So Hive itself seems OK; it is only Hive on Spark that is the problem.  Can anyone help?  I have configured everything through CM, so I would prefer to fix it there, but if the only option is to "hack" a configuration file somewhere, I would try that...