Hive-on-Spark: Failed to create spark client

New Contributor

On Cloudera CDH 5.16.1 with Spark 2, Hive-on-Spark queries containing WHERE conditions fail with the following error messages in the HiveServer2 log files:

Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session <id>
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session <id>
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error "Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream"


There is no problem with the same query run from Impala, or with a Hive-on-Spark query that has no WHERE conditions.


Expert Contributor


You will need an HDFS gateway role and a Spark gateway role on the node where you are triggering the job.


The class named in the error, "Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream", is a Hadoop filesystem (HDFS client) class. That leads me to believe that you do not have a gateway role on the node from which you are running the command.
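One way to test that theory on the affected node is to check whether any jar on the Hadoop classpath actually contains the missing class. The following is only a rough sketch: the helper names `expand_classpath` and `find_class` are made up for illustration, and it assumes you pass in a Java-style classpath string such as the output of `hadoop classpath`. The classpath that the failing spark-submit process really sees may differ from that output.

```python
import glob
import os
import zipfile

# Path of the class from the NoClassDefFoundError, as stored inside a jar.
CLASS_ENTRY = "org/apache/hadoop/fs/FSDataInputStream.class"


def expand_classpath(classpath):
    """Split a Java-style classpath string into concrete jars and directories."""
    paths = []
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue
        if entry.endswith("*"):
            # Java treats "dir/*" as "every .jar file in dir".
            paths.extend(sorted(glob.glob(entry + ".jar")))
        else:
            paths.append(entry)
    return paths


def find_class(classpath, class_entry=CLASS_ENTRY):
    """Return the classpath entries (jars or directories) that contain class_entry."""
    hits = []
    for path in expand_classpath(classpath):
        if os.path.isdir(path):
            # A directory entry holds unpacked .class files.
            if os.path.isfile(os.path.join(path, *class_entry.split("/"))):
                hits.append(path)
        elif zipfile.is_zipfile(path):
            # A jar is a zip archive; look for the class file among its entries.
            with zipfile.ZipFile(path) as jar:
                if class_entry in jar.namelist():
                    hits.append(path)
    return hits
```

Calling `find_class(...)` with the string printed by `hadoop classpath` on the node should return at least one jar. An empty list would be consistent with the Hadoop client jars (and so the gateway configuration) being missing there.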


Has any solution been found for this?


I am having the same issue, and both HiveServer and the Spark Gateway are present on the node where I am executing the query.