Created on 01-22-2021 04:05 AM - edited 09-16-2022 07:40 AM
On Cloudera CDH 5.16.1 with Spark2, Hive-on-Spark queries containing WHERE conditions fail with the following error messages in the HiveServer2 log files:
Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session <id>
.
..
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session <id>
..
..
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error "Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream"
There is no problem with the same query when it is run from Impala, or with a Hive-on-Spark query without WHERE conditions.
Created 01-25-2021 07:29 AM
You will need an HDFS gateway role and a Spark gateway role on the node where you are triggering the job.
The class named in the error, "Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream", is a Hadoop filesystem (HDFS client) class, which leads me to believe that you do not have a gateway role on the node from which you are running the command.
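One way to test that theory on the affected node is to check whether any jar on the Hadoop classpath actually provides the missing class. The following is only a rough sketch, not an official Cloudera tool: the helper names (`expand_classpath`, `find_class`, `hadoop_classpath`) are made up for illustration, and it assumes Python 3.7+ and that the `hadoop` command is on the PATH of the user running it.

```python
import glob
import os
import subprocess
import zipfile

# The class from the NoClassDefFoundError, as it would appear inside a jar.
CLASS_ENTRY = "org/apache/hadoop/fs/FSDataInputStream.class"


def expand_classpath(classpath):
    """Split a Java-style classpath string and expand "dir/*" wildcards into jars."""
    entries = []
    for item in classpath.split(os.pathsep):
        item = item.strip()
        if not item:
            continue
        if item.endswith("*"):
            # In Java, "dir/*" means every jar file in that directory.
            entries.extend(sorted(glob.glob(item + ".jar")))
        else:
            entries.append(item)
    return entries


def find_class(classpath, class_entry=CLASS_ENTRY):
    """Return the classpath entries (jars or directories) that contain class_entry."""
    hits = []
    for entry in expand_classpath(classpath):
        if os.path.isdir(entry):
            if os.path.isfile(os.path.join(entry, *class_entry.split("/"))):
                hits.append(entry)
        elif zipfile.is_zipfile(entry):  # False for missing or non-zip files
            with zipfile.ZipFile(entry) as jar:
                if class_entry in jar.namelist():
                    hits.append(entry)
    return hits


def hadoop_classpath():
    """Assumption: the `hadoop` CLI is installed; returns the output of `hadoop classpath`."""
    result = subprocess.run(
        ["hadoop", "classpath"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `find_class(hadoop_classpath())` on the node that launches the query should list at least one jar (normally a hadoop-common jar). An empty list, or a failure to run `hadoop classpath` at all, would be consistent with missing gateway client configuration. If the class is found, the Hadoop jars exist on the node and the next thing to look at is whether the spark-submit process started by HiveServer2 actually receives that classpath, for example through `SPARK_DIST_CLASSPATH` in the Spark client configuration.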
Created 03-09-2023 09:45 AM
Was any solution found for this?
I am having the same issue, even though both HiveServer and the Spark Gateway are present on the node where I am executing the query.