On Cloudera CDH 5.16.1 with Spark2, Hive-on-Spark queries containing WHERE conditions fail with the following error messages in the HiveServer2 log files:
Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session <id>
...
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session <id>
...
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error "Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream"
There is no problem with the same query run from Impala, or with a Hive-on-Spark query without WHERE conditions.