Below are the version details:
CDP: 7.1.8
CM: 7.8.1
Hive: 3.1.3
Spark 2 Version: 2.4.8
Spark 3 Version: 3.3.0
ISSUE DESCRIPTION :
We have a table with more than 3 million rows. With Spark as the Hive execution engine, we are not able to execute queries that contain a WHERE condition or a COUNT.
When we set the Hive execution engine to Spark (set hive.execution.engine=spark), we get the error mentioned below:
QUERY FAILED : SELECT * FROM test_.JOBS__PROJECT WHERE state = 'DONE' LIMIT 10;
ERROR : FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session c-47f2-aceb-22390502b303
Error: Error while compiling statement: FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session d6d96da5-f2bc-47f2-aceb-22390502b303 (state=42000,code=30041)
We are able to execute the same query with the execution engine set to tez, and we are also able to execute it from spark-shell.
Also note that we can successfully execute a non-conditional query with the Spark execution engine:
SUCCESS QUERY : SELECT * FROM test_.JOBS__PROJECT LIMIT 10;
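For reference, the whole sequence can be reproduced roughly as follows. This is a minimal sketch that assumes a Beeline session connected to HiveServer2 (the client is an assumption). The SELECT statements and the engine setting are the ones quoted above; the COUNT statement is only illustrative, since the exact COUNT query is not quoted in this post.

```sql
-- Reproduction sketch; assumes a Beeline session connected to HiveServer2.

-- Switch the session to the Spark execution engine.
set hive.execution.engine=spark;

-- Non-conditional query: succeeds.
SELECT * FROM test_.JOBS__PROJECT LIMIT 10;

-- Conditional query: fails with return code 30041
-- (Failed to create Spark client for Spark session ...).
SELECT * FROM test_.JOBS__PROJECT WHERE state = 'DONE' LIMIT 10;

-- Illustrative COUNT query (the exact statement is an assumption);
-- COUNT queries are also reported as failing on the Spark engine.
SELECT COUNT(*) FROM test_.JOBS__PROJECT;

-- Same conditional query with Tez as the engine: succeeds.
set hive.execution.engine=tez;
SELECT * FROM test_.JOBS__PROJECT WHERE state = 'DONE' LIMIT 10;
```

Running the statements in this order within a single session makes it easy to confirm that only the engine setting changes between the failing and the succeeding run of the conditional query.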