
Hive on Spark CDH 5.7 - Failed to create spark client

Explorer

Hi All,

 

We are getting the following error while executing Hive queries with the Spark engine:

 

Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

 

The following properties are set to use Spark as the execution engine instead of MapReduce:

set hive.execution.engine=spark;

set spark.executor.memory=2g;

 

I also tried changing the following properties:

set yarn.scheduler.maximum-allocation-mb=2048;

set yarn.nodemanager.resource.memory-mb=2048;

set spark.executor.cores=4;

set spark.executor.memory=4g;

set spark.yarn.executor.memoryOverhead=750;

set hive.spark.client.server.connect.timeout=900000ms;
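
For context, this is roughly how the properties are applied in the HQL script, with the session-level set statements placed ahead of the query (the table and query below are just placeholders, not the actual workload):

-- sketch of the .hql file: session-level settings go before the query they should affect
set hive.execution.engine=spark;
set spark.executor.memory=2g;

-- placeholder query on a hypothetical table
select dept, count(*) from employees group by dept;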

 

1 ACCEPTED SOLUTION

Explorer

The error was caused by a configuration issue. We needed to either lower the executor memory (spark.executor.memory) and executor memory overhead (spark.yarn.executor.memoryOverhead), or increase the maximum memory allocation (yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb), so that the executor container requested by Hive on Spark fits within what YARN is allowed to allocate (see the sizing sketch after the settings below).

 

You can refer to this link for tuning guidance:

http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/

 

We tried several combinations, and the following properties gave the best result on our cluster:

set hive.execution.engine=spark;
set spark.executor.memory=4g;
set yarn.nodemanager.resource.memory-mb=12288;
set yarn.scheduler.maximum-allocation-mb=2048;
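
As a rough sizing sketch (the values below are illustrative, not the ones we ended up using): each executor container asks YARN for spark.executor.memory plus spark.yarn.executor.memoryOverhead, so that sum has to fit under both yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb.

-- illustrative sizing: 4096 MB executor + 750 MB overhead = 4846 MB per container
set spark.executor.memory=4g;
set spark.yarn.executor.memoryOverhead=750;
-- both YARN limits must be at least ~4846 MB for the container request to be granted
set yarn.scheduler.maximum-allocation-mb=6144;
set yarn.nodemanager.resource.memory-mb=6144;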


4 REPLIES

Super Guru
You will need to check both the HiveServer2 (HS2) log and the Spark application log to get the real error message.

"Failed to create spark client" is too generic and it can be anything.

Explorer
May I ask: do you set these parameters just in your Beeline SQL script? Do you need to change the configuration XML for HS2?

Explorer

Yes, just in the HQL file. Nothing needs to change in the XML configuration files.
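
If it helps, here is a minimal way to confirm in the same session that a setting took effect (a sketch; the exact echo format can vary by Hive version). Calling set with just the property name prints its current value:

set hive.execution.engine=spark;
-- with no value, set echoes the current setting, e.g. hive.execution.engine=spark
set hive.execution.engine;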