Created on 10-23-2017 05:19 AM - edited 09-16-2022 05:26 AM
Hi All,
We are getting the following error while executing Hive queries with Spark as the execution engine:
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
The following properties are set to use Spark as the execution engine instead of MapReduce:
set hive.execution.engine=spark;
set spark.executor.memory=2g;
I have also tried changing the following properties:
set yarn.scheduler.maximum-allocation-mb=2048;
set yarn.nodemanager.resource.memory-mb=2048;
set spark.executor.cores=4;
set spark.executor.memory=4g;
set spark.yarn.executor.memoryOverhead=750;
set hive.spark.client.server.connect.timeout=900000ms;
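For context, here is a minimal sketch (with placeholder names) of how such session-level overrides are typically applied, i.e. at the top of the .hql script ahead of the query:
-- query.hql (placeholder name): per-session overrides, then the actual query
set hive.execution.engine=spark;
set spark.executor.memory=2g;
-- the Spark client (and its YARN application) is only created when the first
-- query that actually needs a Spark job runs, which is where the error appears
select count(*) from sample_db.sample_table;  -- placeholder query
The script can be run with, for example, hive -f query.hql, or the same statements can be pasted into a beeline session.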
Created 11-02-2017 04:55 AM
The error was a configuration issue. We need to either lower the executor memory (spark.executor.memory) and executor memory overhead (spark.yarn.executor.memoryOverhead), or increase the maximum memory allocation (yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb) so that the requested executor container fits within what YARN can allocate.
You can refer to this link for more detail on sizing Spark executors:
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
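As a rough sanity check (a sketch assuming the default overhead of max(384 MB, 10% of the executor memory) when spark.yarn.executor.memoryOverhead is not set explicitly):
spark.executor.memory               = 4g = 4096 MB
spark.yarn.executor.memoryOverhead  ≈ 0.10 * 4096 ≈ 410 MB
requested executor container        ≈ 4096 + 410 = 4506 MB
If yarn.scheduler.maximum-allocation-mb is only 2048, YARN can never grant a container of that size, the Spark application does not come up, and the failure can surface in Hive as "Failed to create spark client."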
We tried several combinations, and the following properties gave the best results on our cluster:
set hive.execution.engine=spark;
set spark.executor.memory=4g;
set yarn.nodemanager.resource.memory-mb=12288;
set yarn.scheduler.maximum-allocation-mb=2048;
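To confirm which values a given Hive session is actually using, running set with just the property name prints the current value, for example:
set hive.execution.engine;
set spark.executor.memory;
set yarn.scheduler.maximum-allocation-mb;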
Created 11-18-2017 09:02 PM
Did you set these properties only in the .hql file, or did you also change them in an XML configuration file?
Created 11-20-2017 03:35 AM
Yes, just in the .hql file; nothing was changed in any XML file.
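For anyone comparing the two approaches: the set statements in the .hql file only override the properties for that script, while making Spark the default engine for every session would instead mean an entry in hive-site.xml along these lines (sketch only; the yarn.* capacity settings are normally maintained in yarn-site.xml on the cluster side rather than in the script):
<!-- hive-site.xml: make Spark the default execution engine for all sessions -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>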