Support Questions

Hive on Spark CDH 5.7 - Failed to create spark client



Contributor

I have enabled Spark as the default execution engine for Hive on CDH 5.7, but I get the following error when I execute a query against Hive from my edge node. Is there anything I need to enable on the client edge node? I can run the spark-shell and have exported SPARK_HOME. I have also copied the client configuration to the edge node. Is there anything else I need to enable or configure?

 

ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:64)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:125)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1774)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1531)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1311)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1113)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:178)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '478049ac-228c-4abb-8ef3-93157822a0a1'. Error: Child process exited before connecting back
at com.google.common.base.Throwables.propagate(Throwables.java:156)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:111)
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:98)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:94)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:63)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
... 22 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '478049ac-228c-4abb-8ef3-93157822a0a1'. Error: Child process exited before connecting back
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:101)
... 27 more
Caused by: java.lang.RuntimeException: Cancel client '478049ac-228c-4abb-8ef3-93157822a0a1'. Error: Child process exited before connecting back
at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179)
at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:450)
... 1 more

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Hive on Spark CDH 5.7 - Failed to create spark client

Contributor

The YARN container memory was smaller than the Spark executor requirement.  I set the YARN container memory and maximum allocation to be greater than the Spark executor memory + overhead.  Check 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
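For reference, those two properties translate to something like the following yarn-site.xml fragment (the 4096 MB value is purely illustrative and should be sized for your cluster; on CDH these are normally set through Cloudera Manager rather than by editing the file directly):

```xml
<!-- yarn-site.xml (illustrative values): both properties must exceed
     Spark executor memory + overhead -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>
```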

5 REPLIES


Re: Hive on Spark CDH 5.7 - Failed to create spark client

New Contributor
What are your values for the executors? And how did you figure out it was a memory issue?

Thanks

Re: Hive on Spark CDH 5.7 - Failed to create spark client

Contributor

When I selected the Spark engine for Hive, the YARN logs contained errors complaining about insufficient memory.  I also noticed that the default Spark executor memory + overhead was larger than the YARN container memory settings.  Increasing the YARN container memory configuration cured the problem; alternatively, you could lower the Spark executor requirements.

Re: Hive on Spark CDH 5.7 - Failed to create spark client

New Contributor

Still not working for me... I have played with multiple parameters, but no success. Also, the YARN logs do not show anything bad about memory. Any ideas?

Re: Hive on Spark CDH 5.7 - Failed to create spark client

Contributor

When you say it is not working, what issue does it exhibit?  For Hive on Spark you only need to set the execution engine within Hive from MapReduce to Spark.  You do need to consider the Spark executor memory settings in the Spark service, and these must correlate with the YARN container memory settings.  Generally I set the following YARN container settings:

 

yarn.nodemanager.resource.memory-mb

yarn.scheduler.maximum-allocation-mb

 

to the same value, greater than the Spark executor memory + overhead.  Also check the YARN logs for an error similar to the following:

 

15/09/17 11:15:09 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2211 MB per container)
Exception in thread "main" java.lang.IllegalArgumentException: Required executor memory (2048+384 MB) is above the max threshold (2211 MB) of this cluster!
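As a sanity check, the arithmetic behind that error can be sketched as follows (a minimal sketch; the max(10%, 384 MB) overhead default matches Spark 1.x on YARN, which ships with CDH 5.7 — verify against your own spark.yarn.executor.memoryOverhead setting):

```python
def required_container_mb(executor_mem_mb, overhead_mb=None):
    """Memory YARN must be able to grant per container for one Spark executor."""
    if overhead_mb is None:
        # Spark 1.x on YARN default overhead: max(10% of executor memory, 384 MB)
        overhead_mb = max(int(executor_mem_mb * 0.10), 384)
    return executor_mem_mb + overhead_mb

# Figures from the log above: 2048 MB executor vs. a 2211 MB container ceiling
needed = required_container_mb(2048)
print(needed)          # 2432 -- matches the "2048+384 MB" in the error
print(needed <= 2211)  # False -- so YARN rejects the request
```

If the check prints False, raise the two YARN properties above (or lower spark.executor.memory) until the requirement fits.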

 

Regards

Shailesh