Created on 04-16-2016 09:55 AM - edited 09-16-2022 03:14 AM
I have enabled Spark as the default execution engine for Hive on CDH 5.7, but I get the following error when I execute a query against Hive from my edge node. Is there anything I need to enable on the client edge node? I can run the spark-shell and have exported SPARK_HOME, and I have also copied the client configuration to the edge node. Is there anything else I need to enable or configure?
ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:64)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:125)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1774)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1531)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1311)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1113)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:178)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '478049ac-228c-4abb-8ef3-93157822a0a1'. Error: Child process exited before connecting back
at com.google.common.base.Throwables.propagate(Throwables.java:156)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:111)
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:98)
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:94)
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:63)
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62)
... 22 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '478049ac-228c-4abb-8ef3-93157822a0a1'. Error: Child process exited before connecting back
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:101)
... 27 more
Caused by: java.lang.RuntimeException: Cancel client '478049ac-228c-4abb-8ef3-93157822a0a1'. Error: Child process exited before connecting back
at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179)
at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:450)
... 1 more
Created 04-17-2016 12:15 PM
The YARN container memory was smaller than the Spark executor requirement. I set the YARN container memory and maximum allocation to be greater than the Spark executor memory plus overhead. Check 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
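For reference, a minimal sketch of those two properties as they would appear in yarn-site.xml (or the equivalent fields in Cloudera Manager); the 4096 MB figure is only an illustration and should be sized above your executor memory plus overhead:

<!-- yarn-site.xml: both values set to the same figure, larger than
     spark.executor.memory + spark.yarn.executor.memoryOverhead -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>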
Created 05-12-2016 01:51 AM
The YARN logs contained errors complaining about memory shortfalls when I selected the Spark engine for Hive, and I noticed that the default Spark executor memory plus overhead was larger than the YARN container memory settings. Increasing the YARN container memory configuration cured the problem; alternatively, you could lower the Spark executor requirements.
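If you lower the Spark side instead, here is a minimal sketch, assuming the spark.* properties are placed in hive-site.xml (or the Hive service safety valve in Cloudera Manager); the values are illustrative and simply need to fit inside the YARN container maximum:

<!-- hive-site.xml: shrink the executor so that
     spark.executor.memory + spark.yarn.executor.memoryOverhead (MB)
     fits under yarn.scheduler.maximum-allocation-mb -->
<property>
  <name>spark.executor.memory</name>
  <value>1536m</value>
</property>
<property>
  <name>spark.yarn.executor.memoryOverhead</name>
  <value>256</value>
</property>

The same spark.* properties can also be set per session from the Hive shell before running the query.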
Created 05-14-2016 05:38 PM
Still not working for me... I have played with multiple parameters, but no success. Also, the YARN logs do not show anything bad about memory. Any ideas?
Created 05-17-2016 02:42 AM
When you say it is not working, what issue does it exhibit? For Hive on Spark you only need to set the execution engine within Hive from MapReduce to Spark. You do need to consider the Spark executor memory settings in the Spark service, and these must correlate with the YARN container memory settings. Generally I set the following YARN container settings:
yarn.nodemanager.resource.memory-mb
yarn.scheduler.maximum-allocation-mb
To be the same value, and greater than the Spark executor memory plus overhead (a configuration sketch follows the log excerpt below). Check also for an error similar to the following in the YARN logs:
15/09/17 11:15:09 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2211 MB per container)
Exception in thread "main" java.lang.IllegalArgumentException: Required executor memory (2048+384 MB) is above the max threshold (2211 MB) of this cluster!
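For completeness, a minimal sketch of the engine switch itself, assuming it is made in hive-site.xml (it can equally be set per session or through Cloudera Manager); the memory arithmetic from the excerpt above is noted in the comment:

<!-- hive-site.xml: switch Hive's execution engine from MapReduce (mr) to Spark -->
<!-- With the defaults in the log above, the required container size is
     2048 MB (executor) + 384 MB (overhead) = 2432 MB, which exceeds the
     2211 MB yarn.scheduler.maximum-allocation-mb cap, hence the failure. -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>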
Regards
Shailesh