Support Questions

Find answers, ask questions, and share your expertise

errors when running spark queries from hive on CDH6

avatar
Explorer

running the following from hive

SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;

INSERT INTO hello_acid partition (load_date='2016-03-01') VALUES (1, 1);

getting this error 

SQL Error [30041] [42000]: Error while processing statement: FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session e73c125d-e40a-49fb-b2e0-28c28ab1ff60_0: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error ?

 

Hive has Spark configured as  dependency 

 

gocham_0-1667563299411.png

Have this settings in  hive-site.xml

 

<property><name>hive.spark.client.connect.timeout</name><value>30000ms</value></property><property><name>hive.spark.client.server.connect.timeout</name><value>300000ms</value></property>

 

Some Other settings

gocham_1-1667563406046.png

 

Java Configuration Options for HiveServer2

 

-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:MaxPermSize=512M -XX:+UseParNewGC -XX:-UseGCOverheadLimit

 

I do not see any applications in the hadoop web UI.

The logs are below

From what I can tell hive does not recognize where spark libraries are. I believe this should have been addressed by setting hive dependency on Spark. 

 What am I missing? 

Thanks

 

 

 

2022-11-04 02:46:21,069 ERROR org.apache.hive.spark.client.SparkClientImpl: [HiveServer2-Background-Pool: Thread-57]: Error while waiting for Remote Spark Driver to connect back to HiveServer2.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error ?
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:41) ~[netty-common-4.1.17.Final.jar:4.1.17.Final]
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:103) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:90) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:104) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:100) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:77) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:131) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:132) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6

...

at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error ?
at org.apache.hive.spark.client.SparkClientImpl$2.run(SparkClientImpl.java:495) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
... 1 more
2022-11-04 02:46:21,088 ERROR org.apache.hadoop.hive.ql.exec.spark.SparkTask: [HiveServer2-Background-Pool: Thread-57]: Failed to execute Spark task "Stage-1"
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session e73c125d-e40a-49fb-b2e0-28c28ab1ff60_0: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error ?
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.getHiveException(SparkSessionImpl.java:286) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:135) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:132) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]

..

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.lang.RuntimeException: Error while waiting for Remote Spark Driver to connect back to HiveServer2.
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:124) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:90) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:104) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:100) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:77) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:131) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
... 22 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error ?
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:41) ~[netty-common-4.1.17.Final.jar:4.1.17.Final]
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:103) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:90) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:104) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]

... 22 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error ?
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:41) ~[netty-common-4.1.17.Final.jar:4.1.17.Final]
at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:103) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:90) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:104) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:100) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:77) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:131) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
... 22 more
Caused by: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error ?
at org.apache.hive.spark.client.SparkClientImpl$2.run(SparkClientImpl.java:495) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
... 1 more

 

 

 

 

1 ACCEPTED SOLUTION

avatar
Explorer

quick update.

ran spark shell and got better errors indicating wrong config settings  

$ spark-shell --master yarn
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/11/04 16:59:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/11/04 16:59:49 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: Required executor memory (1024 MB), offHeap memory (0) MB, overhead (384 MB), and PySpark memory (0 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:368)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:199)

View solution in original post

1 REPLY 1

avatar
Explorer

quick update.

ran spark shell and got better errors indicating wrong config settings  

$ spark-shell --master yarn
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/11/04 16:59:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/11/04 16:59:49 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: Required executor memory (1024 MB), offHeap memory (0) MB, overhead (384 MB), and PySpark memory (0 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:368)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:199)