Support Questions


Can't use Hive on Spark engine: cannot create client, error code 30041

Contributor

insert into abhi values (001, 'myname');

INFO : Compiling command(queryId=hive_20190514173633_81607f73-b383-4c0c-819c-3a2a7c09559d): insert into abhi values (001, 'myname')

INFO : Semantic Analysis Completed (retrial = false)

INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col1, type:int, comment:null), FieldSchema(name:col2, type:string, comment:null)], properties:null)

INFO : Completed compiling command(queryId=hive_20190514173633_81607f73-b383-4c0c-819c-3a2a7c09559d); Time taken: 0.653 seconds

INFO : Executing command(queryId=hive_20190514173633_81607f73-b383-4c0c-819c-3a2a7c09559d): insert into abhi values (001, 'myname')

INFO : Query ID = hive_20190514173633_81607f73-b383-4c0c-819c-3a2a7c09559d

INFO : Total jobs = 1

INFO : Launching Job 1 out of 1

INFO : Starting task [Stage-1:MAPRED] in serial mode

ERROR : FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session 7a817eea-176c-46ba-910e-4eed89d4eb4d

INFO : Completed executing command(queryId=hive_20190514173633_81607f73-b383-4c0c-819c-3a2a7c09559d); Time taken: 0.84 seconds

Error: Error while processing statement: FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session 7a817eea-176c-46ba-910e-4eed89d4eb4d (state=42000,code=30041)

9 REPLIES

Contributor

Running Hive on the Spark engine is much faster than running it on Tez, which is why I want to get this working.
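
For reference, the engine can be switched per session before retrying the statement; a minimal sketch, assuming Hive on Spark is otherwise configured (the table is the one from the original post):

set hive.execution.engine=spark;
insert into abhi values (001, 'myname');
-- fall back to another engine (mr or tez) while troubleshooting:
set hive.execution.engine=mr;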

Contributor

Could someone please shed some light on this?

Contributor

Still awaiting replies


New Contributor

Did you solve the problem?


I'm facing the same problem.

 

 

INFO  : Compiling command(queryId=hive_20190826110448_0f47045d-b2f4-4778-817b-da39d9b65325): 
INSERT into dashboard.top10_divida SELECT * from analysis.total_divida_tb  
order by total_divida DESC
limit 10
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:total_divida_tb.contribuinte, type:varchar(100), comment:null), FieldSchema(name:total_divida_tb.total_divida, type:double, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20190826110448_0f47045d-b2f4-4778-817b-da39d9b65325); Time taken: 1.937 seconds
INFO  : Executing command(queryId=hive_20190826110448_0f47045d-b2f4-4778-817b-da39d9b65325): 
INSERT into dashboard.top10_divida SELECT * from analysis.total_divida_tb  
order by total_divida DESC
limit 10
INFO  : Query ID = hive_20190826110448_0f47045d-b2f4-4778-817b-da39d9b65325
INFO  : Total jobs = 3
INFO  : Launching Job 1 out of 3
INFO  : Starting task [Stage-1:MAPRED] in serial mode
ERROR : FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session 1b4603c3-f2ee-42db-8248-d996710793fc_0: java.lang.RuntimeException: spark-submit process failed with exit code 1 and error ?
INFO  : Completed executing command(queryId=hive_20190826110448_0f47045d-b2f4-4778-817b-da39d9b65325); Time taken: 12.323 seconds


When you can't submit Hive on Spark queries, you need to review what is in the HiveServer2 logs; from the client end (Beeline) the root cause is unfortunately not obvious.

In any case you need to make sure that:

- the Spark service has been enabled as a dependency in Hive service > Configuration

- the Spark-related settings in Hive service > Configuration have been reviewed (see the sketch after this list)

- you have enough resources on the cluster and can submit YARN jobs
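
For reference, the values a session actually sees can be printed from Beeline; a minimal sketch (these are common Hive on Spark properties, not ones confirmed in this thread):

set hive.execution.engine;
set spark.master;
set spark.executor.memory;
set spark.executor.cores;
-- "set;" with no argument lists every property visible to the session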

 

Do you have error messages from the HS2 logs?

 

Thanks

 Miklos

New Contributor

I had a similar issue: the spark.yarn.executor.memoryOverhead value was set extremely high. Cloudera Manager interprets the value as MB by default, but I had entered it in bytes. Reverting it to empty (the default) resolved my issue.
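
For reference, the same kind of override can also be tried per session from Beeline; a minimal sketch, assuming the MB-vs-bytes mix-up described above (newer Spark versions name the property spark.executor.memoryOverhead):

set spark.yarn.executor.memoryOverhead=1024;  -- 1024 MB, not bytes
-- spark.* values set this way generally take effect when the next Spark session is created for the query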

New Contributor

Hello, I ran into the same problem today and solved it after many hours. You can try my approach; it worked for me. Append this line to spark/conf/spark-env.sh:

export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)

where /usr/local/hadoop/ is my $HADOOP_HOME.


I would not advise doing this unless you have no other options and you are sure there is a classpath problem (which this suggests you had, likely because the Spark service dependency was not set up for Hive).

Always check the HS2 logs first to find out what your problem actually is.

By including all the Hadoop jars in SPARK_DIST_CLASSPATH you submit and upload lots of jars to the container classpath unnecessarily, which will slow down job submission.