06-13-2018 03:08 AM
Working on Cloudera Manager, version: Cloudera Express 5.14.0.
I am running a Hive-on-Spark job. The job runs successfully and returns the result, but after it completes the YARN ApplicationMaster does not release the resources for that job.
I have set the following properties for Spark and Hive in Cloudera Manager (an equivalent per-session way to try the same settings is sketched after the list):
(spark.dynamicAllocation.executorIdleTimeout -> 10)
(spark.dynamicAllocation.schedulerBacklogTimeout -> 1)
(spark.dynamicAllocation.initialExecutors -> 1)
(spark.eventLog.enabled -> true)
(spark.master -> yarn-client)
(spark.dynamicAllocation.enabled -> true)
(spark.dynamicAllocation.minExecutors -> 1)
Attachment: a running job displayed while the job is already completed.
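For completeness, the same spark.* properties can also be set per session from the Hive shell before running a query; the sketch below only mirrors the values above (it is not what I changed in Cloudera Manager), and spark.* values are picked up when Hive launches the Spark session for that Hive session:
  -- inside the hive (or beeline) shell
  set hive.execution.engine=spark;
  set spark.dynamicAllocation.enabled=true;
  set spark.dynamicAllocation.minExecutors=1;
  set spark.dynamicAllocation.initialExecutors=1;
  set spark.dynamicAllocation.executorIdleTimeout=10;
  set spark.dynamicAllocation.schedulerBacklogTimeout=1;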
06-13-2018 03:16 AM - edited 06-13-2018 03:17 AM
Hello @maheshdp
Thank you for posting your query with us.
From your description, I understand that your Hive-on-Spark application is not releasing the resources that were allocated to it.
Could you please help us by providing the below information?
1. How exactly are you checking that the resource allocation is not released? (any sort of screenshot, etc. would help)
2. How is the job submitted? (simple repro steps would also be helpful)
06-13-2018 03:23 AM - edited 06-13-2018 03:39 AM
I am checking via the ResourceManager web UI (http://master:8088/cluster), which displays all running and completed jobs. When I run the job with MapReduce as the execution engine it works perfectly fine, but when I run it with Hive-on-Spark it does not release the resources.
I am attaching a screenshot of the YARN ResourceManager web UI: jobs that have already completed are still displayed as running at http://master:8088/cluster.
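If it helps, the same state can also be cross-checked from the command line with the standard YARN CLI (the application id below is only a placeholder):
  # applications that YARN still considers running
  yarn application -list -appStates RUNNING
  # detailed state of one application (use the id shown in the RM UI)
  yarn application -status application_1528000000000_0001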
06-13-2018 03:39 AM - edited 06-13-2018 03:42 AM
Hello @maheshdp
Thanks for your update
Here are the steps I followed (unfortunately I am not able to see your attachments); a rough transcript is included after the list:
1. Opened the "hive" shell
2. Ran "set hive.execution.engine=spark;"
3. Created a simple table and ran some INSERT queries into it (which triggers a Spark job on YARN)
4. Checked the RM; at this point you should be able to see the job in the "RUNNING" state
5. Notice that your subsequent queries won't take much time, as this running job does the work for you
Once you exit the hive shell, the mentioned job will stop with the status "SUCCEEDED".
Could you please check whether I missed anything above?
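For reference, here is roughly the session I used (the table name is just an example):
  -- inside the hive shell
  set hive.execution.engine=spark;
  create table repro_t (id int);         -- example table name
  insert into table repro_t values (1);  -- first Spark query: Hive launches a Spark application on YARN
  insert into table repro_t values (2);  -- reuses the same running application, so it returns quickly
  quit;                                  -- closing the shell ends the application and the RM shows SUCCEEDED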
06-13-2018 03:48 AM - edited 06-13-2018 03:58 AM
Thanks for the reply.
I am doing the same thing from the Hive editor in Hue: I run the query and it displays the result, but when I check the ResourceManager the application keeps running forever. When I run a query from the Hive shell again, the query starts executing, but the UI shows
"Details for Stage 0 (Attempt 0)"
and although the job was submitted 30 minutes ago, it is displayed as not started yet.
06-13-2018 10:45 PM
Hello @maheshdp
Could you please try lowering the below configurations in HS2 (HiveServer2)?
hive.server2.idle.session.timeout
hive.server2.idle.operation.timeout
This should help.
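For example, through the HiveServer2 advanced configuration snippet (safety valve) for hive-site.xml in Cloudera Manager; the values below are only an illustration (10 minutes and 5 minutes, written in milliseconds; depending on the Hive version a time suffix such as "10m" may also be accepted):
  <property>
    <name>hive.server2.idle.session.timeout</name>
    <value>600000</value>  <!-- 10 minutes (illustrative) -->
  </property>
  <property>
    <name>hive.server2.idle.operation.timeout</name>
    <value>300000</value>  <!-- 5 minutes (illustrative) -->
  </property>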
06-13-2018 11:11 PM
Thanks for the reply.
I have tried these configurations as well; their default values are 6 hours and 12 hours respectively. When I reduce them to 10 minutes and 5 minutes, the Thrift server session gets closed. What I actually want is for the Spark session of the job to be closed once the job completes. So, can you suggest how to close the Spark session after a Hive-on-Spark job finishes, or is there a patch available for this?
06-13-2018 11:43 PM
@maheshdp At present, long-running jobs are by design.
Please refer to the JIRA below:
https://issues.apache.org/jira/browse/HIVE-14162
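For context (please verify against the JIRA itself; I am going from its description, and the property names below are assumptions on my part): the change tracked there adds a configurable idle timeout for the Hive-on-Spark session in Hive releases that include it, reportedly along these lines:
  set hive.spark.session.timeout=30m;         -- assumed property name; closes an idle Spark session after 30 minutes
  set hive.spark.session.timeout.period=60s;  -- assumed property name; how often idle sessions are checked
As far as I can tell these are not available in the Hive shipped with CDH 5.14, which matches the "by design" behaviour described above.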
06-21-2018 10:51 PM
No, it does not resolve the problem.
Any other suggestion would be appreciated.
Thanks for the reply.