04-19-2017
01:34 PM
I found there are bugs in Livy 0.2 with this functionality; it looks to be fixed in v0.4. I'm using Hortonworks HDP, which only ships Livy 0.2. I'm not going to upgrade it myself, as I would then lose the benefits of using HDP. I'll wait for the next HDP release that includes Livy 0.4. Thanks for your answer.
03-07-2017
05:31 PM
Thanks, I'll give that a try and get back to you
03-06-2017
03:26 PM
Hi, I need Zeppelin applications to terminate after being idle for a certain duration, and I've tried a number of things with no success. The applications just stay in the RUNNING state on the YARN Resource Manager. Can someone suggest anything I've missed that would let the applications be closed?

I'm using Hortonworks with Hadoop 2.7.3, Zeppelin, Spark 1.6.2, and YARN. I've set the Zeppelin Spark interpreters to be isolated, so each notebook session uses a different Spark context. I have enabled dynamic allocation with the shuffle service but cannot see it having any effect. I've set the dynamic allocation properties through YARN in the Spark custom spark-defaults config, and also set them in the Zeppelin Spark interpreter config. Below is a screenshot of the config values in the Zeppelin Spark interpreter.

My Spark custom spark-defaults config is:

spark.storage.memoryFraction=0.45
spark.shuffle.spill.compress=false
spark.shuffle.spill=true
spark.shuffle.service.port=7337
spark.shuffle.service.enabled=true
spark.shuffle.memoryFraction=0.75
spark.shuffle.manager=SORT
spark.shuffle.consolidateFiles=true
spark.shuffle.compress=false
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=5
spark.dynamicAllocation.schedulerBacklogTimeout=5
spark.dynamicAllocation.minExecutors=6
spark.dynamicAllocation.initialExecutors=6
spark.dynamicAllocation.executorIdleTimeout=30
spark.dynamicAllocation.enabled=true

My YARN NodeManager config:

yarn.nodemanager.aux-services=mapreduce_shuffle,spark_shuffle

YARN yarn-site config:

yarn.nodemanager.aux-services.mapreduce_shuffle.class=org.apache.hadoop.mapred.ShuffleHandler
yarn.nodemanager.aux-services.spark2_shuffle.class=org.apache.spark.network.yarn.YarnShuffleService
yarn.nodemanager.aux-services.spark2_shuffle.classpath={{stack_root}}/${hdp.version}/spark2/aux/*
yarn.nodemanager.aux-services.spark_shuffle.classpath={{stack_root}}/${hdp.version}/spark/aux/*
yarn.nodemanager.aux-services.spark_shuffle.class=org.apache.spark.network.yarn.YarnShuffleService

(I've checked the shuffle jar file location and it does match this classpath.)

I've also created a YARN queue for all Zeppelin tasks; a screenshot of the settings is below. Below is also a screenshot of how the YARN Resource Manager looks after two Zeppelin notebooks have been run. I can see the Spark interpreter's spark.executor.instances has ended up being used (10+1). These applications stay running unless they are killed or the Spark interpreter is restarted. How can they be removed when idle?

Thank you for any suggestions or guidance.

Alistair
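While waiting for the applications to be cleaned up properly, one way to spot the stuck sessions is to query the YARN ResourceManager REST API (`/ws/v1/cluster/apps`) and list RUNNING applications on the Zeppelin queue that have been alive longer than some threshold. The sketch below is illustrative, not part of the original post: the ResourceManager hostname `resourcemanager:8088`, the queue name `zeppelin`, and the helper names are assumptions for the example.

```python
import json
import time
import urllib.request


def long_running_apps(apps_json, queue="zeppelin", min_age_ms=30 * 60 * 1000, now_ms=None):
    """Return the ids of RUNNING apps on `queue` that started more than
    `min_age_ms` milliseconds ago, given a /ws/v1/cluster/apps response dict."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    apps = (apps_json.get("apps") or {}).get("app") or []
    return [
        a["id"]
        for a in apps
        if a.get("state") == "RUNNING"
        and a.get("queue") == queue
        and now_ms - a.get("startedTime", now_ms) >= min_age_ms
    ]


def fetch_running_apps(rm_base="http://resourcemanager:8088"):
    # hypothetical RM address; the states filter keeps the payload small
    url = rm_base + "/ws/v1/cluster/apps?states=RUNNING"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Any id this returns could then be killed manually with `yarn application -kill <app_id>` until an automatic idle-timeout mechanism is in place.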
Labels:
- Apache Spark
- Apache YARN
- Apache Zeppelin