
How to increase amount of containers (and executors) when running Spark code via Livy

When running Spark code in Zeppelin via the Livy interpreter, I only see a few containers allocated in YARN. What settings do I need to change to make sure I leverage the full cluster capacity? I'm using a cluster created by Hortonworks Data Cloud on AWS.

1 ACCEPTED SOLUTION


@Ward Bekker, first determine the Spark configuration needed to occupy the full cluster. You will need to tune the number of executors, executor cores, executor memory, driver memory, etc.

References:

https://community.hortonworks.com/questions/56240/spark-num-executors-setting.html

http://stackoverflow.com/questions/37871194/how-to-tune-spark-executor-number-cores-and-executor-mem...
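As a rough illustration of the sizing arithmetic discussed in those references, here is a sketch that derives the three values from a hypothetical cluster (6 worker nodes with 16 cores and 64 GB RAM each — substitute your own numbers):

```shell
# Hypothetical cluster shape; replace with your actual node count and sizes.
NODES=6
CORES_PER_NODE=16
MEM_PER_NODE_GB=64

# Leave 1 core and ~1 GB per node for the OS and Hadoop daemons.
USABLE_CORES=$(( CORES_PER_NODE - 1 ))      # 15
USABLE_MEM_GB=$(( MEM_PER_NODE_GB - 1 ))    # 63

# A common rule of thumb: ~5 cores per executor for good HDFS throughput.
EXECUTOR_CORES=5
EXECUTORS_PER_NODE=$(( USABLE_CORES / EXECUTOR_CORES ))   # 3

# Reserve one executor slot cluster-wide for the YARN application master.
NUM_EXECUTORS=$(( NODES * EXECUTORS_PER_NODE - 1 ))       # 17

# Split usable node memory across executors, keeping ~7% headroom
# for YARN's memory overhead (integer arithmetic).
EXECUTOR_MEM_GB=$(( USABLE_MEM_GB / EXECUTORS_PER_NODE * 93 / 100 ))  # 19

echo "--num-executors $NUM_EXECUTORS --executor-cores $EXECUTOR_CORES --executor-memory ${EXECUTOR_MEM_GB}g"
```

These derived values are what you would plug in for X, Y, and Z in the approaches below.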

After figuring out the correct configs, you can use one of the approaches below to set up the Zeppelin and Livy interpreters.

1) Set SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh to specify the number of executors, executor cores, executor memory, driver memory, etc. (This config is applied to all Spark and Livy interpreters.)

export SPARK_SUBMIT_OPTIONS="--num-executors X --executor-cores Y --executor-memory Z"

2) Set the configs in the Livy interpreter. Open the Livy interpreter page and add the configs below:

livy.spark.executor.instances X
livy.spark.executor.cores Y
livy.spark.executor.memory Z
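For a quick sanity check outside Zeppelin, the same settings can be passed to Livy's REST API directly, since these interpreter properties map to the session-creation fields `numExecutors`, `executorCores`, and `executorMemory`. A minimal sketch, assuming Livy runs on a hypothetical host `livy-host` at its default port 8998 (the concrete values are illustrative only):

```shell
# REST-API counterparts of the livy.spark.executor.* interpreter properties.
PAYLOAD='{"kind": "spark", "numExecutors": 17, "executorCores": 5, "executorMemory": "19g"}'

# "livy-host" is a placeholder; ignore the failure if no Livy server is reachable.
curl -s --connect-timeout 2 -X POST "http://livy-host:8998/sessions" \
     -H 'Content-Type: application/json' \
     -d "$PAYLOAD" || true
```

Once the session is up, the allocated containers should be visible in the YARN ResourceManager UI.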



Thx @yvora !