I am not a very experienced developer, trying to help with a POC. I have the following configuration: an 8-node 2.6.5 cluster.
My cluster users are experiencing resource issues: when 2 users run Spark through a Zeppelin notebook, it clogs the cluster, consuming 93% of the resources. I have tried running the YARN Utility Script, but I think I am getting mixed up. Based on the screenshots attached, I am passing the following parameters to the script (HBase is installed):
python yarn-utils.py -c 32 -m 187 -d 7 -k True
My reference is the Hortonworks YARN tuning guide. After the script ran successfully, I changed the YARN and MapReduce settings according to its recommendations, but I end up with only 11 cores. What am I doing wrong?
What is the correct way of running the script, taking into account the memory and cores available? And how should I configure the Spark environment so that it does not use up all the memory, or releases it once a job is done?
NB: I have also isolated the users (see the user isolation/scoped jpg).
I just feel I am not doing the right thing.
1. Spark dynamic allocation
I believe your Zeppelin is configured to spawn as many executors as possible for Spark. Kindly enable dynamic allocation for Spark in Zeppelin, so executors are requested on demand and released when a notebook goes idle.
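A minimal sketch of what that looks like in the Zeppelin Spark interpreter settings. The property names are the standard Spark ones; the specific values (executor cap, idle timeout) are assumptions you should tune to your cluster. Note that dynamic allocation on YARN also requires the external shuffle service to be enabled on the NodeManagers.

```properties
# Zeppelin: Interpreter > spark > Properties
spark.dynamicAllocation.enabled=true
# Required by dynamic allocation on YARN (NodeManagers must run the
# spark_shuffle aux service for this to work)
spark.shuffle.service.enabled=true
spark.dynamicAllocation.minExecutors=1
# Cap executors per notebook so two users cannot starve the cluster
# (the value 10 is an example, not a recommendation)
spark.dynamicAllocation.maxExecutors=10
# Release executors that sit idle after a job finishes
spark.dynamicAllocation.executorIdleTimeout=60s
```

After changing these, restart the Spark interpreter in Zeppelin so running notebooks pick up the new settings.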
2. YARN queue user limit
Can you also check what your YARN queue configuration is? You can limit the share of a queue's resources that any single user can consume using the user limit factor, so one user's containers cannot take over the whole queue.
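A sketch of the relevant Capacity Scheduler settings in capacity-scheduler.xml. The queue name "default" and the values shown are assumptions; substitute your actual queue and limits.

```xml
<!-- capacity-scheduler.xml (queue name "default" is an assumption) -->
<property>
  <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
  <!-- A single user can consume at most 1x the queue's configured capacity -->
  <value>1</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.minimum-user-limit-percent</name>
  <!-- With 2 active users, each is guaranteed roughly 50% of the queue -->
  <value>50</value>
</property>
```

In an Ambari-managed cluster these can be set through the YARN Queue Manager view; refresh the queues afterwards for the change to take effect.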