I have a cluster composed of 3 machines:
- The ambari-server host has 16 GB RAM, 8 CPUs, and a 60 GB disk.
- The 2 ambari-agent hosts each have 8 GB RAM, 8 CPUs, and a 40 GB disk.
I've installed Spark2 on this cluster and I'm trying to submit a Spark job, but it fails. I think the issue is related to resources, so what are the minimum resource requirements for a functional Spark cluster?
What issue exactly are you facing? Can you please share the log of the Spark job failure?
It is very hard to state the RAM or disk requirements for any job until we see statistics on the current resource consumption. Various factors contribute to this: what kind of job you are running and how resource-intensive it actually is, how many other components are running on the host where the job runs, the size of the cluster, and so on.
The following link might give you some ideas on the tuning and resource-optimization side: https://community.hortonworks.com/articles/42803/spark-on-yarn-executor-resource-allocation-optimiz....
I'm trying to run this command as the spark user from the directory /usr/hdp/current/spark2-client:

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 512m --executor-memory 1g --executor-cores 1 --queue default examples/jars/spark-examples_2.11-2.1.1.2.6.2.0-205.jar 10
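For context, the YARN containers a command like this requests can be estimated with Spark's default memory-overhead rule (the larger of 384 MB or 10% of the heap), rounded up to a multiple of the scheduler's minimum allocation. The script below is only a sketch; it assumes the default yarn.scheduler.minimum-allocation-mb of 1024 MB, which may differ on your cluster:

```shell
# Estimate the YARN container sizes requested by the spark-submit command above.
# Assumptions: Spark's default overhead rule max(384 MB, 10% of heap) and a
# yarn.scheduler.minimum-allocation-mb of 1024 MB -- verify both on your cluster.

overhead() {  # heap_mb -> overhead_mb
  local tenth=$(( $1 / 10 ))
  if [ "$tenth" -gt 384 ]; then echo "$tenth"; else echo 384; fi
}

round_up() {  # mb, multiple -> mb rounded up to a multiple of $2
  echo $(( ( ($1 + $2 - 1) / $2 ) * $2 ))
}

MIN_ALLOC=1024
driver_heap=512      # --driver-memory 512m
executor_heap=1024   # --executor-memory 1g

driver_container=$(round_up $(( driver_heap + $(overhead $driver_heap) )) $MIN_ALLOC)
executor_container=$(round_up $(( executor_heap + $(overhead $executor_heap) )) $MIN_ALLOC)

echo "driver container:   ${driver_container} MB"
echo "executor container: ${executor_container} MB"
```

With these flags, each executor container actually occupies about 2 GB on an 8 GB worker, so the per-node YARN memory limit decides how many fit.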
I got the error log shown in "log.png". From the application logs in the ResourceManager UI (resource_manager:8088), I got the errors on my ambari-agent1 and my ambari-agent2 shown in "log1.png" and "log2.png" respectively.
How can I submit Spark jobs correctly?
P.S.: Sometimes, when I run spark-submit, the job finishes correctly, but it seems to have been submitted to only one node. I conclude this from the application log in the ResourceManager UI (resource_manager:8088), as shown in result.png and details.png.
How can I fix this issue?
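One common cause of both symptoms (failed submissions and everything landing on one node) is YARN's per-node memory limits being too small for, or mismatched with, the requested containers. These are the usual yarn-site.xml properties to check; the values below are only illustrative for 8 GB worker hosts, not recommendations:

```xml
<!-- yarn-site.xml: illustrative values for 8 GB NodeManager hosts; adjust to your cluster -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>6144</value> <!-- total memory YARN may hand out on each node -->
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value> <!-- container requests are rounded up to a multiple of this -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>6144</value> <!-- a single container may not exceed this -->
</property>
```

In Ambari these live under YARN > Configs; if a single node can satisfy the whole request, YARN is free to place all containers there, so requesting more executors than fit on one node is what spreads the job.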