SparkConf conf = new SparkConf().setAppName("spark").set("spark.master", "yarn-client");
I packaged a JAR and used spark-submit to run the app, but I got the following error:
Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
I opened the URL and clicked on the log link to see the log. I attached a couple of photos to give a clear idea of what's going on. Thank you for your help.
I tried different commands to submit the job:
spark-submit --class com.Spark.MainClass /home/Test-0.0.1-SNAPSHOT.jar
spark-submit --class com.Spark.MainClass --master yarn-client /home/Test-0.0.1-SNAPSHOT.jar
You really need more cores, but 2 may work. Try:
spark-submit --class "com.stuff.Class" \
  --master yarn \
  --deploy-mode client \
  --driver-memory 1024m \
  --executor-memory 1024m \
  --conf spark.ui.port=4244 \
  MyJar.jar
Also remove the .set("spark.master", "yarn-client") from your code, since the master is passed on the spark-submit command line instead.
KryoSerializer is pretty awesome: it is faster than the default Java serializer and will speed up Spark. It's not related to your issue, but I like to add it to all my Spark projects. When RDDs are cached in serialized form, a faster, more compact serializer helps with both speed and memory.
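As a minimal sketch of the above (assuming a PySpark job; the `spark.serializer` key and the Kryo class name are standard Spark, the app name is hypothetical), enabling Kryo looks like:

```python
from pyspark import SparkConf, SparkContext

# Switch Spark's JVM-side serializer from the default Java serializer to Kryo.
# (In PySpark, Python objects are still pickled; Kryo affects JVM internals
# such as shuffles and serialized RDD caching.)
conf = (SparkConf()
        .setAppName("kryo-demo")  # hypothetical app name
        .set("spark.serializer",
             "org.apache.spark.serializer.KryoSerializer"))
sc = SparkContext(conf=conf)
```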
I have faced this issue numerous times as well:
"WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources"
The problem was Dynamic Resource Allocation over-allocating. After turning off Dynamic Resource Allocation and explicitly specifying the number of executors, executor memory, and cores, my jobs ran.
Turn off Dynamic Resource Allocation:
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .set("spark.dynamicAllocation.enabled", "false"))
sc = SparkContext(conf=conf)
Give the values with spark-submit (you could also set these in SparkConf):
/usr/hdp/184.108.40.206-3485/spark/bin/spark-submit --master yarn --deploy-mode client --driver-memory 3g --executor-memory 3g --num-executors 4 --executor-cores 2 /home/ec2-user/scripts/validate_employees.py

(Note that the options must come before the application file; anything after the .py path is passed as arguments to your script, not to spark-submit.)
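The same values can also be set in SparkConf rather than on the command line; here is a hedged sketch (the property names are standard Spark settings, everything else is illustrative):

```python
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("yarn")  # client deploy mode is the default
        .set("spark.executor.memory", "3g")
        .set("spark.executor.instances", "4")
        .set("spark.executor.cores", "2")
        .set("spark.dynamicAllocation.enabled", "false"))
# Caveat: spark.driver.memory generally must be passed to spark-submit (or set
# in spark-defaults.conf) in client mode, because the driver JVM is already
# running by the time this SparkConf is read.
sc = SparkContext(conf=conf)
```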