I have an issue on a YARN cluster. I can run my application in local mode on the cluster's entry/main node, but when I launch it on the cluster (using client or cluster deploy mode) it simply does not start. The only error is that the YARN container fails to launch, with an exit code of -1. What could be the issue?
I've tried many things, and surprisingly the configuration on this cluster is the same as on another, independent cluster where the Spark application runs fine in cluster mode.
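When a container dies with exit code -1 before the application even starts, the aggregated container logs usually show the underlying cause (a missing jar, a bad environment variable, a permissions problem). A minimal sketch of pulling them with the standard YARN CLI; the application ID below is a placeholder for your failing run:

```shell
# List recently failed applications to find the ID of the failing run
yarn application -list -appStates FAILED,KILLED

# Print the aggregated container logs for that application;
# the stderr of the container launch often explains the -1 exit code
yarn logs -applicationId application_1500000000000_0001
```

If log aggregation is disabled, the same logs live under the NodeManager's local log directory on the node where the container was assigned.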
Can you share an example of how you are trying to run the job? Is it a Java or PySpark application?
From the documentation: HDP 2.4 Spark Pi Program
spark-submit --master yarn-client --class <java.class.getting.called> something.jar
spark-submit --master yarn-client something.py
spark-submit --class "com.MyClass" --master yarn --deploy-mode client --driver-memory 4g --executor-memory 4g --num-executors 4 --executor-cores 10 logminer-1.0.jar -arg1 value1
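One quick sanity check for a request of that size (4 executors × 10 cores, 4 GB each, plus a 4 GB driver): make sure a single container fits under YARN's per-container maximums, since the ResourceManager will refuse to allocate containers that exceed them. A sketch, assuming the standard Hadoop config location on HDP:

```shell
# Per-container ceilings; a container request above either value is rejected
grep -A1 'yarn.scheduler.maximum-allocation-mb' /etc/hadoop/conf/yarn-site.xml
grep -A1 'yarn.scheduler.maximum-allocation-vcores' /etc/hadoop/conf/yarn-site.xml
```

Note that the executor container asks for more than `--executor-memory` alone, since YARN adds the memory overhead on top of the 4 GB heap.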
To better help you with this, are you able to run the example program listed in the documentation link above? I suspect that either you cannot submit a job because the node you are on does not have a Spark client installed, or the environment variables are not being properly sourced.
One thing that might help is to first source the environment script (on RHEL/CentOS). Use `source` rather than `sh`, since `sh` runs the script in a subshell and the exported variables are lost when it exits:
# source /usr/hdp/current/spark-client/conf/spark-env.sh
Then try running the example like in the docs and see if it succeeds. If it fails, we can look more closely at what the issue is.
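Putting the two steps together in one session (the jar and class placeholders are the same ones used in the example commands above):

```shell
# Source (not `sh`) so the exported variables persist in the current shell
source /usr/hdp/current/spark-client/conf/spark-env.sh

# Re-run the example submit with the environment now in place
spark-submit --master yarn-client --class <java.class.getting.called> something.jar
```

If this succeeds where the earlier submit failed, the problem was the environment on the submitting node rather than the cluster itself.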
No, it didn't launch the sample applications in cluster mode either. I'll try running the script and let you know.