08-02-2018 08:15 AM - last edited on 08-02-2018 01:42 PM by cjervis
I can successfully submit Spark jobs in both yarn-client and yarn-cluster mode on a Kerberized cluster from an edge/client node managed by Cloudera Manager, using both Spark on YARN and Spark2.
-sh-4.1$ spark-submit --class org.apache.spark.examples.SparkPi --deploy-mode client --master yarn /opt/cloudera/parcels/CDH/lib/spark/lib/spark-examples.jar 1
Pi is roughly 3.137151371513715
I'm now trying to submit the job through the YARN REST API from the same edge node, but the job stays stuck in the ACCEPTED state for a long time. Checking the YARN NodeManager logs, I found this error:
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS];
-sh-4.1$ curl --proxy "" --negotiate -u: -s -i -k -X POST https://RM_HOST:8090/ws/v1/cluster/apps/new-application
This generated a new application-id successfully.
-sh-4.1$ curl --proxy "" --negotiate -u: -s -i -k -v -X POST -d @1.json -H "Content-type: application/json" 'https://RM_HOST:8090/ws/v1/cluster/apps'
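The first call returns a JSON body whose `application-id` field has to be copied into the payload posted by the second call. Here is a minimal sketch of that step, assuming the response shape described in the YARN ResourceManager REST API documentation. The helper name `extract_application_id` and the sample body are illustrative only and were not captured from this cluster. Because curl is run with `-i`, the HTTP headers would need to be stripped before the body is parsed.

```python
import json


def extract_application_id(response_body: str) -> str:
    """Return the application-id field from a new-application response body."""
    data = json.loads(response_body)
    app_id = data.get("application-id")
    if not app_id:
        raise ValueError("response has no application-id field")
    return app_id


# Illustrative body only; the id and capability values are placeholders.
sample = (
    '{"application-id": "application_0000000000000_0001", '
    '"maximum-resource-capability": {"memory": 8192, "vCores": 4}}'
)
print(extract_application_id(sample))
```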
I can see the job being triggered and moving to the ACCEPTED state, but after 20 minutes it FAILED. I checked the yarn_nodemanager_resource_memory_mb property and it is set to 8 GB. Do I need to provide the --keytab and --principal parameters in the JSON file?
1.json file contents:
"command":"spark-submit --class org.apache.spark.examples.SparkPi --deploy-mode cluster --master yarn /opt/cloudera/parcels/CDH/lib/spark/lib/spark-examples.jar 1"
Appreciate your help.
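For reference, the 1.json payload could be generated along these lines. This is a minimal sketch, assuming the field names from the YARN ResourceManager "Submit Application" REST documentation. The application name, resource values, principal, and keytab path are hypothetical placeholders. `--principal` and `--keytab` are existing spark-submit options, but whether passing them here resolves the authentication error is not confirmed in this post.

```python
import json
from typing import Optional


def build_command(principal: Optional[str] = None,
                  keytab: Optional[str] = None) -> str:
    """Build the spark-submit command line used in 1.json."""
    parts = [
        "spark-submit",
        "--class", "org.apache.spark.examples.SparkPi",
        "--deploy-mode", "cluster",
        "--master", "yarn",
    ]
    # Assumption: Kerberos options are added only when both are supplied.
    if principal and keytab:
        parts += ["--principal", principal, "--keytab", keytab]
    parts += ["/opt/cloudera/parcels/CDH/lib/spark/lib/spark-examples.jar", "1"]
    return " ".join(parts)


def build_submit_payload(app_id: str, command: str) -> dict:
    """Assemble the request body for POST /ws/v1/cluster/apps."""
    return {
        "application-id": app_id,
        "application-name": "SparkPi",            # placeholder name
        "application-type": "YARN",
        "am-container-spec": {"commands": {"command": command}},
        "resource": {"memory": 1024, "vCores": 1},  # placeholder sizing
        "max-app-attempts": 1,
        "unmanaged-AM": False,
    }


# Hypothetical id, principal, and keytab path, for illustration only.
payload = build_submit_payload(
    "application_0000000000000_0001",
    build_command("user@EXAMPLE.COM", "/path/to/user.keytab"),
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be saved as 1.json and posted with the same curl command used above. The keytab path would have to be readable on whichever node runs the command.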