Running Spark job on YARN

New Contributor

I am trying to utilise all the resources available on the cluster to run my Spark job. I have Cloudera Manager installed on all of the nodes. This is the command I use to submit the job:

spark-submit --master yarn \
             --deploy-mode cluster \
             file:///[spark python file] \
             file://[app argument 1] \
             file://[app argument 2]
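
As a side note, spark-submit passes everything after the application file to the program as plain arguments, so those file:// prefixes reach the script verbatim. In cluster mode the driver also runs on a cluster node, so paths that are local to the submitting machine may not resolve there. One common pattern, sketched here with hypothetical names (my_job.py, config.txt), is to ship local files with --files and refer to them by bare name:

spark-submit --master yarn \
             --deploy-mode cluster \
             --files /local/path/config.txt \
             my_job.py config.txt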



During execution I receive the following error:

diagnostics: Application application_1450777964379_0027 failed 2 times due to AM Container for appattempt_1450777964379_0027_000002 exited with  exitCode: 1



Any ideas on how to fix this would be much appreciated.

The machine where Spark is installed is not accessible via the web UI, so I downloaded the sources and read a little more about the exception.

 

The job shown in the Spark UI:

Job Id: 0
Description: saveAsTextFile at NativeMethodAccessorImpl.java:-2
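
For context, PySpark calls into the JVM through Py4J reflection, which is why the call site is reported as NativeMethodAccessorImpl.java:-2 rather than a Python line; any saveAsTextFile action from a Python job shows up this way. A minimal sketch of a script that produces such a job (the app name and output path are hypothetical):

from pyspark import SparkContext

# Minimal PySpark job; the app name and output path are placeholders.
sc = SparkContext(appName="save_example")
rdd = sc.parallelize(["line one", "line two"])
# This action appears in the Spark UI as
# "saveAsTextFile at NativeMethodAccessorImpl.java:-2".
rdd.saveAsTextFile("hdfs:///tmp/save_example_output")
sc.stop()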

Re: Running Spark job on YARN

Super Collaborator

Use the

yarn logs -applicationId APP_ID

command to grab the executor logs so you can get some more detail on what is failing.

Replace APP_ID with your application's ID; in your example that is application_1450777964379_0027.
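
So for this job the full command would be:

yarn logs -applicationId application_1450777964379_0027

Note that yarn logs reads the aggregated logs, so run it after the application has finished and with log aggregation enabled on the cluster.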

 

Wilfred
