
No Completed Application Found in Spark History Server

Contributor

Hi,

I have installed CDH 5.2.0 on a single node.

When I launch the PySpark shell and successfully execute a simple job, sc.parallelize(range(1000)).count(), I don't see the application in the Spark History Server at http://localhost:18088. When I open it, I can only see:

Event Log Location : hdfs://localhost:8020/user/spark/applicationHistory

No Completed Application Found

 

Can you please advise whether there are any configurations I should check?

 

I appreciate any help on this.

 

thanks

Pal

1 ACCEPTED SOLUTION

Master Collaborator

This is an option to spark-submit or pyspark. Look at the Spark docs.


8 REPLIES

Master Collaborator

Are you running Spark on YARN, or using Spark standalone? If the latter, you won't see any YARN history, since it's not using YARN.

Contributor

Hi Srowen,

Thanks for your reply.

 

Are you running Spark on YARN, or using Spark standalone? If the latter, you won't see any YARN history, since it's not using YARN.

 

Yes, Spark is running on YARN (MR2 Included); I checked this in the Cloudera Manager web console under Spark > Configuration.

 

Does that mean I have to configure Spark to use YARN (MR2 Included)? According to Cloudera Manager it already is, or am I missing something?

 

I did a default installation and followed the wizard.

 

Can you please advise?

 

thanks

Pal

Master Collaborator

Yes, but did you also submit your Spark app to YARN? What is the master for the app?

Contributor

Hi Srowen,

 

Below are the steps:

  1. I launched the pyspark shell by executing /opt/cloudera/parcels/CDH/bin/pyspark.
  2. In the pyspark shell I executed a simple one-line program, sc.parallelize(range(1000)).count(), which completed successfully. Since Apache Spark in CDH 5.2 is configured to run on YARN, I was expecting to see the app in the Spark History Server at http://localhost:18088.
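For reference, the two steps above amount to the shell session below (the parcel path is the one from step 1; the expected count is simply the number of elements in range(1000)):

```shell
# Step 1: launch the PySpark shell from the CDH parcel (path from the post above)
/opt/cloudera/parcels/CDH/bin/pyspark

# Step 2: inside the shell, run the one-line job:
# >>> sc.parallelize(range(1000)).count()
# 1000
```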

 

Since it's a single-node installation, the master and worker are on the same node.

 

Please advise if I am running this simple Python program incorrectly.

 

thanks

Pal

Master Collaborator

Spark defaults to running with a local master, IIRC. You should set "--master yarn-client" to actually use YARN. I assume it's no different for pyspark vs. spark-shell.
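If I'm reading this right, the launch command would look like the sketch below (the parcel path is taken from Pal's earlier reply; not verified against CDH 5.2 here):

```shell
# Launch PySpark against YARN rather than the default local master
/opt/cloudera/parcels/CDH/bin/pyspark --master yarn-client
```

spark-submit accepts the same flag, e.g. spark-submit --master yarn-client app.py.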

Contributor

Hi Srowen,

Thanks for the update. I am new to Spark; can you please tell me where I should set "--master yarn-client"? In a configuration file, or somewhere else?

 

thanks

Pal

Master Collaborator

This is an option to spark-submit or pyspark. Look at the Spark docs.

Cloudera Employee

Hi Pal,

 

Can you grep for the particular application ID in the folder /user/spark/applicationHistory to check whether the job completed successfully or is still in the .inprogress state?
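A minimal sketch of that check, assuming the event-log location shown earlier in the thread (the file name below is illustrative, not a real application ID):

```shell
# List the event logs; completed apps have no .inprogress suffix
hdfs dfs -ls /user/spark/applicationHistory

# An application that is still running (or never stopped cleanly) shows as e.g.:
#   /user/spark/applicationHistory/application_1420000000000_0001.inprogress
hdfs dfs -ls /user/spark/applicationHistory | grep '\.inprogress'
```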

 

Thanks

AKR