Spark installation on CDH 5.2


New Contributor

I did a clean installation of CDH 5.2 (on AWS, 7 nodes, a complete install accepting all components). Spark would only install a History Server (and allow me to add gateway servers); there seems to be no way to add executor nodes. In CDH 5.1, Spark executors were installed on the data nodes by default, and I would expect the same behaviour in 5.2. Is there a workaround?

10 REPLIES

Re: Spark installation on CDH 5.2

Hi,

 

In 5.2, Spark on YARN is now the default. This uses YARN NodeManagers for execution and does not require assigning Spark executor nodes.

 

Spark on YARN is strongly preferred over the old Spark, which is why we made this change. If you'd really like to use the old Spark, then you can add it to your cluster after the initial setup, just like how you add any other service to a cluster. You can't add the old standalone Spark in the cluster setup wizard.
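
To make the point about executors concrete: with Spark on YARN, executors are requested per job at submit time rather than assigned to hosts in CM. Below is a minimal sketch, assuming the Spark 1.x `yarn-client` master value; the resource numbers and `my_job.py` are placeholders, not recommendations. The script only prints the command so it can be reviewed before running it on a gateway host.

```shell
#!/bin/sh
# Path to spark-submit in the CDH parcel (the same path used later in this thread).
SPARK_SUBMIT="/opt/cloudera/parcels/CDH/lib/spark/bin/spark-submit"

# Placeholder job file; override with JOB_FILE=/path/to/job.
JOB_FILE="${JOB_FILE:-my_job.py}"

# Build a spark-submit command line that asks YARN for executors.
# The executor count, memory, and cores are illustrative values only.
build_submit_cmd() {
    echo "$SPARK_SUBMIT --master yarn-client --num-executors 4 --executor-memory 1g --executor-cores 1 $1"
}

# Print the command instead of running it.
build_submit_cmd "$JOB_FILE"
```

YARN then places the requested executors in containers on whichever NodeManagers have capacity.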

 

Thanks,

Darren


Re: Spark installation on CDH 5.2

New Contributor

Hi Darren,

 

 

If Spark is running on YARN, I would expect to see the job status in the YARN manager, but I can't find it there. No history is generated in the YARN manager, even though the job itself ran fine.

 

URL:

HOST:8088/cluster/apps

 

 

Any ideas? How can we get the status of all Spark jobs?

 

Thanks,

Tod


Re: Spark installation on CDH 5.2

Hi Tod,

Did you try looking at the Spark history server? You can find a link to that URL from the status page of the Spark service in the CM UI.

Thanks,
Darren

Re: Spark installation on CDH 5.2

New Contributor
History Server
  • Event Log Location: hdfs://lalana-490120.slc01.dev.ebayc3.com:8020/user/spark/applicationHistory
No Completed Applications Found

 

 

 

This is my history server page. There are no records there.


Re: Spark installation on CDH 5.2

How are you running your spark job? Are you sure it's configured to talk to the right spark service?

Is there any chance you have a standalone spark (as opposed to spark on yarn) set up as well, and you're accidentally submitting jobs there?
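
One way to check this, as a rough sketch: look at which master the client configuration sets. The path `/etc/spark/conf/spark-defaults.conf` is an assumption about where the gateway client config is deployed, so adjust it if yours differs. When no master is set anywhere (not on the command line, not in the config, not in the application), spark-submit runs the job in local mode and it never reaches YARN.

```shell
#!/bin/sh
# Assumed location of the client config; override with SPARK_DEFAULTS=/path/to/file.
SPARK_DEFAULTS="${SPARK_DEFAULTS:-/etc/spark/conf/spark-defaults.conf}"

# Print the value of spark.master from a properties-style file, or "unset".
spark_master_from_conf() {
    conf_file="$1"
    found=""
    if [ -r "$conf_file" ]; then
        # Accept "spark.master value" and "spark.master=value"; the last entry wins.
        found=$(sed -n 's/^[[:space:]]*spark\.master[[:space:]=]\{1,\}\([^[:space:]]*\).*/\1/p' "$conf_file" | tail -n 1)
    fi
    echo "${found:-unset}"
}

echo "spark.master in $SPARK_DEFAULTS: $(spark_master_from_conf "$SPARK_DEFAULTS")"
```

If this prints `unset`, passing `--master yarn-client` to spark-submit explicitly is a quick way to test whether the job then shows up in YARN.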

Re: Spark installation on CDH 5.2

New Contributor

On the master machine, I ran the following spark-submit command:

/opt/cloudera/parcels/CDH/lib/spark/bin/spark-submit  {JOB_FILE}

 

It worked in CDH 5.1, but after I reinstalled with CDH 5.2, the Spark history is gone.


Re: Spark installation on CDH 5.2

Did you remember to upload the spark jar after upgrading CDH?

While your job is running, does it show up in the yarn web UI?

Are there any interesting error messages in the spark history server? In the yarn logs?
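
If the web UI is awkward to read, the same checks can be done from the command line. This is a sketch using the standard `yarn` CLI; `APP_ID` is a placeholder for an application ID taken from the list output, and `yarn logs` only returns container logs when YARN log aggregation is enabled. The helper prints the command instead of running it on hosts where the CLI is not installed.

```shell
#!/bin/sh
# Run a command if its executable is on PATH; otherwise only print it.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# List applications in every state, not just the running ones.
run_or_print yarn application -list -appStates ALL

# APP_ID is a placeholder; set it to an ID taken from the list above.
if [ -n "${APP_ID:-}" ]; then
    # Fetch the aggregated container logs for that application.
    run_or_print yarn logs -applicationId "$APP_ID"
fi
```

An empty list here, while the job is known to be running, would suggest the job is not being submitted to YARN at all.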

Re: Spark installation on CDH 5.2

New Contributor

It was not an upgrade; I installed CDH 5.2 in a clean environment.

 

All Applications

Cluster Metrics
  • Apps: Submitted 0, Pending 0, Running 0, Completed 0
  • Containers Running: 0
  • Memory: Used 0 B, Total 9.11 GB, Reserved 0 B
  • VCores: Used 0, Total 8, Reserved 0
  • Nodes: Active 4, Decommissioned 1, Lost 0, Unhealthy 0, Rebooted 0

User Metrics for dr.who
  • Apps: Submitted 0, Pending 0, Running 0, Completed 0
  • Containers: Running 0, Pending 0, Reserved 0
  • Memory: Used 0 B, Pending 0 B, Reserved 0 B
  • VCores: Used 0, Pending 0, Reserved 0

Applications table (ID, User, Name, Application Type, Queue, StartTime, FinishTime, State, FinalStatus, Progress, Tracking UI): No data available in table. Showing 0 to 0 of 0 entries.

 

 

This is the YARN UI. There are no completed or running jobs listed here.

I will check the logs.


Re: Spark installation on CDH 5.2

New Contributor

Posting the image here.

Untitled.png