CDSW Spark2 configuration issue

New Contributor

Hi,

I'm facing issues when submitting a job or running a command from the workbench.

I followed the CDSW installation guide. The steps I performed can be summed up as follows:

Switched to Java 8 on the CDH cluster and on the CDSW machine

Deployed Spark2 on the CDH cluster using the parcel and CSD

Validated with SparkPi: everything is OK

Set up CDSW on a dedicated node

Downloaded and configured CDSW

Ran cdsw init, which completed without error

The problem:

I can access the workbench, but when I try to run any template, analysis.R for example, I get the following message after 20 seconds of task inactivity:

Waiting for Spark configuration...
Have you fully deployed client configuration to your CDSW nodes?

The task then stays idle until it is automatically killed.

I checked the Spark history server: no job was listed, from CDSW or anywhere else.

I was wondering whether I skipped something in the Spark cluster configuration for CDSW.

I read the CDSW installation guide carefully, but I need hints for gathering additional debug information or for configuring CDSW correctly with Spark2.

 

For information: I tried copying the Spark2 configuration files /etc/spark2/conf/spark-defaults.conf and /etc/spark2/conf/spark-env.sh from the CDH worker nodes to the CDSW node.

Unfortunately, this made no difference.

 

Any feedback is welcome.

Regards

1 ACCEPTED SOLUTION

Super Collaborator

Hi,

 

Did you add the dedicated CDSW host to the cluster in CM? 

 

From the documentation:

"Cloudera Data Science Workbench hosts must be added to your CDH cluster as gateway hosts, with gateway roles properly configured."

https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_install.html#config...

 

Regards,

Peter

2 REPLIES

New Contributor
Indeed, the CDSW host had already been added to the cluster in Cloudera Manager, but it still needed a Spark gateway deployment.

Thanks
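
For anyone hitting the same "Waiting for Spark configuration..." message, a quick first check is whether the Spark2 client configuration files mentioned in this thread exist on the CDSW host. The snippet below is only a minimal sketch: the function name missing_client_config is made up for illustration, and the file list is taken from the paths quoted in the question. Finding the files does not prove the gateway role is correctly deployed (copying them by hand did not help here), so treat it as a hint before deploying the Spark gateway and client configuration from Cloudera Manager.

```python
from pathlib import Path

# File names quoted in the question; the default directory is the path given there.
EXPECTED_FILES = ("spark-defaults.conf", "spark-env.sh")


def missing_client_config(conf_dir="/etc/spark2/conf", expected=EXPECTED_FILES):
    """Return the expected files that are absent or empty under conf_dir.

    Hypothetical helper for illustration only; it is not part of CDSW or CDH.
    """
    base = Path(conf_dir)
    missing = []
    for name in expected:
        path = base / name
        # Treat a missing file and a zero-byte file the same way.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_client_config()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("Spark2 client configuration files are present.")
```

Run it on the CDSW host itself. If it reports missing files, the Spark gateway role and its client configuration have most likely not been deployed to that host yet.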