07-24-2017 03:19 AM - edited 07-24-2017 03:24 AM
I'm having trouble running jobs and commands from the workbench.
I followed the CDSW installation guide. The steps I took can be summed up as follows:
Switch to Java 8 on the CDH cluster and on the CDSW machine
Deploy Spark 2 using the parcel and CSD on the CDH cluster
Validate using SparkPi: everything is OK
Set up CDSW on a dedicated node
Download and configure CDSW
cdsw init ran without errors
The problem:
I'm able to access the workbench, but when I try to run any template, analysis.R for example, I get the following message after 20 seconds of task inactivity:
Waiting for Spark configuration...
Have you fully deployed client configuration to your CDSW nodes?
The task then stays idle until it is automatically killed.
I looked at the Spark history server. No job was displayed, whether coming from CDSW or from anywhere else.
I'm wondering whether I skipped something in the Spark cluster configuration for CDSW.
I read the CDSW installation guide carefully, but I need hints on gathering additional debug information or on how to configure CDSW correctly with Spark 2.
For information: I also tried copying the Spark 2 configuration files /etc/spark2/conf/spark-defaults.conf and /etc/spark2/conf/spark-env.sh from the CDH worker nodes to the CDSW node.
Unfortunately, this made no difference.
Any feedback is welcome.
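One piece of debug information that can be gathered is what the copied spark-defaults.conf actually contains on the CDSW node. The following is a minimal Python sketch, assuming the file uses the usual one-property-per-line layout with the key and value separated by whitespace or "=". The helper name read_spark_defaults is hypothetical and is not part of CDSW or Spark.

```python
import re
from pathlib import Path


def read_spark_defaults(path):
    """Parse a spark-defaults.conf style file into a dict of properties.

    Assumption: one property per line, key and value separated by
    whitespace or '='; blank lines and lines starting with '#' are ignored.
    """
    props = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # The key runs up to the first whitespace or '='; the rest is the value.
        match = re.match(r"([^\s=]+)\s*=?\s*(.*)", line)
        if match is None:
            continue
        key, value = match.groups()
        props[key] = value
    return props
```

For example, read_spark_defaults("/etc/spark2/conf/spark-defaults.conf").get("spark.master") would show which master the node is configured to use, or None if the property is absent.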
07-24-2017 09:01 AM
Did you add the dedicated CDSW host to the cluster in CM?
From the documentation:
"Cloudera Data Science Workbench hosts must be added to your CDH cluster as gateway hosts, with gateway roles properly configured."
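A quick way to see whether client configuration has reached the CDSW host is to check that the expected files are present. This is only a rough sketch: the /etc/spark2/conf paths come from the post above, while /etc/hadoop/conf and the two XML file names are assumptions about a typical gateway host and may differ on your cluster. The function name missing_client_configs is hypothetical.

```python
from pathlib import Path

# Directories and files expected on a gateway host.
# The Spark 2 entries are taken from the thread; the Hadoop entries
# are an assumption and should be adjusted to the actual deployment.
EXPECTED = {
    "/etc/spark2/conf": ["spark-defaults.conf", "spark-env.sh"],
    "/etc/hadoop/conf": ["core-site.xml", "yarn-site.xml"],
}


def missing_client_configs(expected=EXPECTED, root="/"):
    """Return the expected client configuration files that do not exist.

    `root` lets the check run against another filesystem root,
    which is mainly useful for testing.
    """
    missing = []
    for directory, names in expected.items():
        base = Path(root) / directory.lstrip("/")
        for name in names:
            candidate = base / name
            if not candidate.is_file():
                missing.append(str(candidate))
    return missing


if __name__ == "__main__":
    for path in missing_client_configs():
        print("missing:", path)
```

An empty result only shows that the files exist. It does not confirm that the gateway roles were assigned in CM or that the files are current.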