Connecting to hdfs/creating the sparksession from cdsw

Re: Connecting to hdfs/creating the sparksession from cdsw

Rising Star

Hi,

 

CDSW runs an overlay network on top of your CDSW hosts, and the pods get their IPs from it (100.66.x.x).

 

Based on your description, it seems that DNS resolution is not working from inside the container while it works on the host. This can happen when multiple nameservers are configured in /etc/resolv.conf but some of them can't resolve your clouderamaster. You could figure out which nameserver can resolve your host and drop the rest, or make sure that all nameservers can resolve the clouderamaster.

I like to use the `dig @nameserver clouderamaster.com` command to test this.
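
If there are several nameservers to check, the same `dig` test can be looped over every entry in /etc/resolv.conf. The following is only a rough sketch, not an official CDSW tool: the function names are made up for illustration, it assumes `dig` is installed wherever you run it (ideally inside a CDSW session), and the hostname is the placeholder used in this thread.

```python
import subprocess


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments (resolv.conf allows '#' and ';').
        if not stripped or stripped[0] in "#;":
            continue
        parts = stripped.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def dig_command(nameserver, hostname):
    """Build the dig invocation that queries one specific nameserver."""
    return ["dig", "@" + nameserver, hostname, "+short"]


def check_nameservers(hostname, resolv_conf_path="/etc/resolv.conf"):
    """Query each configured nameserver and map it to the answers it gave."""
    with open(resolv_conf_path) as handle:
        servers = parse_nameservers(handle.read())
    results = {}
    for server in servers:
        try:
            proc = subprocess.run(
                dig_command(server, hostname),
                capture_output=True, text=True, timeout=30,
            )
            # dig prints diagnostics as lines starting with ';'; drop them.
            answers = [
                ln for ln in proc.stdout.splitlines()
                if ln.strip() and not ln.startswith(";")
            ]
        except (OSError, subprocess.TimeoutExpired):
            answers = []
        results[server] = answers
    return results


if __name__ == "__main__":
    # Placeholder hostname taken from the thread; replace with your own.
    for server, answers in check_nameservers("clouderamaster.lab.test.com").items():
        status = ", ".join(answers) if answers else "NO ANSWER"
        print(server, "->", status)
```

Any nameserver that reports NO ANSWER is a candidate to fix or to remove from /etc/resolv.conf.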

 

Regards,

Peter

Re: Connecting to hdfs/creating the sparksession from cdsw

Explorer

@peter.ableda Hi Peter, when we say we need to add the DNS entry for the master host, do we mean the DNS entry for the clouderamaster host or for the cdswmaster host?

 

So far I have added the DNS entry for the CDSW master host. Also, do we need to add the extra dot (.) after the hostname as shown in the documentation (*.cdsw.lab.test.com. / cdsw.lab.test.com.)? Sorry, I am a little confused by the docs.

Re: Connecting to hdfs/creating the sparksession from cdsw

Rising Star

The original issue you reported was an UnknownHostException on the clouderamaster.

 

hdfs dfs -put data/sample_text_file.txt /tmp
-put: java.net.UnknownHostException: clouderamaster.<domain>.com

 

You need to make sure that this host can be resolved (both forward/reverse) from inside a CDSW session via DNS.
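
One way to check this from a Python session in CDSW is with the standard `socket` module. This is a minimal sketch rather than a Cloudera-provided check: the function name and the injectable `forward`/`reverse` parameters are only there for illustration and testing, and the hostname is the placeholder from this thread.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, then resolve that IP back to a name."""
    result = {"hostname": hostname, "ip": None,
              "reverse_name": None, "ok": False, "error": None}
    try:
        ip = forward(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = "forward lookup failed: %s" % exc
        return result
    result["ip"] = ip
    try:
        name, _aliases, _addresses = reverse(ip)
    except OSError as exc:  # socket.herror is a subclass of OSError
        result["error"] = "reverse lookup failed: %s" % exc
        return result
    result["reverse_name"] = name
    # Compare case-insensitively and ignore a trailing dot.
    result["ok"] = name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return result


if __name__ == "__main__":
    # Placeholder hostname from the thread; replace with your CDH master.
    print(check_forward_reverse("clouderamaster.lab.test.com"))
```

If `ok` is False, the `error` and `reverse_name` fields show whether the forward lookup, the reverse lookup, or a name mismatch is the problem.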

 

Since you can start a CDSW session and interact with it, you have already configured the DNS entry for the CDSW master properly.

 

Regards,

Peter

Re: Connecting to hdfs/creating the sparksession from cdsw

Explorer

@peter.ableda You wrote: "You need to make sure that this host can be resolved (both forward/reverse) from inside a CDSW session via DNS." Does that mean we need to add another DNS entry for the CDH master host (clouderamaster.lab.test.com) so that it is accessible from the CDSW master host?

Re: Connecting to hdfs/creating the sparksession from cdsw

Rising Star

Yes.

Re: Connecting to hdfs/creating the sparksession from cdsw

Explorer

@peter.ableda Thanks Peter. I am now able to submit the Spark job from the CDSW master. Does Cloudera provide user-level isolation for access to CDSW projects and content, given that different users could otherwise disturb or edit the same content?

Re: Connecting to hdfs/creating the sparksession from cdsw

Rising Star

We have a collaboration page in the documentation:

https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_collaborate.html

 

We also have a page about Kerberos authentication:

https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_kerberos.html

 

I hope this answers your question.

 

Regards,

Peter

 

Re: Connecting to hdfs/creating the sparksession from cdsw

Explorer

@peter.ableda Thanks for your time, Peter, and have a great day!