Connecting to hdfs/creating the sparksession from cdsw
Created on ‎11-14-2018 10:47 AM - edited ‎09-16-2022 06:53 AM
Hi Team,
I have installed CDSW successfully, but when I run an hdfs command or try to create a SparkSession from the CDSW terminal, I get the error below. Any idea what I am missing in the setup? Thanks in advance!
Error:
hdfs dfs -put data/sample_text_file.txt /tmp clouderamaster.<domain>.com
-put: java.net.UnknownHostException:
clouderamaster.<domain>.com is my CDH master server.
cdsw.<domain>.com is the CDSW master (from where I am running the hdfs command / SparkSession in the interactive terminal).
Created ‎11-23-2018 03:38 AM
Created on ‎11-15-2018 08:01 PM - edited ‎11-15-2018 08:01 PM
Also, one more point: I am unable to ping clouderamaster from the CDSW master host in the CDSW terminal. Any suggestions on what I am missing here?
Created ‎11-17-2018 09:07 AM
@cjervis Sorry to tag you on the above query. Would you be able to guide me on what I am missing in the above issue?
Created ‎11-18-2018 05:12 AM
Unfortunately I do not have experience with CDSW but perhaps @tristanzajonc or @peter_ableda can be of assistance.
Cy Jervis, Manager, Community Program
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Created ‎11-18-2018 07:06 AM
@cjervis Thanks for your quick response.
Created ‎11-19-2018 12:45 AM
Hi,
You seem to have a network connectivity issue. I have seen many variations of this. Please check that:
- you don't have a firewall between the machines,
- you resolve the hosts with DNS and not /etc/hosts,
- DNS can do forward and reverse resolution on your master hostname/IP.
I think if all of the above are OK, this should work.
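The forward/reverse resolution check in the list above can be sketched in Python and run from the CDSW session. This is only a minimal illustration using the standard `socket` module; the function name `check_dns` and its injectable `forward`/`reverse` parameters are assumptions made for this example, not part of CDSW or CDH.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Check forward (name -> IP) and reverse (IP -> name) resolution.

    `forward` and `reverse` default to the system resolver and are
    parameters only so the check can be exercised without a network.
    """
    result = {"hostname": hostname, "ip": None, "reverse_name": None, "ok": False}

    try:
        ip = forward(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = f"forward lookup failed: {exc}"
        return result
    result["ip"] = ip

    try:
        name, _aliases, _addresses = reverse(ip)
    except OSError as exc:  # socket.herror is a subclass of OSError
        result["error"] = f"reverse lookup failed: {exc}"
        return result
    result["reverse_name"] = name

    # Treat the check as passed only if the reverse name matches the original
    # (case-insensitive, ignoring a trailing dot).
    result["ok"] = name.lower().rstrip(".") == hostname.lower().rstrip(".")
    return result
```

Calling `check_dns("clouderamaster.<domain>.com")` with the real hostname substituted, from the session terminal, would show whether the failure is in the forward lookup, the reverse lookup, or a mismatch between the two.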
Regards,
Peter
Created on ‎11-19-2018 02:07 AM - edited ‎11-19-2018 03:54 AM
@peter_ableda Thanks for your response.
Step 1: I have checked and we don't have any firewall between my machines.
Step 2: "DNS can do forward/reverse resolution on your master hostname/ip": yes, it is working for the cdswmaster host where I have installed CDSW (added to DNS as below, which resolves both forward and backward).
Step 3: I have added two DNS entries as below for the cdswmaster host (cdswmaster.lab.test.com):
*.cdswmaster.lab.test.com IN A IP
cdswmaster.lab.test.com IN A IP
I am a little confused here: do I need to add the DNS names as above, or do I need to change them from cdswmaster to cdsw (cdsw.lab.test.com) as per the documentation?
Step 4: I checked the terminal from the CDSW web UI and I am unable to ping clouderamaster.lab.test.com (where Cloudera Manager is installed). But I am able to ping the terminal IP from clouderamaster.lab.test.com.
I believe some setup must be corrected in order to ping the clouderamaster host. Please advise.
Also, do you see any other setup I need to correct on my end to get going with CDSW? Sorry for being a little demanding; I find CDSW a little complex in terms of setup.
Created ‎11-19-2018 03:59 AM
You need to make sure that forward/reverse DNS resolution works from the CDSW terminal to the host where you have the YARN ResourceManager and HDFS NameNode services. You referred to this as clouderamaster.<domain>.com before.
This issue is not related to the CDSW master DNS resolution. You mentioned that you are using the session terminal; since that works, the CDSW master DNS is configured properly.
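Since the lookup has to succeed from inside the session rather than on the CDSW host, it can help to see which resolvers the session is actually configured with. The following is a rough sketch that parses resolv.conf-style text; the helper name `parse_resolv_conf` is hypothetical, and it assumes the session exposes a standard `/etc/resolv.conf`.

```python
def parse_resolv_conf(text):
    """Extract nameserver addresses and search domains from resolv.conf text."""
    nameservers = []
    search = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (resolv.conf allows '#' and ';').
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if parts[0] == "nameserver" and len(parts) > 1:
            nameservers.append(parts[1])
        elif parts[0] in ("search", "domain"):
            # The last search/domain line in the file wins.
            search = parts[1:]
    return {"nameservers": nameservers, "search": search}
```

In the session terminal this could be run as `parse_resolv_conf(open("/etc/resolv.conf").read())`. If the listed nameservers cannot resolve the CDH master hostname, the `UnknownHostException` would be expected even though the CDSW host itself resolves it.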
Regards,
Peter
Created ‎11-19-2018 08:49 AM
@peter_ableda Thanks Peter for your detailed explanation and your valuable time. I will check how I can access the CDH master node (YARN / HDFS NameNode) from my CDSW terminal. Not sure whether the HTTP proxy setup will help.
Created on ‎11-20-2018 01:59 AM - edited ‎11-20-2018 02:12 AM
@peter_ableda One final question. In my CDSW terminal the IP address is different (it is the pod IP, not the CDSW host IP), and I think that is the reason I am unable to connect to the master by hostname (clouderamaster.<domain>.com), although I am able to ping the master host by IP from the CDSW terminal.
Also, I have not set up Hadoop authentication in the CDSW admin web UI. Do you think it is required in order to access the cluster, and where can I find the principal/username and password/keytab to get access to the cluster?
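The symptom described above (the master answers by IP but not by hostname) can be separated into "name resolution failed" versus "connection failed" with a short TCP probe. This is a minimal sketch using the standard `socket` module; `can_connect` is a hypothetical helper, and the port in the usage note is an assumption that should be replaced with whatever the cluster's `fs.defaultFS` actually uses.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try a TCP connection and report whether DNS or the connection failed."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.gaierror as exc:
        # Raised when the hostname cannot be resolved at all.
        return False, f"name resolution failed: {exc}"
    except OSError as exc:
        # Covers refused connections, timeouts, and unreachable networks.
        return False, f"connection failed: {exc}"
```

For example, comparing `can_connect("clouderamaster.<domain>.com", 8020)` with `can_connect("<master ip>", 8020)` (8020 being a commonly used NameNode RPC port, assumed here) would show whether only the hostname lookup is broken inside the pod.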
