Created 10-11-2017 12:54 PM
Hi,
I am new to Hortonworks and am trying to learn it. I have installed the HDP-2.6.1.0 VirtualBox sandbox and it is running fine. I have a few questions:
1- I am not able to see any links on http://127.0.0.1:8888/splash2.html. The page shows only headings such as Ambari, Atlas, etc., but none of them are links.
2- How can I map the internal IP to the machine IP in my hosts file? I tried changing my network to Ethernet Bridge, but once I do so, I can no longer open localhost either, i.e. http://127.0.0.1:8080/
3- How can I connect to the Hadoop cluster?
4- Also, I am trying to connect to it through Talend and I get the errors below:
Namenode:
org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.TimeoutException at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:51)
Resource Manager :
org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: org.talend.designer.hdfsbrowse.exceptions.HadoopServerException: java.util.concurrent.ExecutionException: java.lang.reflect.InvocationTargetException at org.talend.designer.hdfsbrowse.hadoop.service.check.AbstractCheckedServiceProvider.checkService(AbstractCheckedServiceProvider.java:51)
As I am new to Hortonworks and still learning it, any guidance on how I can achieve these is greatly appreciated.
Created 10-11-2017 02:58 PM
Hi @Sonu Singh
(Great name btw! :))
Welcome to the Hortonworks Sandbox. It is a great tool for starting to learn the Hortonworks platform. I'll try to answer your questions in order.
1) I wouldn't worry too much about the splash page. If you can hit Ambari (http://127.0.0.1:8080) then you are good. The rest can be done via the web at https://hortonworks.com/tutorials/
2) On my laptop's hosts file, I have an entry as follows: 127.0.0.1 sandbox.hortonworks.com sandbox
When I need to connect to the sandbox, I just use the hostname and haven't had any issues. In fact, you should probably try to ssh to your sandbox on port 2222, log in as root, and change the Ambari admin password now if you haven't already done so.
ssh -p 2222 root@sandbox.hortonworks.com
(default pw is hadoop)
then as root, run: ambari-admin-password-reset
3) Most of the tutorial work on the Sandbox can be done through Ambari, so logging into Ambari and using the occasional terminal via SSH as shown above should be all you need to connect to the cluster to get started.
4) The Sandbox uses Docker internally, in addition to the VMware/VirtualBox layer on your machine. This can sometimes make connecting third-party tools (such as Talend) to it a bit tricky because of port forwarding issues. The port has to be open not only on your VirtualBox layer, but also within the Docker host internally. See this link for a good explanation and guide on opening other ports: https://hortonworks.com/tutorial/sandbox-port-forwarding-guide/section/1/
For Talend specifically, there is a good picture of how to configure the Talend side here:
https://www.talendforge.org/forum/img/members/256670/Capture-decran-2016-03-11-a-12_17_58.png
Original conversation here: https://www.talendforge.org/forum/viewtopic.php?pid=173639
Created 10-11-2017 09:54 PM
Hi @Sonu Sahi,
Thanks for your detailed input. I have followed the steps, and my follow-up questions are below:
1- I am able to log in to Ambari, but is there any specific reason why I am not able to use the splash page, and what is the splash page for?
2- I modified my hosts file as suggested and tried to ssh using PuTTY, but I get the error 'Unable to open Connection to sandbox. Host does not exist'. Do I need to do something else?
3- Does this mean I can only SSH to the cluster through the terminal and not through PuTTY?
4- Which port do I need to open? The link provides well-documented steps for port forwarding, but as I am new to Hortonworks, please let me know which port I should open to resolve the issue I am facing in Talend.
Thanks Again for your inputs.
Created 10-11-2017 10:21 PM
hi @Sonu Singh
You are very welcome.
I'm not sure why the splash screen is not loading in your specific case. I've had that happen on one specific version of the sandbox in the past where there was an issue with the page; I'm not sure if that is the case for you. The goal of that page is to provide a starting point for users to get to Ambari, get to other public links like the Hortonworks documentation/tutorials, and also access a web SSH tool (127.0.0.1:4200). The splash page itself is not needed though; one can access each of those things individually as well. In fact, I would say that most people go directly to Ambari and don't bother with the splash page once they have used the sandbox a couple of times.
Are you using Windows, Mac or Linux? On a *nix/Mac machine, /etc/hosts is the file that maps a hostname to an IP address, and that is where I pasted that snippet from. My VirtualBox adapter remains in NAT mode, btw. I do also usually create a second network interface for a vbox host-only adapter, if it is not already present.
You can ssh to the cluster using whatever tool you like, just be sure to use the correct port (2222 for cluster access, 2122 for Docker host access, which is normally not needed).
If a tool is just accessing Hive on the standard port, there is likely no port forwarding required. The default HiveServer2 port (10000) is usually open by default on the sandbox machines. A connection string like jdbc:hive2://sandbox.hortonworks.com:10000/default should work for the sandbox once your hosts file hostname-to-IP mapping issue is sorted.
good luck!
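To see which of the ports mentioned above are actually reachable from the host machine, a small script like the following may help. It is only a sketch: the function names and the port table are made up for illustration, and the port numbers are simply the ones quoted in this thread (8080 for Ambari, 2222 for SSH, 4200 for web SSH, 10000 for HiveServer2).

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hostnames.
        return False

# Ports quoted in this thread; adjust if your sandbox is configured differently.
SANDBOX_PORTS = {
    "Ambari": 8080,
    "SSH (sandbox)": 2222,
    "Web SSH": 4200,
    "HiveServer2": 10000,
}

def check_sandbox(host="127.0.0.1"):
    """Map each service name to True/False depending on whether its port answers."""
    return {name: port_open(host, port) for name, port in SANDBOX_PORTS.items()}

if __name__ == "__main__":
    for name, ok in check_sandbox().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

Running it once with the default 127.0.0.1 and once with check_sandbox("sandbox.hortonworks.com") can help separate a port forwarding problem from a hostname resolution problem.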
Created 10-13-2017 07:35 PM
Hi @Sonu Sahi
I am using Windows 10 and I have not made any changes to my VirtualBox adapter, so it is still in NAT mode.
I am not sure why I am not able to access the cluster through SSH using PuTTY. I followed what you described and used port 2222, but it still gives the error. Am I missing something? If possible, can you show me the steps to achieve this? From my tool I am trying to access the Hadoop cluster by configuring the connection details in the component 'tHDFSConnection', but it is failing and throwing the errors that I pasted in my question. Is there any way to overcome this, please?
Created 10-13-2017 07:59 PM
Step 1: Modify the hosts file on your machine and reboot if required.
On Windows 10, the hosts file that should be edited is usually at c:\Windows\System32\Drivers\etc\hosts with a line similar to: 127.0.0.1 sandbox.hortonworks.com
Step 2: Ensure the Sandbox VM is up and running (can you reach Ambari to confirm?)
Step 3: Open a terminal, PuTTY, or whatever SSH tool you prefer and try SSH'ing to sandbox.hortonworks.com on port 2222 with user root.
Step 4: If Step 3 did not work, try using the WebSSH tool built into the Sandbox: http://sandbox.hortonworks.com:4200 or http://127.0.0.1:4200
I can't think of anything else that would be in the way. I doubt very much that your Windows firewall (if enabled) would block access to something running on the local host, but it won't hurt to double-check antivirus or firewall tools to see if they are blocking access.
Maybe attach some screenshots of the VM running, and your VM config to see if something else jumps out at us. I suspect once we figure out why you cannot connect to the Sandbox, it may shed light on the 3rd party tool connections to the HiveServer2 running on the Sandbox as well.
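One way to double-check Step 1 is to parse the hosts file and confirm that the name typed into the SSH tool is really mapped. The sketch below is illustrative only; lookup_in_hosts is a hypothetical helper, not part of any Hortonworks tooling. Note that if the hosts line lists only sandbox.hortonworks.com, the short name "sandbox" will not resolve from that entry.

```python
def lookup_in_hosts(hosts_text, hostname):
    """Return the IP mapped to hostname in hosts-file text, or None if absent."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Drop anything after '#', which starts a comment in a hosts file.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if wanted in (name.lower() for name in names):
            return ip
    return None

if __name__ == "__main__":
    import platform
    # Usual hosts file locations; adjust if your system differs.
    path = (r"C:\Windows\System32\drivers\etc\hosts"
            if platform.system() == "Windows" else "/etc/hosts")
    with open(path, encoding="utf-8", errors="replace") as f:
        text = f.read()
    for name in ("sandbox.hortonworks.com", "sandbox"):
        print(name, "->", lookup_in_hosts(text, name))
```

If either name prints None, that name is not mapped in the hosts file and should not be used as the host in PuTTY or Talend.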
Created 10-13-2017 08:00 PM
I think you have this already, but here's the Sandbox tutorial link with additional detail and screenshots to the steps I've written above: https://hortonworks.com/tutorial/learning-the-ropes-of-the-hortonworks-sandbox/
Created 10-13-2017 09:15 PM
Hi @Sonu Sahi
I am now able to SSH to the cluster using the steps described above. Thanks a lot.
Also, I am able to connect to the HDFS directory through the Talend component. I used NameNode URI = 'hdfs://sandbox.hortonworks.com:8020/' and user root. However, when I now run the job I get the error below: '[ERROR]: org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path'
I googled it and found that I have to add the variable HADOOP_HOME. But what path should I give to this variable?
Created 10-14-2017 11:36 PM
Hi @Sonu Sahi
Could you please help with the HADOOP_HOME configuration variable?
Created 10-16-2017 02:38 PM
Hi @Sonu Singh
I haven't encountered a HADOOP_HOME environment variable problem on the Sandbox before, but see this post on config for it:
https://community.hortonworks.com/questions/109738/haddops-environment-variables.html
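As a rough guide to the path question: on Windows, the Hadoop client libraries look for winutils.exe in a bin folder under HADOOP_HOME (or under the hadoop.home.dir Java system property), so HADOOP_HOME should point to the folder that contains bin, not to bin itself. The sketch below checks that layout; find_winutils is a hypothetical helper and C:\hadoop is only an example location, not something the sandbox or Talend creates for you.

```python
import os

def find_winutils(env=None):
    """Return the path to winutils.exe under HADOOP_HOME/bin, or None if missing."""
    env = os.environ if env is None else env
    home = env.get("HADOOP_HOME")
    if not home:
        return None  # HADOOP_HOME is not set at all
    candidate = os.path.join(home, "bin", "winutils.exe")
    return candidate if os.path.isfile(candidate) else None

if __name__ == "__main__":
    found = find_winutils()
    if found:
        print("winutils found at", found)
    else:
        # Example layout (hypothetical path): HADOOP_HOME=C:\hadoop
        # with the binary at C:\hadoop\bin\winutils.exe
        print("HADOOP_HOME is unset or has no bin/winutils.exe")
```

Environment variables are read when a program starts, so Talend would likely need a restart after HADOOP_HOME is changed.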