
Ambari support for a Hadoop cluster that is distributed across multiple machines


Explorer

Hi,

I recently set up a 4-node Hadoop cluster using Vagrant and Ambari, with all nodes running in VirtualBox on a single physical machine. I then decided to change the topology: remove 2 data nodes from that machine and add 2 data nodes on a second physical machine running its own VirtualBox. I have created the two hosts on the second machine under VirtualBox with Vagrant.

In the 'Add Host Wizard', the 'Confirm Hosts' step fails with the error:

"Host checks were skipped on 1 hosts that failed to register."

Looking at the ambari-server log at

/var/log/ambari-server/ambari-server.log

I found the following log error:

INFO:root:BootStrapping hosts ['c7003.ambari.apache.org'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: redhat7 with user 'vagrant' sshKey File /var/run/ambari-server/bootstrap/6/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/6 ambari: c7001.ambari.apache.org; server_port: 8080; ambari version: 2.2.2.0; user_run_as: root
INFO:root:Executing parallel bootstrap
ERROR:root:ERROR: Bootstrap of host c7003.ambari.apache.org fails because previous action finished with non-zero exit code (1)
ERROR MESSAGE: Connection to c7003.ambari.apache.org closed.

This suggests that ambari-server is trying to start the c7003 node on its local VirtualBox, although that node is already running on the other machine. Passwordless SSH between the two physical machines is set up, and I checked that it works in both directions.
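When several hosts are added at once, it can help to pull the failing ones out of the log automatically. The following is a minimal sketch that assumes the error line always has the wording shown in the excerpt above; the function name `failed_bootstraps` is made up for illustration and is not part of Ambari.

```python
import re

# Pattern taken from the error line quoted in this post; other Ambari
# versions may word the message differently (assumption).
BOOTSTRAP_FAILURE = re.compile(
    r"ERROR: Bootstrap of host (\S+) fails because previous action "
    r"finished with non-zero exit code \((\d+)\)"
)


def failed_bootstraps(log_text):
    """Return a list of (hostname, exit_code) pairs found in log_text."""
    return [(host, int(code))
            for host, code in BOOTSTRAP_FAILURE.findall(log_text)]
```

The text passed in would typically be the contents of /var/log/ambari-server/ambari-server.log.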

Does Ambari support a cluster that is distributed across multiple machines?

Thanks, Zeev

1 ACCEPTED SOLUTION


Re: Ambari support for Hadoop cluster that is distributed across multiple machines

Can the Ambari server see all the virtual machines on the other machine? That is, are they on the same network, and can the Ambari server machine resolve their hostnames?

If so, can root on the Ambari server machine log into the virtual machines on the other machine without a password?

These are a few of the things that need to work during registration.
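Both checks can be scripted from the Ambari server machine. The sketch below is only an illustration: the helper names are hypothetical, and the user, host, and key path passed in would be whatever your setup uses. It relies on the OpenSSH client option `BatchMode=yes`, which makes `ssh` fail instead of prompting for a password.

```python
import socket
import subprocess


def resolves(hostname):
    """Return True if this machine can resolve hostname to an IP address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def ssh_check_command(user, host, key_path):
    """Build an ssh command that fails rather than asking for a password."""
    return [
        "ssh",
        "-i", key_path,
        "-o", "BatchMode=yes",      # never fall back to a password prompt
        "-o", "ConnectTimeout=5",   # give up quickly if the host is unreachable
        f"{user}@{host}",
        "true",                     # trivial remote command
    ]


def can_login(user, host, key_path):
    """Return True if key-based login works (needs the ssh client on PATH)."""
    result = subprocess.run(ssh_check_command(user, host, key_path),
                            capture_output=True)
    return result.returncode == 0
```

For example, `resolves("c7003.ambari.apache.org")` followed by `can_login("vagrant", "c7003.ambari.apache.org", key_path)`, run on the machine hosting the Ambari server, covers the two questions above.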

Re: Ambari support for Hadoop cluster that is distributed across multiple machines

Explorer

As I already mentioned, the 2 physical machines can communicate over passwordless SSH in both directions using their own RSA key pairs. On the other hand, the virtual host c7001 (where Ambari resides) on machine 1 can't communicate with the virtual host c7003 on machine 2, because they rely on the single insecure_private_key, which Vagrant uses for SSH communication between virtual hosts inside one VirtualBox.

So I'm unsure whether I should create an additional private/public key pair for the virtual hosts on the separate machines, to establish passwordless SSH between them as well as between the physical machines.

Then the question arises of which key to provide in the Ambari 'Add Host Wizard' -> 'Install Options' -> 'Provide your SSH private key'.

Are there formal guidelines, docs, etc. for using Ambari to install a Hadoop cluster in a distributed manner?


Re: Ambari support for Hadoop cluster that is distributed across multiple machines

Super Guru

"Does ambari support the cluster that is distributed across multiple machines?" As long as the machines are isolated, with correct IP routing and forwarding, I don't see this as a blocker. Each VM will essentially require its own DNS name and IP address, which you will have to configure through VirtualBox. If they all have the same IP, things can get extremely tricky with port forwarding.
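One way to keep the addressing consistent is to maintain a single hostname-to-IP mapping and generate the /etc/hosts entries for every VM from it, refusing to continue if two VMs share an address. This is a rough sketch: the function names are hypothetical, and any IP addresses you feed in depend on how your VirtualBox networks are configured.

```python
from collections import Counter


def duplicate_ips(host_ips):
    """Return the sorted IP addresses assigned to more than one hostname."""
    counts = Counter(host_ips.values())
    return sorted(ip for ip, n in counts.items() if n > 1)


def hosts_file_lines(host_ips):
    """Render /etc/hosts lines ("IP FQDN shortname") from a {fqdn: ip} dict."""
    dups = duplicate_ips(host_ips)
    if dups:
        raise ValueError(f"IP addresses assigned to more than one VM: {dups}")
    return [f"{ip} {fqdn} {fqdn.split('.')[0]}"
            for fqdn, ip in sorted(host_ips.items())]
```

The same lines would then be appended to /etc/hosts on every VM and on the Ambari server host, so that each node resolves all the others by name.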


Re: Ambari support for Hadoop cluster that is distributed across multiple machines

Explorer

Finally it worked.

I needed to install a separate key pair for passwordless SSH communication, but still used insecure_private_key for registration in Ambari. Thanks.
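The post does not say exactly how the separate key pair was installed. One common approach, sketched here as an assumption, is to generate a key with `ssh-keygen` and push its public half to the target host with `ssh-copy-id`. The helper names below are hypothetical; the commands are returned as argument lists so they can be inspected before running them with `subprocess.run(cmd, check=True)`.

```python
def keygen_command(key_path):
    """Build an ssh-keygen command for a 4096-bit RSA key, no passphrase."""
    return ["ssh-keygen", "-t", "rsa", "-b", "4096", "-N", "", "-f", key_path]


def copy_id_command(key_path, user, host):
    """Build an ssh-copy-id command that installs key_path.pub on the host."""
    return ["ssh-copy-id", "-i", key_path + ".pub", f"{user}@{host}"]
```

Note that `ssh-copy-id` needs some existing way to log in to the target (it usually prompts for the password once), so it has to be run interactively the first time.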
