
Hortonworks Data Cloud Error: Cluster Install Failed - Could not reach ssh connection in time

Rising Star

Hi all, I'm trying out the new Hortonworks Data Cloud but am encountering the error "Infrastructure creation failed. Reason: Operation timed out. Could not reach ssh connection in time". Any advice on what is causing this issue? Also, is there a way to check or track the progress of the install? The information in the UI is quite limited, and it took nearly four hours before the job failed. Thanks!

[Attached screenshot: 6708-screen-shot-2016-08-17-at-65808-am.png]

1 ACCEPTED SOLUTION


7 REPLIES


@KC

Please check that the ssh service has started on the server, and check whether a firewall is blocking the port.
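
For example, on a RHEL/CentOS-style system (the service name and commands may differ on your distro) you could run something like:

  # check whether the sshd service is running
  service sshd status

  # list the iptables rules to see whether port 22 is being blocked
  iptables -L -n | grep -w 22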

Rising Star

Hi @Ashnee Sharma, the whole installation process is supposed to be automated by Hortonworks Data Cloud, so I'm confused about why there would be any firewall or SSH issues. What specifically should I check for, and is there a more detailed log for this error?


@KC

You can check /var/log/messages, or try to restart the ssh service with the command service ssh restart.
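
For example (the log path and service name are typical for RHEL-style systems and may differ on yours):

  # search recent system log entries for ssh-related messages
  tail -n 200 /var/log/messages | grep -i ssh

  # restart the SSH daemon
  service sshd restart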

Expert Contributor

Hi @KC

Most likely, Cloudbreak, which runs on the Control Plane (the machine where your cloud UI is running), could not connect to the newly created AWS instances over SSH. This part of the installation usually takes a few minutes at most.

For more details, we should check the Cloudbreak logs on the Control Plane and, if the machines still exist, the instances created on the AWS side. Could you please check that you can SSH to the provisioned instances?
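
For example, a quick check from your own machine could look like this (the key path, user, and IP below are placeholders; use the key pair you selected at launch):

  # try to reach one of the provisioned instances directly
  ssh -i ~/.ssh/my-keypair.pem -o ConnectTimeout=10 <ssh-user>@<instance-public-ip>

  # a plain TCP check of port 22 also tells you whether a firewall
  # or security group is dropping the connection
  nc -zv <instance-public-ip> 22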

Br,

Tamas

Rising Star

Hi @Tamas Bihari, thanks for your assistance!

It seems the error was caused by my limiting the Remote Access CIDR IP during setup to my own IP, which may have prevented Cloudbreak on the Control Plane from accessing the instances. This looks like a design flaw to me, though; please correct me if I am mistaken.

Also, the error should be thrown much earlier, rather than making me wait four hours before the job fails. Let me know if this is the right place to raise these issues or if there is another channel I should post them to.
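
For anyone hitting the same issue, this is roughly how the security group can be inspected and temporarily widened from the AWS CLI (the group ID below is a placeholder):

  # show the inbound rules on the cluster's security group
  aws ec2 describe-security-groups --group-ids sg-0123456789abcdef0

  # temporarily widen SSH access while testing (tighten again afterwards)
  aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
      --protocol tcp --port 22 --cidr 0.0.0.0/0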

Best regards, KC

Expert Contributor

Hi @KC

Yes, you are right. The Remote Access CIDR setting can prevent access to the instances. In this particular case, the Control Plane, which runs Cloudbreak under the hood, could not SSH to the provisioned VMs because it has a different IP than yours.

We use a very long timeout for this phase of the installation because, for large clusters, it takes time to SSH to every instance (booting the instances also takes longer). But you are right, we need to find a way to identify this kind of situation more quickly. I will raise an issue for this.

Br,

Tamas

Rising Star

Hi @Tamas Bihari

I think the Control Plane should be given access to the cluster by default when the instances are being created; otherwise, the Remote Access CIDR IP field is pointless as it stands. Alternatively, there should be an option to enter multiple CIDR IPs.
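
As a workaround for now, extra CIDR ranges can be added to the security group after creation, for example (placeholder group ID and example IPs):

  # allow SSH from both my own IP and the Control Plane's IP
  for cidr in 203.0.113.5/32 198.51.100.7/32; do
      aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
          --protocol tcp --port 22 --cidr "$cidr"
  done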

Thanks, KC