Created on 08-16-2016 11:00 PM - edited 08-19-2019 04:51 AM
Hi all, I'm trying out the new Hortonworks Data Cloud but am encountering an error "Infrastructure creation failed. Reason: Operation timed out. Could not reach ssh connection in time". Any advice on what is causing this issue? Also, is there a means to check / track the progress of the install? The information on the UI is quite limited and it took nearly 4 hours before the job failed. Thanks!
Created 08-17-2016 10:01 AM
@KC
Please check that the SSH service is running on the server, and check whether a firewall is blocking the SSH port on the server.
Created 08-17-2016 10:20 AM
Hi @Ashnee Sharma, the whole installation process is supposed to be automated by Hortonworks Cloud, so I'm confused about why there would be any firewall or SSH issues. What specifically should I check for, and is there a more detailed log regarding the error?
Created 08-17-2016 02:40 PM
@KC
You can check /var/log/messages, or try restarting the SSH service with the command: service sshd restart (the service is named ssh on Debian and Ubuntu).
Created 08-18-2016 02:08 PM
Hi @KC
Cloudbreak, which runs on the Control Plane (the machine where your cloud UI is running), probably could not connect to the newly created AWS instances through SSH. This part of the installation usually takes a few minutes at most.
For more details, we should check the Cloudbreak logs on the Control Plane and, if the machines still exist, the instances that were created on the AWS side. Could you please check whether you can SSH to the provisioned instances?
Br,
Tamas
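A quick way to run the SSH reachability check suggested above is to test whether TCP port 22 on an instance accepts connections. The following is a minimal Python sketch, not part of Cloudbreak; the function name can_reach_ssh is hypothetical. Note that it only tests from the machine where it is run, so a success from your own computer does not prove that the Control Plane can connect.

```python
import socket


def can_reach_ssh(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable or unresolvable hosts.
        return False
```

For example, can_reach_ssh("203.0.113.10", timeout=3.0) would check a single instance, where 203.0.113.10 is a placeholder to be replaced with the public IP of a provisioned instance. A False result within the timeout points to a network or security group problem rather than a slow boot.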
Created 08-19-2016 02:10 AM
Hi @Tamas Bihari, thanks for your assistance!
It seems that the error was caused by my limiting the Remote Access CIDR IP to my own IP during setup, which may have prevented Cloudbreak on the Control Plane from accessing the instances. This appears to me to be a design flaw, though. Please correct me if I am mistaken.
Also, the error should be thrown much earlier rather than me having to wait four hours before the job fails. Let me know if this is the right place to highlight these issues or if there is another channel I should post them to.
Best regards, KC
Created 08-19-2016 07:21 AM
Hi @KC
Yes, you are right. The Remote Access setting can prevent access to the instances. In this case, the Control Plane, which runs Cloudbreak under the hood, could not SSH to the provisioned VMs because it has a different IP than yours.
We have a long timeout for this phase of the installation because, for large clusters, it takes time to SSH to every instance (and booting the instances also takes time). But you are right, we have to find a way to detect this kind of situation more quickly. I will raise an issue for this.
Br,
Tamas
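The mismatch described above can be illustrated with Python's standard ipaddress module. This is only a sketch: the addresses are placeholders from the reserved documentation ranges, not values from this thread, and is_allowed is a hypothetical helper, not a Cloudbreak or AWS API. It shows that a /32 rule limited to the user's own IP does not admit a Control Plane that has a different address.

```python
import ipaddress


def is_allowed(source_ip, allowed_cidrs):
    """Return True if source_ip falls inside any of the allowed CIDR blocks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# Placeholder addresses (assumptions for illustration only).
my_ip = "198.51.100.7"
control_plane_ip = "203.0.113.25"

# Remote Access CIDR restricted to the user's own address.
remote_access = [my_ip + "/32"]

print(is_allowed(my_ip, remote_access))             # True
print(is_allowed(control_plane_ip, remote_access))  # False

# Allowing the Control Plane as a second block lets both sources through.
remote_access.append(control_plane_ip + "/32")
print(is_allowed(control_plane_ip, remote_access))  # True
```

The last step assumes that more than one CIDR block can be configured, which is the kind of option a multi-CIDR Remote Access field would provide.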
Created 08-19-2016 12:12 PM
I think by default, the Control Plane must be given access to the cluster when the instances are being created. Otherwise, the Remote Access CIDR IP field is pointless right now. Or there needs to be an option to input multiple CIDR IPs.
Thanks, KC