Current issue I'm having:
* Cloudbreak deployment server stops serving the web UI and becomes unusable after 30-90 minutes of use.
* Restarting cbd ("cbd restart") at the shell prompt sometimes brings the GUI back up, but the username/password no longer works. When I use the Linux "passwd" command to reset the password, I can log in to the GUI, but it has lost all of my configurations and clusters and shows an error indicating that the UI cannot connect to Cloudbreak.
* I have now scrapped it and started over 3 times (generating 3 new CloudBreak deployment VMs) and had the same result every time.
More details and background:
I downloaded and installed Cloudbreak 2.7.1 for CentOS 7. I used it to generate a cbd-deployment virtual machine (in GCP), following the QuickStart (https://docs.hortonworks.com/HDPDocuments/Cloudbreak/Cloudbreak-2.7.1/content/gcp-quick/index.html#g...).
It ran successfully and gave me a new public IP address to log into via https://THE_IP_ADDRESS. I navigated to the Cloudbreak GUI and logged in with my admin user credentials. I went through the entire process of setting up certs and deploying a cluster according to one of the blueprints. Everything worked wonderfully!! So excited!
Then, half an hour later, when I tried using the Cloudbreak deployment GUI again, the server was no longer responding. I used ssh to connect to the Linux shell and see what was going on. I found a couple of topics here in the HCC forum that gave hints for troubleshooting the issue. Here are the two that seemed promising:
I tried a few things from the first article:
cd /var/lib/cloudbreak-deployment
cbd ps
cbd start
This didn't solve the problem. Reading further, I tried:
This brought the GUI back, but my username and password were rejected. I used the Linux "passwd" command to reset the password for my admin user, and once again I could get past the login screen. But, as noted above, an error popped up stating that it could not connect to Cloudbreak. A red badge at the upper-right side of the GUI highlighted [CloudBreak 0] as being in a failed state. All of the GUI screens (clusters, blueprints, credentials) showed up blank (no data). All is lost!!! 😉
Not knowing what else to do, I started over. I have checked that SELinux and firewalld are disabled on the VM that Cloudbreak generated for me. Initially, SELinux was disabled but firewalld was active (in the generated VM). So I disabled and stopped the firewalld service and ran "cbd restart". Now (on my 3rd dead cloudbreak-deploy VM), I can't seem to revive even the GUI to the point where it will let me log in.
Is the deployment VM generated by Cloudbreak 2.7.1 this unstable for everyone? Is CentOS 7 a bad mix with this version? Any suggestions, please! I would love to use Cloudbreak, but I'm losing a little confidence in its ability to not lose all my data and crash. Lol.
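For anyone repeating the SELinux and firewalld checks described above, here is a minimal sketch of how they could be scripted. The function name report_host_security is made up for illustration; getenforce and systemctl are the standard CentOS 7 tools, and the script only reports state, it changes nothing.

```shell
#!/bin/sh
# Sketch: print the current SELinux mode and firewalld state.
# report_host_security is a hypothetical helper name, not part of cbd.
report_host_security() {
    if command -v getenforce >/dev/null 2>&1; then
        # getenforce prints Enforcing, Permissive, or Disabled.
        echo "SELinux: $(getenforce)"
    else
        echo "SELinux: getenforce not available"
    fi
    # "systemctl is-active" prints the unit state (for example active or
    # inactive) and exits non-zero when the unit is not active.
    echo "firewalld: $(systemctl is-active firewalld 2>/dev/null)"
}
```

Calling report_host_security before and after "cbd restart" gives a quick record of whether either service state changed between attempts.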
Thanks in advance!
Could you please attach the output of the following to the case:
cd /var/lib/cloudbreak-deployment
cbd ps
cbd create-bundle
That will contain all the logs necessary to find out what happened, without any sensitive info in them.
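If it is useful to run this more than once, the same steps could be wrapped in a small shell function. This is only a sketch: the function name collect_cbd_diagnostics and the CBD_DIR variable are made up for illustration, and the only cbd subcommands used are the two named in this thread.

```shell
#!/bin/sh
# Sketch: gather Cloudbreak Deployer diagnostics in one pass.
# CBD_DIR defaults to the deployment directory used in this thread;
# the variable name itself is an assumption for this example.
CBD_DIR="${CBD_DIR:-/var/lib/cloudbreak-deployment}"

collect_cbd_diagnostics() {
    # Stop early if cbd is not on PATH.
    if ! command -v cbd >/dev/null 2>&1; then
        echo "cbd not found on PATH" >&2
        return 1
    fi
    # cbd must be run from the deployment directory.
    cd "$CBD_DIR" || return 1
    cbd ps              # show the status of the Cloudbreak containers
    cbd create-bundle   # build the log bundle to attach to the case
}
```

After sourcing the file, run collect_cbd_diagnostics as the user that owns the deployment directory and attach the resulting output and bundle.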
Hope this helps!
We are running into the same issue and are also working with support (they have been awesome). Interestingly enough, upgrading to 2.7.2 didn't fix the issue.
Sorry for the late response. Your observation was right: there was a remote update by Google in the launched instances, which resulted in Cloudbreak stopping after around one hour.
The fix is already merged.
Could you please try out the 2.7.3-rc.4 version, which already contains the fix?
Hope this helps, and sorry for the inconvenience.