I am consistently running into the above error using Cloudbreak (on GCP). I have tried between 5 and 10 times. Whether I configure through the UI or the command line, I always time out after 30 minutes on the 'Bootstrapping infrastructure cluster' step and end up with a message as above. The specific instance group may change; the example I am seeking help on referenced "instance group host_group_master_3".
I have searched Google and Hortonworks to no avail. Can anybody suggest what I need to do to get past this? I am zipping the log file and the Event History and attaching them here.
BTW, here is the Cloudbreak shell code that fails when using the command line. This was copied from https://github.com/sequenceiq/cloudbreak/tree/master/shell, except for the credential line, as the credential for "tmr-adm3" was previously created in the UI:
credential select --name tmr-adm3
blueprint select --name hdp-small-default
network select --name default-gcp-network
securitygroup select --name all-services-port
instancegroup configure --instanceGroup cbgateway --nodecount 1 --templateName minviable-gcp
instancegroup configure --instanceGroup host_group_client_1 --nodecount 1 --templateName minviable-gcp
instancegroup configure --instanceGroup host_group_master_1 --nodecount 1 --templateName minviable-gcp
instancegroup configure --instanceGroup host_group_master_2 --nodecount 1 --templateName minviable-gcp
instancegroup configure --instanceGroup host_group_master_3 --nodecount 1 --templateName minviable-gcp
instancegroup configure --instanceGroup host_group_slave_1 --nodecount 1 --templateName minviable-gcp
hostgroup configure --hostgroup host_group_client_1
hostgroup configure --hostgroup host_group_master_1
hostgroup configure --hostgroup host_group_master_2
hostgroup configure --hostgroup host_group_master_3
hostgroup configure --hostgroup host_group_slave_1
stack create --name wikistk --region US_CENTRAL1_B
Could you please check your quotas on Google Cloud? Which instance type are you using?
Thanks Richard. Checked quotas. Almost all are at 0%, while the largest is only 10% (Firewall rules). Any other ideas?
I mean, how much quota do you have, e.g. for CPUs? The thing is, maybe you don't have enough quota to provision a 4-node cluster (the default Google quota is very limited). Also, if you provision a cluster and SSH into the cbgateway machine, could you please share the munchausen logs with us? ('docker ps -a' will show you the container.)
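A sketch of the log collection step Richard describes, run on the cbgateway host. The container and image names below are stand-ins based on the 'munchausen' name mentioned in the thread; in a real session you would take `PS_OUTPUT` from `docker ps -a` rather than the sample text used here:

```shell
# Sample stand-in for real 'docker ps -a' output (container ID and image
# name are illustrative, not taken from an actual deployment).
PS_OUTPUT='CONTAINER ID   IMAGE                          NAMES
a1b2c3d4e5f6   sequenceiq/munchausen:latest   munchausen'
# On the real cbgateway host: PS_OUTPUT="$(docker ps -a)"

# Pick out the munchausen container ID and print the log command to run.
CONTAINER=$(echo "$PS_OUTPUT" | awk '/munchausen/ {print $1}')
echo "docker logs $CONTAINER"
```

Redirecting that `docker logs` output to a file (`docker logs "$CONTAINER" > munchausen.log 2>&1`) gives something attachable to the thread.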
@rdoktorics We have tried both the min-viable-GCP VMs and also n1-highmem-16 (16 CPU, 108 GB RAM) as the masters with n1-standard-8 (8 CPU, 30 GB RAM) as the slaves. These should be hefty enough, shouldn't they?
As to quotas, here is a list of everything that is above 0 in quota. If it is not on this list, that means we have not used any quota on those items.
CPUs us-central 19% 221 of 2,400
Networks 8% 4 of 50
Routes 3% 8 of 300
Snapshots 3% 749 of 25,000
Total persistent disk reserved (GB) us-central 12% 17,215 of 1,000,000
Static IP addresses us-central1 1% 9 of 700
Does this quota level still look like it might be causing the problem? Either way, what would you suggest next?
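For what it's worth, a back-of-the-envelope CPU check for the 6-node cluster in the script above, using the quota figures quoted in this thread. The master size is the n1-highmem-16 mentioned earlier; the sizes assumed for the gateway and client nodes are my assumption, not stated in the thread:

```shell
# CPU counts per machine type (n1-highmem-16 and n1-standard-8 from the thread;
# gateway/client assumed to be n1-standard-8).
MASTER_CPUS=16
OTHER_CPUS=8

# 3 masters + slave + gateway + client = 6 nodes total.
NEEDED=$((3 * MASTER_CPUS + 3 * OTHER_CPUS))

# Quota figures quoted above: 221 of 2,400 CPUs in use in us-central.
LIMIT=2400
IN_USE=221
AVAILABLE=$((LIMIT - IN_USE))

echo "CPUs needed: $NEEDED, CPUs available: $AVAILABLE"
# → CPUs needed: 72, CPUs available: 2179
```

On these numbers the cluster needs roughly 72 CPUs against ~2,179 available, so the CPU quota does not look like the bottleneck.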
@Thom Rogers Are you using the hosted Cloudbreak or your own instance? We had an issue on our hosted Cloudbreak instance, and it now works fine. Some people mentioned they could deploy clusters after I fixed the problems.
Not using hosted. I tried with both the manually installed Cloudbreak and the pre-defined image for GCP. Same result, as described in this post.
@rdoktorics What would be your suggestion at this point? The end goal is standing up an HDP 2.3 cluster on GCP using Cloudbreak.
@Thom Rogers We are currently investigating the issue intensively. Could we do a WebEx tomorrow? My email is firstname.lastname@example.org