Created 06-21-2019 04:13 PM
I am trying to create an HDF cluster using Cloudbreak, but the creation process is stuck at the step "Infrastructure metadata collection finished".
When I checked the Cloudbreak logs, I saw the following:
2019-06-21 15:34:36,443 [reactorDispatcher-57] pollWithTimeout:32 INFO c.s.c.s.PollingService - [owner:e315477d-074c-4ed1-ad47-1367a19b0567] [type:STACK] [id:5] [name:dv-data-ingestion-hdf-dev] [flow:7ed0ed96-2d9b-47c5-8f3d-fa8fd91a5049] [tracking:8454d361-1d3c-4e7b-9b03-c9184450d47a] Polling attempt 35.
2019-06-21 15:34:36,445 [reactorDispatcher-57] checkStatus:25 INFO c.s.c.s.s.f.NginxCertListenerTask - [owner:e315477d-074c-4ed1-ad47-1367a19b0567] [type:STACK] [id:5] [name:dv-data-ingestion-hdf-dev] [flow:7ed0ed96-2d9b-47c5-8f3d-fa8fd91a5049] [tracking:8454d361-1d3c-4e7b-9b03-c9184450d47a] Check if nginx is running on 130.6.234.168:9443.
I checked, and the Cloudbreak VM is able to connect to the Ambari server, and nginx is running on that server.
The nginx error log shows this:
2019/06/21 15:33:16 [error] 4913#0: *16 connect() failed (111: Connection refused) while connecting to upstream, client: 135.28.27.13, server: , request: "GET /favicon.ico HTTP/1.1", upstream: "http://127.0.0.1:8080/favicon.ico", host: "130.6.234.168", referrer: "https://130.6.234.168/ambari/"
Created 06-24-2019 07:29 AM
Hi @Kartheek Kopparapu! Has the cluster provisioning succeeded? Nodes take time to start, and during that time Cloudbreak polls each node's availability, which can cause temporary error messages.
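If you want to see whether the port Cloudbreak is polling is reachable at all, a quick TCP check run from the Cloudbreak VM may help. This is only a minimal sketch: the function name `port_open` is made up for illustration, and it only tests that a TCP connection can be opened, not that nginx answers correctly or that the certificate is valid.

```python
import socket


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure and timeout.
        return False
```

For example, `port_open("130.6.234.168", 9443)` uses the address and port from the "Check if nginx is running" log line in the question. A working SSH session only shows that port 22 is open, so it is worth testing 9443 separately.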
Created 06-24-2019 04:54 PM
The servers are up and running, but Cloudbreak is not able to poll them. The connection from the VM to the servers is working, as I was able to SSH to them from the VM. The AWS console shows the servers are configured and running properly, but the Cloudbreak UI just shows the "Create in progress" status.
Created 06-24-2019 05:26 PM
AWS is showing the servers as up and running. I was able to SSH to them from the Cloudbreak VM as well, but the Cloudbreak UI only shows the status "Create in progress".
The last event history message was "Infrastructure metadata collection finished".
I checked the Cloudbreak logs and there are no errors in them.
I see the messages below repeated over and over again:
2019-06-24 15:48:43,800 [scheduledExecutor-1] distributeFlows:166 INFO c.s.c.s.h.HeartbeatService - [owner:spring] [type:springLog] [id:] [name:] [flow:] [tracking:] Active CB nodes: (1)[[CloudbreakNode{uuid='52c98d7e-a196-4ec1-8378-5b8adfc2f83a', lastUpdated=1561391310002}]], failed CB nodes: (0)[[]]
2019-06-24 15:48:52,221 [http-nio-8080-exec-8] <init>:57 INFO c.s.c.f.MDCContextFilter - [owner:spring] [type:springLog] [id:] [name:] [flow:] [tracking:] No trackingId in request. Adding trackingId: 'f4e52ca0-73af-4f94-98a6-16093895beb8'
2019-06-24 15:48:52,224 [http-nio-8080-exec-8] getAllForAutoscale:311 INFO c.s.c.s.StackCommonService - [owner:undefined] [type:Autoscale] [id:] [name:] [flow:] [tracking:f4e52ca0-73af-4f94-98a6-16093895beb8] Get all stack, autoscale authorized only.
2019-06-24 15:49:00,004 [scheduledExecutor-2] cancelInvalidFlows:221 INFO c.s.c.s.h.HeartbeatService - [owner:spring] [type:springLog] [id:] [name:] [flow:] [tracking:] Check if there are termination flows for the following stack ids: [9]
2019-06-24 15:49:02,221 [http-nio-8080-exec-9] <init>:57 INFO c.s.c.f.MDCContextFilter - [owner:spring] [type:springLog] [id:] [name:] [flow:] [tracking:] No trackingId in request. Adding trackingId: 'e335bdd8-7dfb-48ad-a697-fad696d923de'
2019-06-24 15:49:02,223 [http-nio-8080-exec-9] getAllForAutoscale:311 INFO c.s.c.s.StackCommonService - [owner:undefined] [type:Autoscale] [id:] [name:] [flow:] [tracking:e335bdd8-7dfb-48ad-a697-fad696d923de] Get all stack, autoscale authorized only.
Created 06-25-2019 03:33 PM
Those logs are irrelevant; they are not related to cluster creation (the autoscale component periodically checks the stacks in Cloudbreak).
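To get past the heartbeat and autoscale noise, it may help to keep only the lines tagged with your stack name. The following is a rough sketch, assuming every relevant entry carries a `[name:...]` field as in the lines posted in this thread; the function name `lines_for_stack` is hypothetical and not part of Cloudbreak.

```python
import re

# Matches the [name:...] field seen in the Cloudbreak log lines in this thread.
NAME_FIELD = re.compile(r"\[name:([^\]]*)\]")


def lines_for_stack(lines, stack_name):
    """Yield only the log lines whose [name:...] field equals stack_name."""
    for line in lines:
        match = NAME_FIELD.search(line)
        if match and match.group(1) == stack_name:
            yield line
```

Passing an open log file and `"dv-data-ingestion-hdf-dev"` (the stack name from the first post) would leave only the entries that belong to that stack, such as the polling attempts and the nginx check.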
Created 06-25-2019 03:35 PM
Anyway, you need to wait until cluster creation finishes (or throws an error in case of failure). Cluster creation time can vary: it depends on the speed of communication between Cloudbreak and the cluster, on the instance details (CPU, memory), and so on.
Created 06-25-2019 03:56 PM
The status of the cluster creation has not changed, and it has been like that for over 2 days. I think the issue is the proxy setup, but I am not sure where the problem could be, since Cloudbreak installed on the VM without trouble and was able to contact AWS to create the servers.
Created 06-26-2019 07:19 AM
I assume this issue is the same as this one: https://community.hortonworks.com/questions/248245/cloudbreak-stuck-in-creating-aws-hdf-cluster.html
I suggest handling it in that question rather than continuing the conversation in several places.