Cloudbreak 2.7, deployment to Azure with no public facing IPs

Contributor

Hi,

Yesterday I upgraded CBD from 2.6 to 2.7, and I'm now running into a lot of issues.
That's odd, given that 2.7 is GA and 2.6 is not.

I'm trying to deploy my HDP cluster to a subnet that uses Azure AD Domain Services DNS.
I also disabled public IPs for the deployment.

This is what I get in the logs when deploying the default blueprint with default options onto my private subnet:

cloudbreak_1   | 2018-06-27 14:34:12,803 [containerBootstrapBuilderExecutor-18] doCall:85 INFO  c.s.c.o.OrchestratorBootstrapRunner - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] Calling orchestrator bootstrap: Salt, additional info: SaltBootstrap{sc=com.sequenceiq.cloudbreak.orchestrator.salt.client.SaltConnector@3172c1ff, allGatewayConfigs=[GatewayConfig{connectionAddress='10.251.3.69', publicAddress='10.251.3.69', privateAddress='10.251.3.69', hostname='null', gatewayPort=9443, knoxGatewayEnabled=true, primary=true}], originalTargets=[Node{privateIp='10.251.3.69', publicIp='10.251.3.69', hostname='', domain='null', hostGroup='master', dataVolumes=null}, Node{privateIp='10.251.3.68', publicIp='10.251.3.68', hostname='', domain='null', hostGroup='worker', dataVolumes=null}], targets=[Node{privateIp='10.251.3.69', publicIp='10.251.3.69', hostname='', domain='null', hostGroup='master', dataVolumes=null}, Node{privateIp='10.251.3.68', publicIp='10.251.3.68', hostname='', domain='null', hostGroup='worker', dataVolumes=null}]}
cloudbreak_1   | 2018-06-27 14:34:12,805 [containerBootstrapBuilderExecutor-18] call:55 INFO  c.s.c.o.s.p.SaltBootstrap - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] Bootstrapping of nodes [0/2]
cloudbreak_1   | 2018-06-27 14:34:12,806 [containerBootstrapBuilderExecutor-18] call:57 INFO  c.s.c.o.s.p.SaltBootstrap - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] Missing targets for SaltBootstrap: [Node{privateIp='10.251.3.69', publicIp='10.251.3.69', hostname='', domain='null', hostGroup='master', dataVolumes=null}, Node{privateIp='10.251.3.68', publicIp='10.251.3.68', hostname='', domain='null', hostGroup='worker', dataVolumes=null}]
cloudbreak_1   | 2018-06-27 14:34:12,827 [containerBootstrapBuilderExecutor-18] lambda$hostnameVerifier$0:28 INFO  c.s.c.c.CertificateTrustManager - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] verify hostname: 10.251.3.69
cloudbreak_1   | 2018-06-27 14:34:12,849 [containerBootstrapBuilderExecutor-18] action:119 INFO  c.s.c.o.s.c.SaltConnector - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] SaltBoot. SaltAction response: SaltBootResponses{responses=[SaltBootResponse{status='', address='10.251.3.69', statusCode=500, version='null', errorText='it is expected to have a default domain, but it is empty'}, SaltBootResponse{status='', address='10.251.3.68', statusCode=500, version='null', errorText='it is expected to have a default domain, but it is empty'}, SaltBootResponse{status='', address='10.251.3.69', statusCode=500, version='null', errorText='it is expected to have a default domain, but it is empty'}]}
cloudbreak_1   | 2018-06-27 14:34:12,851 [containerBootstrapBuilderExecutor-18] call:64 INFO  c.s.c.o.s.p.SaltBootstrap - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] SaltBootstrap responses: SaltBootResponses{responses=[SaltBootResponse{status='', address='10.251.3.69', statusCode=500, version='null', errorText='it is expected to have a default domain, but it is empty'}, SaltBootResponse{status='', address='10.251.3.68', statusCode=500, version='null', errorText='it is expected to have a default domain, but it is empty'}, SaltBootResponse{status='', address='10.251.3.69', statusCode=500, version='null', errorText='it is expected to have a default domain, but it is empty'}]}
cloudbreak_1   | 2018-06-27 14:34:12,852 [containerBootstrapBuilderExecutor-18] call:67 INFO  c.s.c.o.s.p.SaltBootstrap - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] Failed to distributed salt run to: 10.251.3.69
cloudbreak_1   | 2018-06-27 14:34:12,853 [containerBootstrapBuilderExecutor-18] call:67 INFO  c.s.c.o.s.p.SaltBootstrap - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] Failed to distributed salt run to: 10.251.3.68
cloudbreak_1   | 2018-06-27 14:34:12,853 [containerBootstrapBuilderExecutor-18] call:67 INFO  c.s.c.o.s.p.SaltBootstrap - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] Failed to distributed salt run to: 10.251.3.69
cloudbreak_1   | 2018-06-27 14:34:12,854 [containerBootstrapBuilderExecutor-18] call:75 INFO  c.s.c.o.s.p.SaltBootstrap - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] Missing nodes to run saltbootstrap: [Node{privateIp='10.251.3.69', publicIp='10.251.3.69', hostname='', domain='null', hostGroup='master', dataVolumes=null}, Node{privateIp='10.251.3.68', publicIp='10.251.3.68', hostname='', domain='null', hostGroup='worker', dataVolumes=null}]
cloudbreak_1   | 2018-06-27 14:34:12,855 [containerBootstrapBuilderExecutor-18] doCall:111 WARN  c.s.c.o.OrchestratorBootstrapRunner - [owner:15adc8a5-f35f-4e42-b1da-2567ccad0c59] [type:STACK] [id:13] [name:dsbeta01] [flow:9c9705c6-d536-4fb7-8bf1-f1075b5d8ff0] [tracking:] Orchestrator component Salt failed to start, retrying [60/90], error count [60/90]. Elapsed time: 52 ms, Total elapsed time: 593529 ms, Reason: com.sequenceiq.cloudbreak.orchestrator.exception.CloudbreakOrchestratorFailedException: There are missing nodes from saltbootstrap: [Node{privateIp='10.251.3.69', publicIp='10.251.3.69', hostname='', domain='null', hostGroup='master', dataVolumes=null}, Node{privateIp='10.251.3.68', publicIp='10.251.3.68', hostname='', domain='null', hostGroup='worker', dataVolumes=null}], additional info: SaltBootstrap{sc=com.sequenceiq.cloudbreak.orchestrator.salt.client.SaltConnector@3172c1ff, allGatewayConfigs=[GatewayConfig{connectionAddress='10.251.3.69', publicAddress='10.251.3.69', privateAddress='10.251.3.69', hostname='null', gatewayPort=9443, knoxGatewayEnabled=true, primary=true}], originalTargets=[Node{privateIp='10.251.3.69', publicIp='10.251.3.69', hostname='', domain='null', hostGroup='master', dataVolumes=null}, Node{privateIp='10.251.3.68', publicIp='10.251.3.68', hostname='', domain='null', hostGroup='worker', dataVolumes=null}], targets=[Node{privateIp='10.251.3.69', publicIp='10.251.3.69', hostname='', domain='null', hostGroup='master', dataVolumes=null}, Node{privateIp='10.251.3.68', publicIp='10.251.3.68', hostname='', domain='null', hostGroup='worker', dataVolumes=null}]}
cloudbreak_1   | 2018-06-27 14:34:12,867 [http-nio-8080-exec-2] getAllForAutoscale:170 INFO  c.s.c.s.StackCommonService - [owner:undefined] [type:StackV1] [id:] [name:] [flow:] [tracking:] Get all stack, autoscale authorized only.

Contributor

I can confirm that this is only an issue when my VNET uses custom DNS (such as the DNS provided by AADDS).
Cloudbreak 2.6 used the unbound service, and hosts could communicate with each other using "example.com".
That no longer seems to be the case, or some configuration is missing.

This is a massive blocker for us.

Expert Contributor

Hi @Jakub Igla,

It looks like your virtual machines don't have a hostname/domain name. This can be a DHCP or (reverse) DNS issue. Could you post the output of the following commands:

hostname -d
hostname -f

Also please attach the following:

  • /var/log/saltboot.log
  • /var/log/dhcp-hook.log (if present)
  • output of: "journalctl -t dhcp-hook" (if previous is missing)
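
The items requested above can be gathered in one pass on an affected node. This is only a minimal sketch, assuming a bash shell on the VM; the function name `collect_diag` and the output directory are hypothetical, and it uses only the commands and paths already listed in this reply.

```shell
#!/usr/bin/env bash
# Hypothetical helper: collects the diagnostics requested above into one directory.
collect_diag() {
  local out="${1:-./cb-diag}"   # output directory (assumed name)
  mkdir -p "$out"

  # Domain and fully qualified name as the node sees them.
  # hostname may fail when nothing resolves; record whatever it prints.
  { echo "hostname -d: $(hostname -d 2>/dev/null)"
    echo "hostname -f: $(hostname -f 2>/dev/null)"; } > "$out/hostname.txt"

  # Copy the saltboot log if it is readable.
  [ -r /var/log/saltboot.log ] && cp /var/log/saltboot.log "$out/"

  # Prefer the dhcp-hook log file; fall back to the journal when it is missing.
  if [ -r /var/log/dhcp-hook.log ]; then
    cp /var/log/dhcp-hook.log "$out/"
  elif command -v journalctl >/dev/null 2>&1; then
    journalctl -t dhcp-hook --no-pager > "$out/dhcp-hook.journal.txt" 2>&1
  fi
  return 0
}
```

Running `collect_diag ./cb-diag` on a node would leave the files in `./cb-diag`, ready to attach to a reply.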

Contributor

Hi @mmolnar,

Thanks for getting back to me; I'm under huge pressure due to deadlines.

This time I've named my test cluster "asdtesttest" (not a descriptive name, I know).

hostname -d

Returns nothing

hostname -f

Returns: asdtesttest-m0

The files you asked for are attached: logs.zip
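
The output above is consistent with the "it is expected to have a default domain, but it is empty" error in the log: a name without a dot has no domain part. A small sketch that makes the check explicit; the function name `domain_of` is hypothetical and it only does string handling, it does not query DNS.

```shell
# Hypothetical helper: print the domain part of a name (everything after the first dot).
domain_of() {
  case "$1" in
    *.*) printf '%s\n' "${1#*.}" ;;   # strip the host label up to the first dot
    *)   printf '\n' ;;               # no dot, so no domain part
  esac
}
```

On a node, `domain_of "$(hostname -f)"` prints an empty line when the fully qualified name carries no domain, as with `asdtesttest-m0` here.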

Expert Contributor

Hi @Jakub Igla,

We had to stop setting example.com as the fallback domain on Azure, because on Azure we sometimes have to wait an unknown amount of time before the domain name becomes available.

In your case, it looks like on a private network with custom DNS the domain name never arrives.

Try setting CB_HOST_DISCOVERY_CUSTOM_DOMAIN in the Profile file in your deployment directory, then restart Cloudbreak with 'cbd restart':

export CB_HOST_DISCOVERY_CUSTOM_DOMAIN=test.com

This will set up every node in your cluster with this domain. Hopefully it bypasses the wait and gives you a functional cluster. Please let me know if this helps.
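
To avoid ending up with duplicate export lines when the value is changed later, the Profile edit can be made repeatable. A minimal sketch, assuming Profile is a plain shell file of `export NAME=value` lines; the helper name `set_profile_var` is hypothetical and not part of cbd.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set or replace an "export NAME=value" line in a Profile file.
set_profile_var() {
  local profile="$1" name="$2" value="$3"
  touch "$profile"
  # Drop any earlier export of the same variable so the file keeps a single definition.
  # grep -v exits non-zero when no lines remain, which is fine here.
  grep -v "^export ${name}=" "$profile" > "${profile}.tmp" || true
  echo "export ${name}=${value}" >> "${profile}.tmp"
  mv "${profile}.tmp" "$profile"
}

# Usage, from the Cloudbreak deployment directory (test.com is the example value above):
#   set_profile_var Profile CB_HOST_DISCOVERY_CUSTOM_DOMAIN test.com
#   cbd restart
```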

Contributor

Hi @mmolnar

I can confirm that after adding this environment variable the issue is gone. I'm on cbd 2.7.1-rc.13.

Thank you!