
Cloudera agent fails to start on machines because the BIND server is not up



Sometimes, when all VMs in a Cloudera-managed cluster are rebooted, or when a successful installation is restored from a snapshot on the VMs, the Cloudera agent fails to start.

I analyzed the cloudera-scm-agent.out log and found that the Cloudera agent was down on some of the machines because the boot process does not strictly order the start of BIND and the Cloudera agent. The agent requires the local BIND server to be running (it queries for the machine's hostname, and the DNS server configured in resolv.conf is 127.0.0.1), so it failed to start on the machines where BIND was not yet up. Is there a way to configure the Cloudera agent to retry a specific number of times, with a timeout, before failing?
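
To make the desired behavior concrete, here is a minimal sketch of a workaround outside the agent itself: a pre-start check that retries hostname resolution a fixed number of times before giving up. This is not a Cloudera agent setting. The function name `wait_for_dns` and the parameters `retries` and `delay` are hypothetical, chosen only for illustration.

```python
import socket
import time


def wait_for_dns(hostname, retries=10, delay=3.0,
                 resolve=socket.gethostbyname, sleep=time.sleep):
    """Return True once `hostname` resolves, False after `retries` failed attempts.

    Hypothetical helper: `resolve` and `sleep` are injectable so the loop
    can be exercised without a real DNS server.
    """
    for attempt in range(1, retries + 1):
        try:
            resolve(hostname)
            return True
        except OSError:
            # socket.gaierror and socket.herror are subclasses of OSError.
            if attempt < retries:
                sleep(delay)
    return False


# Example call (assumption: run before the agent is started):
#   ok = wait_for_dns(socket.gethostname())
```

A wrapper script could call `wait_for_dns(socket.gethostname())` and start the agent only when it returns True, so the agent is not launched while the local BIND server at 127.0.0.1 is still coming up.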
