Reply
Highlighted
New Contributor
Posts: 5
Registered: ‎09-09-2015

Intermiittent failure while standing up cluster (Requesting n instance(s) in n group(s) ...)

[ Edited ]

Hi

 

I'm using Cloudera Director client to stand up the cluster and intermittently the process is hanging after the manager has been created and the new isnstances have been requested.

 

The last log message is 

 

Requesting 9 instance(s) in 5 group(s) ...

 

When I go to AWS management console I see that the instances have been requested by that a number of them apparently haven't been named. I'm not sure if this stops CD from being able to connect to them, but as far as AWS is concerned they're fully up.

 

Is this a known issue with a workaround by any chance?

 

Thanks
Owen

Cloudera Employee
Posts: 26
Registered: ‎09-21-2015

Re: Intermiittent failure while standing up cluster (Requesting n instance(s) in n group(s) ...)

Hello,

 

Amazon had an outage on Sept 20th in the N. Virginia (us-east) region. Could you confirm if your instances were within this region? If yes, does it work smoothly again when you retry it?

 

If you continue to see failures, here are a couple things to check:

a) When this problem happens, in the AWS console are all the instances marked 'running', and are all the status checks passing?

b) Since you are using a configuration file with the Director client, could you try running this command to validate the file? "cloudera-director validate <conf file>"

 

If the above check passes and you still hit these failures, could you provide the following information to us (after removing any credentials or other private data from the files)

  1. What version of Director server and client are you running? (If you are on redhat, on the instance you can check with "rpm -qa | grep cloudera" which will list the server and client packages)
  2. Director server log (/var/log/cloudera-director-server/application.log)
  3. Director client log (In your home directory, ./cloudera-director/logs/application.log)
  4. A copy of the configuration file you used to bootstrap the cluster with (after removing all credentials from it)

Thanks,

Jayita

 

New Contributor
Posts: 5
Registered: ‎09-09-2015

Re: Intermiittent failure while standing up cluster (Requesting n instance(s) in n group(s) ...)

Thanks Jayita, I'll send all that in tomorrow.

We're in eu-west-1, the script works most of the time, its just a hand ful of times when its failed. the instances show as running, they just don't have names assigned.

Will dig out logs and post.

THanks
Owen