Cloudera-Director bootstrap stuck waiting for Cloudera Manager instance to reboot

New Contributor

I'm trying to launch a cluster using Cloudera Director.

 

First, I set up a VPC on AWS following [Setting up the AWS Environment](http://www.cloudera.com/content/cloudera/en/documentation/cloudera-director/latest/topics/director_a...

 

Here was my IAM policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Stmt1430423117980",
          "Action": [
            "ec2:CreateTags",
            "ec2:DescribeAvailabilityZones",
            "ec2:DescribeImages",
            "ec2:DescribeInstanceStatus",
            "ec2:DescribeInstances",
            "ec2:DescribeKeyPairs",
            "ec2:DescribePlacementGroups",
            "ec2:DescribeRegions",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSubnets",
            "ec2:RunInstances",
            "ec2:TerminateInstances"
          ],
          "Effect": "Allow",
          "Resource": "arn:aws:ec2:us-east-1:334107771315:*"
        }
      ]
    }
    

 

 

Next I set up the instance for Director following  [Starting an Instance](http://www.cloudera.com/content/cloudera/en/documentation/cloudera-director/latest/topics/director_d...

 

I skipped the SOCKS proxy setup, since this is only for testing.

 

I connected to my instance and did the following to install the Oracle JDK:

    # Become root, then check for and remove any preinstalled OpenJDK packages
    sudo su - root
    yum list installed | grep jdk
    yum remove java-1.6.0-openjdk.x86_64
    yum remove java-1.7.0-openjdk.x86_64

    # Download the Oracle JDK 8u45 RPM (the cookie header accepts the license), install it, and verify
    wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u45-b14/jdk-8u45-linux-x64.rpm"
    sudo rpm -ivh jdk-8u45-linux-x64.rpm
    java -version

 

Next, I installed Cloudera Director client following [Installing Cloudera Director client](http://www.cloudera.com/content/cloudera/en/documentation/cloudera-director/latest/topics/director_i...

 

At this point, I thought I was ready to run the director client, but I realized I needed a user, access key, and secret access key to put in the .conf file, so...

 

I created a user ("ben"), access key and secret access key (with a password) following [Configuring an Environment and Deploying a Cluster](http://www.cloudera.com/content/cloudera/en/documentation/cloudera-director/latest/topics/director_u...

 

I also created a policy, pasted in the IAM policy JSON created earlier (so that's why we needed that!), and attached it to user "ben".

 

Now, I should be ready to run Director client....

 

    $ cloudera-director bootstrap my.aws.simple.conf
    Process logs can be found at /home/ec2-user/.cloudera-director/logs/application.log
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256M; support was removed in 8.0
    Cloudera Director 1.1.2 initializing ...
    Installing Cloudera Manager ...
    * Starting .... done
    * Requesting an instance for Cloudera Manager .............................. done
    * Running custom bootstrap script on 10.0.0.236 ....... done
    * Waiting for SSH access to 10.0.0.236 on port 22 ......... done
    * Inspecting capabilities of 10.0.0.236 ..................... done
    * Normalizing 10.0.0.236 ...... done
    * Installing ntp (1/2) .... done
    * Installing curl (2/2) ..................... done
    * Mounting all instance disk drives .......... done
    * Resizing instance root partition ......... done
    * Rebooting 10.0.0.236 ..... done
    * Waiting for 10.0.0.236 to boot .....

 

Over on AWS Console, the new instance that was created shows "1/2 checks passed".  Under "Instance Status Checks", it shows Instance reachability check failed.  I cannot ssh to the instance.  If I use the console to reboot it, it's the same result.  If I stop and then start it, the same result.  The System Log shows no issues and ends with:

 

    Restarting system.
    machine restart
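
For anyone who would rather check this from a script than from the console, here is a minimal sketch of how the "1/2 checks passed" display could be read out of an EC2 `DescribeInstanceStatus` response. It assumes the two checks the console counts are the `SystemStatus` and `InstanceStatus` entries of that response. The sample dictionary is hand-written to mirror the situation above (instance reachability failed), and the instance ID is hypothetical. In real use the entry would come from something like boto3's `describe_instance_status`, which is not called here.

```python
def summarize_status(entry):
    """Return (passed, total, failures) for one InstanceStatuses entry.

    failures is a list of (check_group, [detail_names]) for groups that are not "ok".
    """
    passed = 0
    failures = []
    for group in ("SystemStatus", "InstanceStatus"):
        check = entry[group]
        if check["Status"] == "ok":
            passed += 1
        else:
            # Collect the names of the detailed checks that did not pass.
            names = [d["Name"] for d in check.get("Details", [])
                     if d["Status"] != "passed"]
            failures.append((group, names))
    return passed, 2, failures


# Hand-written sample shaped like one entry of a DescribeInstanceStatus response.
sample = {
    "InstanceId": "i-00000000",  # hypothetical ID, for illustration only
    "SystemStatus": {
        "Status": "ok",
        "Details": [{"Name": "reachability", "Status": "passed"}],
    },
    "InstanceStatus": {
        "Status": "impaired",
        "Details": [{"Name": "reachability", "Status": "failed"}],
    },
}

passed, total, failures = summarize_status(sample)
print("%d/%d checks passed" % (passed, total))  # prints "1/2 checks passed"
print(failures)
```

A failed instance reachability check with a healthy system check usually points at something inside the guest OS rather than at the AWS host, which fits an instance that launches fine but never comes back after a reboot.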

 

I'm using instance types m3.xlarge on us-east-1, with ami-aed06ac6, which is the RHEL 6 ami from Amazon marketplace (RHEL 6.6). 

 

At this point, I'm not sure how to proceed.  I've stopped cloudera-director client several times, terminated the instance that was created, and started over "from scratch" but I always end up in the same state.  I don't understand how it is able to create an instance without any problem, but then it won't even reboot successfully.  My bootstrap script is the default aws.simple.conf "hello world" so it's not doing anything there to cause an issue.

 

Hopefully someone has seen something like this before and can help.

 

Thanks,

Ben

 

 

 

1 ACCEPTED SOLUTION

Expert Contributor

Ben,

 

Can you give this a try with a RHEL 6.5 ami? RHEL 6.6 isn't supported by Director 1.1.2. 

http://www.cloudera.com/content/cloudera/en/documentation/cloudera-director/latest/topics/director_d...

 

The instance normalization process for RHEL 6.6 is different than normalization for 6.5. If you require 6.6 you may use a custom normalization script.

 

We are working on support for RHEL 6.6 in a future release.

 

David
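
Since the root cause here is the OS minor version, a quick pre-flight check on the image can save a bootstrap attempt. The following is only a sketch: it parses a release string in the format found in `/etc/redhat-release` and compares it against the single version this thread confirms as working with Director 1.1.2 (6.5). The `KNOWN_GOOD_RELEASES` set and the function names are illustrative assumptions, not an official support matrix, so check the Director documentation for the real list.

```python
import re

# Assumption taken only from this thread: 6.5 worked with Director 1.1.2,
# 6.6 did not. This is not an official list of supported platforms.
KNOWN_GOOD_RELEASES = {(6, 5)}


def parse_release(text):
    """Extract (major, minor) from an /etc/redhat-release style string."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized release string: %r" % text)
    return int(match.group(1)), int(match.group(2))


def is_known_good(text):
    """True if the release string names a version listed in KNOWN_GOOD_RELEASES."""
    return parse_release(text) in KNOWN_GOOD_RELEASES


# Example release strings in the usual RHEL / CentOS format.
for line in ("Red Hat Enterprise Linux Server release 6.6 (Santiago)",
             "CentOS release 6.5 (Final)"):
    print(line, "->", "ok" if is_known_good(line) else "not confirmed in this thread")
```

On a running instance, the input string would be the contents of `/etc/redhat-release`.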

 


4 REPLIES

New Contributor
Thanks, David, I knew it must be something like that.

Do you have an AMI ID for CentOS 6.5 in the us-east-1 region? I initially wanted to use CentOS, but when I search I get so many results, none of which seem official. I've had trouble before running unofficial AMIs that may have malware on them, so I had decided to use RHEL 6. I found the AMI in the marketplace (which was just labelled RHEL 6, not specifically labelled 6.6).

Thanks,
Ben

Expert Contributor

The official CentOS 6.5 in us-east-1 is ami-8997afe0

 
Here's the marketplace page for CentOS 6.5:
You can see the amis for each region if you click "Continue", then the "Manual Launch" tab.
 
Here's the marketplace page for CentOS:
 
When searching for AMIs in the console, you can filter owner to "AWS Marketplace" to narrow your search a bit.
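
The same owner filter can be applied from a script. Below is a rough sketch that builds the request parameters for an EC2 `DescribeImages` call restricted to the `aws-marketplace` owner alias. The helper name and the name pattern are hypothetical (marketplace image names vary, so the pattern may need adjusting), and the call itself is left as a comment because it assumes boto3 and AWS credentials are available.

```python
def marketplace_image_query(name_pattern):
    """Build DescribeImages parameters limited to AWS Marketplace images."""
    return {
        # "aws-marketplace" is the owner alias for AWS Marketplace images.
        "Owners": ["aws-marketplace"],
        # Filter on the image name; "*" is a wildcard.
        "Filters": [{"Name": "name", "Values": [name_pattern]}],
    }


# Hypothetical pattern, for illustration only.
params = marketplace_image_query("CentOS*6.5*")
print(params)

# With boto3 installed and credentials configured, this could be passed as:
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   images = ec2.describe_images(**params)["Images"]
```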
 
 

New Contributor

This worked for me.  I ended up using RHEL 6.5, because it already has the ec2-user account created and has more than a minimal set of packages (I'm sure CentOS would work as well with some additional setup).  I was able to launch my cluster successfully.

 

Thank you!