Using Amazon EC2 with Cloudbreak and Docker

I am asking these questions to get an overview:

1 Q) When using m4.4xlarge instances on Amazon EC2 with Docker/Cloudbreak and deploying a max Hadoop HA blueprint, does each master/slave section defined in the blueprint take its own m4.4xlarge instance (so quite a few instances in total), or is everything deployed on a single m4.4xlarge instance with its resources split between the sections?

2 Q) Using Docker, I believe we can split off the resources a container may use from the total resources of the underlying Linux OS and then do the necessary installation, right? (Yes or no)

3 Q) On Amazon EC2, when doing an HDP deployment using Cloudbreak:

--- Step 1: I believe we first have to install Cloudbreak on a small EC2 instance type (yes/no)

--- Step 2: and then, in the Cloudbreak GUI screens, choose the appropriate instance types for the master/slave sections of the blueprint (yes/no)

4 Q) Is there a complete step-by-step guide or tutorial that anybody can suggest, which I can read and follow to do an HDP installation using Docker/Cloudbreak on Amazon EC2?

5 Q) In general, apart from the hourly cost of the instance types, what other typical costs add up when deploying an HDP cluster on Amazon EC2? Just to get an idea, is there a cost spreadsheet available anywhere for deploying a minimum cluster on Amazon EC2?

1 ACCEPTED SOLUTION

First of all, the last release that used Docker on public clouds (AWS, GCP and Azure) was 1.2.3. Versions 1.3.0 and newer do not use Docker to run the Hadoop services. For 1.2.3, the answers are:

1. Containers were started with net=host, so there was one container per VM; Docker was used mostly for packaging and distribution. Every node ran exactly one container, so you needed as many VMs as there were nodes in the cluster.

2. You can, but the container got the full resources of the VM (see #1).

3. You need to install the Cloudbreak application somewhere; that can be an EC2 instance, for example, or an on-prem machine. The Cloudbreak application - note that it is not the cluster - is composed of several microservices, which run inside containers. You can use the GUI, the CLI or the API. Every hostgroup can have a different instance type, so the cluster can be heterogeneous.

4. http://sequenceiq.com/cloudbreak-docs/

5. It depends on the number of nodes you'd like to provision. There are no additional costs on top of the EC2 price, so you can do fairly easy math: multiply the number of nodes you think your cluster will have by the number of hours ... In Cloudbreak you can fully track usage costs on the Accounts tab.
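
To make the arithmetic in #1, #3 and #5 concrete, here is a minimal Python sketch, not anything Cloudbreak itself provides. It assumes one VM per node, a possibly different instance type per hostgroup, and a cost made up of instance-hours only. The `HostGroup` class, the hostgroup names, the node counts and the hourly prices are all hypothetical placeholders; substitute the current EC2 prices for your region.

```python
from dataclasses import dataclass


@dataclass
class HostGroup:
    """One hostgroup of a blueprint (hypothetical helper, not a Cloudbreak type)."""
    name: str
    instance_type: str
    node_count: int
    hourly_price_usd: float  # placeholder value, NOT a real AWS price


def total_nodes(groups):
    # One container per VM, so the VM count equals the node count.
    return sum(g.node_count for g in groups)


def estimate_cost(groups, hours):
    # Instance-hours only: nodes x hourly price x hours, summed over hostgroups.
    return sum(g.node_count * g.hourly_price_usd * hours for g in groups)


# Hypothetical heterogeneous cluster: larger masters, smaller workers.
cluster = [
    HostGroup("master", "m4.4xlarge", node_count=3, hourly_price_usd=1.00),
    HostGroup("worker", "m4.xlarge", node_count=5, hourly_price_usd=0.25),
]

print(total_nodes(cluster))              # 8 VMs in total
print(estimate_cost(cluster, hours=24))  # 102.0 with the placeholder prices
```

With these placeholder numbers the cluster needs 8 VMs and costs 102.0 (in the placeholder currency units) for 24 hours; the real figure depends entirely on the actual prices and node counts you plug in.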
