Hortonworks Reference Architecture on AWS private subnet

Hi,

I have to deploy a multi-node cluster on AWS. Is there a reference architecture available for deploying a Hortonworks HDP cluster on AWS? I can find very detailed ones for Cloudera, but I could not find a deployment architecture for a Hortonworks cluster. Please help.

1 ACCEPTED SOLUTION

@Farrukh Munir

Hortonworks offers Hortonworks Data Cloud and Cloudbreak for such scenarios in AWS.

If you would like to use HDC, you can find a reference architecture for AWS here, and one for the Data Lake concept here.

Hope this helps!

8 REPLIES

I am using AWS as infrastructure as a service. Can I configure everything myself, using my own private subnet and EC2 instances? Is there a template? Which versions of HDP are available?

@Farrukh Munir What is your use case?

Both Cloudbreak and HDC exist precisely to reduce deployment complexity, so I recommend going with one of them unless you have a very specific reason not to. Cloudbreak offers deeper customization, while HDC is easier to set up.

Both are deployed in an IaaS model, and you can reuse your private subnet as well; you can check the available setups here. Both use CloudFormation templates to bring up the stacks. In Cloudbreak the HDP version is selectable (2.4, 2.5, or 2.6).

If you are happy with a more prescriptive approach, you can check this.

Hope this helps!

I have a very specific requirement. For security reasons I cannot use any pre-loaded or pre-built image. I want to install on fresh EC2 instances with our own tailored security group. What is the best practice for laying out the management nodes in a 10-node cluster? I want to put the whole cluster in a private subnet and allow users to access it only through edge nodes.

@Farrukh Munir

All the requirements you specified (custom image, existing VPC, private subnet) can be fulfilled with Cloudbreak, so as far as I know that is the best practice Hortonworks supports. I am not aware of a documented, fully manual setup.
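To make the requested layout more concrete (cluster in a private subnet, users entering only through edge nodes), here is a rough sketch of the ingress rules it implies. It uses plain Python dictionaries shaped like the `IpPermissions` entries that AWS security group APIs accept, and it makes no AWS calls. Every CIDR range, group ID, and port below is a hypothetical placeholder, not a value from this thread or a Hortonworks recommendation.

```python
import ipaddress

# Hypothetical values for illustration only; none of them come from the thread.
VPC_CIDR = "10.0.0.0/16"
PRIVATE_SUBNET_CIDR = "10.0.1.0/24"  # subnet holding the cluster nodes
USER_NETWORK_CIDR = "10.1.0.0/16"    # assumed internal network users connect from
EDGE_SG = "sg-edge-example"          # placeholder security group IDs
CLUSTER_SG = "sg-cluster-example"


def edge_ingress():
    """SSH to the edge nodes, allowed only from the users' network."""
    return [{
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": USER_NETWORK_CIDR}],
    }]


def cluster_ingress():
    """Cluster nodes accept traffic only from the edge group and from each other."""
    return [{
        "IpProtocol": "-1",  # all protocols
        "UserIdGroupPairs": [{"GroupId": EDGE_SG}, {"GroupId": CLUSTER_SG}],
    }]


def open_to_internet(rules):
    """Return True if any rule allows traffic from 0.0.0.0/0."""
    return any(
        ip_range.get("CidrIp") == "0.0.0.0/0"
        for rule in rules
        for ip_range in rule.get("IpRanges", [])
    )


def subnet_inside_vpc(subnet_cidr, vpc_cidr):
    """Return True if the subnet range lies entirely within the VPC range."""
    subnet = ipaddress.ip_network(subnet_cidr)
    return subnet.subnet_of(ipaddress.ip_network(vpc_cidr))


if __name__ == "__main__":
    rules = edge_ingress() + cluster_ingress()
    print("open to internet:", open_to_internet(rules))
    print("subnet inside VPC:", subnet_inside_vpc(PRIVATE_SUBNET_CIDR, VPC_CIDR))
```

The point of the sketch is that the cluster group references other security groups rather than IP ranges, so only the edge nodes and the cluster members themselves can reach it.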

Thanks for your help. I have a question about the resources we create in Cloudbreak, such as:

  • templates
  • networks
  • security groups

As I understand it, when you create one of the above resources, Cloudbreak does not make any requests to AWS. Resources are only created on AWS after the create cluster button has been pushed. These templates are saved to Cloudbreak's database and can be reused with multiple clusters to describe the infrastructure. Where is this database? Is it hosted remotely, or on the same machine where Cloudbreak is installed?

@Farrukh Munir

All the resources created in Cloudbreak are saved to a Postgres database called cbdb, running in a Docker container called "cbreak_commondb_1". You can check the details of the container by running the following under /var/lib/cloudbreak-deployment:

cbd ps

You can connect to the db via port 5432. The sensitive data is encrypted.
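One way to look inside the database is to run `psql` within the container itself. The following is a minimal sketch, assuming the container image includes the `psql` client and that the database superuser is named `postgres`; neither detail is confirmed in this thread, so adjust to your deployment. The helper names are made up for illustration.

```python
import shlex
import subprocess

CONTAINER = "cbreak_commondb_1"  # container name given in the answer
DATABASE = "cbdb"                # database name given in the answer
DB_USER = "postgres"             # assumption: default Postgres superuser


def psql_command(sql):
    """Build the argv for running one psql command inside the container."""
    return ["docker", "exec", CONTAINER,
            "psql", "-U", DB_USER, "-d", DATABASE, "-c", sql]


def run_query(sql):
    """Run the command on the Cloudbreak host and return psql's text output."""
    result = subprocess.run(psql_command(sql),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(psql_command(r"\dt")))
```

Calling `run_query(r"\dt")` on the Cloudbreak host would list the tables, provided the assumptions above hold.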

Hope this helps!

@Farrukh Munir If you think your original question was answered, would you consider accepting it? Thank you!