Cloudbreak communication with public IPs

Super Guru

When does Cloudbreak communicate with public IPs? I assume at some point it fetches the public repos. For example, a security team may not want the process to communicate with any public IPs. What are the workarounds to handle this scenario on Cloudbreak?

16 REPLIES

Expert Contributor

@Sunile Manjee What cloud are they using?

Super Guru

Expert Contributor

The next release will include a feature that lets you deploy nodes into an existing subnet that cannot assign public IPs to the machines. The machines can then reach the internet through a NAT gateway, and you can log in to them over a VPN connection.

Super Guru

@rdoktorics thanks for that info. I need to know which public IPs Cloudbreak hits, what it is pulling, and whether that can be prevented by using local repos instead.

Expert Contributor

@Sunile Manjee,

What do you mean by "which public IPs Cloudbreak hits"?

- Cloudbreak <-> internet?

- Cloudbreak <-> cluster?

- Installed cluster <-> internet?

So the question is whether it is possible to create a cluster with Cloudbreak without an internet connection. Am I right?

In short, it isn't possible. The long answer is that there would be too many limitations, and you would have to prepare your local repos and other dependencies very carefully.

Super Guru

@rkovacs basically, can Cloudbreak use a local repository instead of fetching the public ones? Do I understand correctly that this is not possible?

Super Guru

@rkovacs @rdoktorics For security reasons, I need to know all the repos Cloudbreak fetches and where it fetches them from.

Expert Contributor

@Sunile Manjee,

On the Cloudbreak side there are two things which require an internet connection:

- SSSD configuration

- Public recipes

So if you skip them, Cloudbreak should work as well. Ambari does the rest. In Cloudbreak you can configure the HDP repository, so if you create a large local repo which contains everything Ambari needs, it should work. For more details please ask the Ambari team, because Ambari installs many things at runtime, for example updates and patches.

Super Guru

@rkovacs Where would the repos be loaded in advance? The only static node in Cloudbreak is the deployer node. How would the instances launched by Cloudbreak use repos which exist on the deployer node?

Expert Contributor

@Sunile Manjee Cloudbreak writes the configured repo next to Ambari if there is, and Ambari does the install on it's own way. We don't have a repository on the deployer node (we don't have repos at all); you have to set up your own or use the public repo.

Super Guru

@rkovacs forgive me, I am not following you. I don't see any option during a cluster deployment via Cloudbreak to have Ambari use local repos. Where can I find more (and clearer) instructions on deploying a cluster via Cloudbreak and configuring the cluster to use local repos instead of fetching the public ones?

What does this mean: "Cloudbreak writes the configured repo next to Ambari if there is, and Ambari does the install on it's own way."

Expert Contributor

You should find it under the advanced options.

[Screenshot: 4013-screen-shot-2016-05-04-at-94327-pm.png]

So if you create a local repository, Ambari could use it.
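To make the "local repository" idea a bit more concrete, here is a minimal sketch of the kind of yum repo definition that could point nodes at an internal mirror. It assumes RPM-based nodes; the repo id, the name, and the `repo.internal.example` URL are hypothetical placeholders, not values taken from Cloudbreak or Ambari, and the thread does not say what file Cloudbreak actually writes.

```python
import configparser
import io


def render_yum_repo(repo_id: str, name: str, base_url: str) -> str:
    """Render a yum .repo stanza that points at a local mirror."""
    # Disable interpolation so a '%' in a URL is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names exactly as written.
    parser.optionxform = str
    parser[repo_id] = {
        "name": name,
        "baseurl": base_url,
        "enabled": "1",
        # Assumption: an unsigned internal mirror. Turn GPG checks on if yours is signed.
        "gpgcheck": "0",
    }
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Hypothetical internal mirror; replace with your own host and path.
repo_text = render_yum_repo(
    repo_id="HDP-local",
    name="HDP local mirror",
    base_url="http://repo.internal.example/hdp/centos7/",
)
print(repo_text)
```

The base URL is the part that matters here: whatever address goes into the advanced options has to be reachable from the cluster nodes over the private network.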

Super Guru

@rkovacs well, there you go! Nice. So this solves 1 of 2. The second part is: does Cloudbreak reach out to any other public IPs for any activity? Basically, if I create a VPC and harden the security group to have zero access to any public IP, will the cluster launch and be usable? I guess I can test this as well.
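One way to sanity-check a "zero access to any public IP" setup before testing is to look at the egress CIDR blocks the security group allows. The sketch below is only an illustration: it takes a plain list of CIDR strings (a hypothetical input, not the AWS API) and flags anything outside the RFC 1918 private ranges. IPv6 blocks are conservatively treated as not private.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private_block(cidr: str) -> bool:
    """Return True if the whole CIDR block lies inside an RFC 1918 range."""
    network = ipaddress.ip_network(cidr, strict=False)
    # subnet_of() raises TypeError across IP versions, so check the version first.
    return network.version == 4 and any(
        network.subnet_of(private) for private in PRIVATE_RANGES
    )


def public_egress_targets(egress_cidrs):
    """Return the egress CIDR blocks that would allow traffic to public addresses."""
    return [cidr for cidr in egress_cidrs if not is_private_block(cidr)]


# Hypothetical rule set: one VPC-internal rule and one allow-all rule.
rules = ["10.0.0.0/16", "0.0.0.0/0", "192.168.1.0/24"]
print(public_egress_targets(rules))
```

An empty result only says the rules themselves are private-only; whether the cluster actually launches under them still has to be tested.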

Expert Contributor

@Sunile Manjee There is one more thing you have to know. Cloudbreak deployer also requires an internet connection, for example:

- the first time, when it downloads dependencies and containers

- for some commands like 'cbd doctor', 'cbd upgrade' and 'cbd version'

- during an upgrade, because it downloads dependencies and containers

So to cut it off from the internet completely, you need an AMI where everything is preconfigured with a fixed version of Cloudbreak and the deployer.
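If operators script around `cbd` inside the locked-down network, a small guard like the following sketch could warn before a subcommand from the list above is run. The set contains only the subcommands named in this reply; the function name is made up for illustration, and a subcommand missing from the set is not thereby proven to be offline-safe.

```python
# Subcommands this reply names as needing an internet connection.
ONLINE_CBD_SUBCOMMANDS = {"doctor", "upgrade", "version"}


def needs_internet(cbd_args):
    """Return True if the first cbd argument is a subcommand listed as needing internet."""
    return bool(cbd_args) and cbd_args[0] in ONLINE_CBD_SUBCOMMANDS


# Example: check a planned invocation before running it on an isolated host.
planned = ["doctor"]
if needs_internet(planned):
    print("'cbd " + " ".join(planned) + "' is listed as needing internet access")
```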

Super Guru

@rkovacs that totally makes sense. However, if I set up local repos like you showed, can Cloudbreak (outside of the deployer stuff you explained above) launch a cluster without an internet connection?

Expert Contributor

Yes, Cloudbreak can install a cluster without an internet connection if you set the local repo, skip SSSD, and live with the limitations on recipes. But you still have to install Cloudbreak itself somehow, which is impossible without an internet connection. The easiest way to install and manage a Cloudbreak installation is to use Cloudbreak deployer, but some of its commands download content, like init, upgrade, version, and doctor. So first you have to install the deployer with an internet connection: for example, create an AMI which contains an already installed deployer and Cloudbreak, and then start an instance of that AMI in the secured network.
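The conditions in this answer can be written down as a small pre-flight checklist. The sketch below is just a restatement of the reply in code; the class and field names are hypothetical and do not correspond to any Cloudbreak setting.

```python
from dataclasses import dataclass


@dataclass
class OfflinePlan:
    """Hypothetical description of a planned offline deployment."""
    local_repo_configured: bool
    sssd_skipped: bool
    uses_public_recipes: bool
    deployer_preinstalled_in_image: bool


def offline_blockers(plan: OfflinePlan):
    """Return the conditions from the answer above that the plan does not meet."""
    blockers = []
    if not plan.local_repo_configured:
        blockers.append("set a local repo for Ambari")
    if not plan.sssd_skipped:
        blockers.append("skip the SSSD configuration")
    if plan.uses_public_recipes:
        blockers.append("do not use public recipes")
    if not plan.deployer_preinstalled_in_image:
        blockers.append("build an image with the deployer and Cloudbreak already installed")
    return blockers


plan = OfflinePlan(
    local_repo_configured=True,
    sssd_skipped=True,
    uses_public_recipes=True,
    deployer_preinstalled_in_image=True,
)
for blocker in offline_blockers(plan):
    print("Blocker:", blocker)
```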
