When does Cloudbreak communicate with public IPs? I assume that at some point it fetches from the public repos. For example, a security team may not want the process to communicate with any public IPs. What are the workarounds for handling this scenario in Cloudbreak?
The next release will include a feature that lets you deploy nodes into an existing subnet that cannot assign public IPs to the machines. The machines can then reach the internet through a NAT gateway, and you can log in to them over a VPN connection.
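To check that the nodes in such a subnet really have no public addresses, something like the following could be used. It is only a rough sketch, not part of Cloudbreak: the helper names and the sample addresses are made up for illustration, and it only looks at the RFC 1918 private IPv4 ranges.

```python
import ipaddress

# RFC 1918 private IPv4 ranges. Anything outside them is treated as public here.
PRIVATE_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address: str) -> bool:
    """Return True if the IPv4 address lies in one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_NETWORKS)


def public_addresses(addresses):
    """Return the addresses that are NOT in a private range."""
    return [a for a in addresses if not is_rfc1918(a)]


if __name__ == "__main__":
    # Hypothetical node addresses, for illustration only.
    nodes = ["10.0.1.15", "172.16.4.2", "52.10.20.30"]
    print(public_addresses(nodes))
```

Feeding it the addresses reported for the cluster nodes gives a quick list of anything a security team would flag.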
What do you mean by "which public IPs does cloudbreak hits"?
- Cloudbreak <-> internet?
- Cloudbreak <-> cluster?
- Installed cluster <-> internet?
So the question is: is it possible to create a cluster with Cloudbreak without an internet connection? Am I right?
In short, it isn't possible. The longer answer is that there would be too many limitations, and you would have to prepare your local repos and other resources carefully.
On the Cloudbreak side, there are two things that require an internet connection:
- SSSD configuration
- Public recipes
So if you skip them, Cloudbreak should work as well. Ambari handles the rest. In Cloudbreak you can configure the HDP repository, so if you create a large local repo that contains everything Ambari needs, it should work. For more details, please ask the Ambari team, because Ambari installs many things at runtime, for example updates and patches.
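As a rough illustration of the local repo idea, the sketch below renders a yum-style `.repo` definition that points at an internal mirror. It assumes yum-based nodes. The repo id, the display name, and the mirror URL `http://repo.internal.example/hdp` are hypothetical placeholders, and how the base URL is passed to Cloudbreak or Ambari is not shown here.

```python
import configparser
import io


def render_repo_file(repo_id: str, name: str, baseurl: str) -> str:
    """Render a minimal yum .repo definition for a local mirror."""
    # interpolation=None so that a '%' in a URL is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser[repo_id] = {
        "name": name,
        "baseurl": baseurl,
        "enabled": "1",
        # Sketch only: in practice, mirror the GPG key and turn checking on.
        "gpgcheck": "0",
    }
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical internal mirror, for illustration only.
    print(render_repo_file(
        "hdp-local",
        "HDP local mirror",
        "http://repo.internal.example/hdp",
    ))
```

The mirror would have to be reachable from every node that Cloudbreak launches, and it would need to contain every package Ambari pulls at runtime.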
@rkovacs Where would the repos be loaded in advance? The only static node in Cloudbreak is the deployer node. How would the instances launched by Cloudbreak use repos that exist on the deployer node?