I've been testing Cloudera Director as we plan the migration of a large on-prem cluster to AWS. I have some questions about how the "clone cluster" functionality handles a cluster that has already been configured.
What I did was:
- Start a cluster ("cluster1") with 4 hosts: 2 HDFS "master" nodes and 2 "worker" nodes (plus a DataNode role on one master, so that triple replication is possible). One master ran the NameNode, the other the SecondaryNameNode.
- Convert HDFS to High Availability. I checked, and this updated the instance groups in Cloudera Director so that the two groups (master1 and master2) each had a NameNode role, which is correct.
- Set a couple of parameters in YARN (e.g. "mapreduce.job.reduce.slowstart.completedmaps", just for testing)
- Terminate cluster1
- Clone cluster1 to create a new cluster, "cluster2"
Upon spinup though, cluster2 died with the following errors:
- HDFS-1: There is more than one NameNode and none are configured with a nameservice
- HDFS-1: HDFS service not configured for High Availability must have a SecondaryNameNode
It seems the clone cluster functionality copied the roles in each instance group but none of the configuration from the first cluster, so HDFS came up with two NameNodes and no HA settings.
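To illustrate the mismatch, here is a minimal sketch of the kind of check that would produce both errors at once. It only models the logic implied by the two messages; it is not Cloudera Manager's actual validation code, and the `HdfsLayout` class, its field names, and the "nameservice1" value are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HdfsLayout:
    """Hypothetical summary of an HDFS service: role counts plus one HA setting."""
    namenodes: int
    secondary_namenodes: int
    nameservice: Optional[str] = None  # None means no NameNode has a nameservice set


def validate(layout: HdfsLayout) -> List[str]:
    """Return error messages modelled on the two errors quoted above."""
    errors = []
    ha_configured = layout.nameservice is not None
    if layout.namenodes > 1 and not ha_configured:
        errors.append(
            "There is more than one NameNode and none are configured with a nameservice"
        )
    if not ha_configured and layout.secondary_namenodes == 0:
        errors.append(
            "HDFS service not configured for High Availability must have a SecondaryNameNode"
        )
    return errors


# cluster1 after the HA conversion: two NameNodes sharing a nameservice (assumed name).
cluster1 = HdfsLayout(namenodes=2, secondary_namenodes=0, nameservice="nameservice1")
# cluster2 as cloned: same roles, but the nameservice setting was not carried over.
cluster2 = HdfsLayout(namenodes=2, secondary_namenodes=0)

print(validate(cluster1))  # no errors for the HA-configured layout
print(validate(cluster2))  # both errors for the cloned layout
```

In this model the roles of cluster2 are identical to those of cluster1; only the missing HA setting makes the layout invalid, which matches what I saw.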
My hope was that the clone cluster functionality would be a fast way to spin up a pre-configured cluster for new teams, testing, in a new region, or on a new version, but if the configuration is lost, that doesn't work well.
Is there a better Cloudera Director feature for what I am trying to do? I'm happy to experiment with the API if it is more powerful.
What version of Cloudera Manager are you on? Director should pick up both service and role configurations as long as you are using Cloudera Manager 5.12 or higher. Also note that Director syncs with Cloudera Manager only every 5 minutes, so make sure at least that much time passed after your config changes for Director to have detected them.
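The two conditions above can be written down as a small pre-clone checklist. This is only a sketch: it does not call any Director or Cloudera Manager API, the function names are hypothetical, the timestamp is an arbitrary example, and treating one full 5-minute interval as sufficient is an assumption.

```python
from datetime import datetime, timedelta

MIN_CM_VERSION = (5, 12)              # minimum Cloudera Manager version named above
SYNC_INTERVAL = timedelta(minutes=5)  # Director-to-CM sync interval named above


def parse_version(version: str) -> tuple:
    """Turn a version string such as '5.13.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def configs_should_be_cloned(cm_version: str, last_change: datetime, now: datetime) -> bool:
    """True if CM is new enough and a full sync interval has passed since the last change."""
    new_enough = parse_version(cm_version)[:2] >= MIN_CM_VERSION
    synced = now - last_change >= SYNC_INTERVAL
    return new_enough and synced


changed_at = datetime(2018, 1, 1, 12, 0)  # arbitrary example time of the last config change

# CM 5.13.0, but only 2 minutes since the change: too early to rely on the sync.
print(configs_should_be_cloned("5.13.0", changed_at, changed_at + timedelta(minutes=2)))
# CM 5.13.0 and 6 minutes since the change: both conditions hold.
print(configs_should_be_cloned("5.13.0", changed_at, changed_at + timedelta(minutes=6)))
# CM 5.11.2: below the minimum version regardless of how long you wait.
print(configs_should_be_cloned("5.11.2", changed_at, changed_at + timedelta(minutes=6)))
```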
I was running CM 5.13.0. I may not have waited the 5 minutes for the config changes to sync though. I can try this later today and see.
To follow up -- from reading more documentation, it seems like the config-file based cluster spinup is a much better solution for us, so I've been using that instead of the web configuration view.