Created on 03-13-2019 10:04 AM - edited 09-16-2022 07:13 AM
I'm trying to provision a CDH 6.1.1 cluster with Cloudera Altus Director 6.1, but the First Run fails due to missing configuration values for "Default Group". Please see below for details.
The cluster.conf I'm using:
```
name: platform-pod1-test-cdh

provider {
    type: aws
    publishAccessKeys: false
    region: us-west-2
    subnetId: subnet-08e9c59222473e2dd
    securityGroupsIds: sg-0b4f2c3cd77992a5
    cluster-name: platform-pod1-test-cdh
    associatePublicIpAddresses: false
    rootVolumeSizeGB: 500
    rootVolumeType: gp2
}

ssh {
    username: centos
    privateKey: /home/centos/.ssh/cloudera.pem
}

common-instance-properties {
    image: ami-0825ff7b553db435
    tags {
        owner: ${?USER}
    }
    normalizeInstance: false
}

instances {
    manager : ${common-instance-properties} { type: m4.large }
    master1 : ${common-instance-properties} { type: m4.large, subnetId: subnet-08e9c59222473e2d }
    master2 : ${common-instance-properties} { type: m4.large, subnetId: subnet-0a93e375408434cf }
    master3 : ${common-instance-properties} { type: m4.large, subnetId: subnet-0ea21673c04c59a6 }
    slave1 : ${common-instance-properties} { type: m4.large, subnetId: subnet-08e9c59222473e2d }
    slave2 : ${common-instance-properties} { type: m4.large, subnetId: subnet-0a93e375408434cf }
    slave3 : ${common-instance-properties} { type: m4.large, subnetId: subnet-0ea21673c04c59a6 }
}

cloudera-manager {
    instance: ${instances.manager} {
        tags {
            Name: platform-pod1-test-cdh-manager
            application: "Cloudera Manager 6"
        }
    }
    enableEnterpriseTrial: false
    databases {
        CLOUDERA_MANAGER { type: postgresql, host: platform-pod1-test-rds.c57la45jl.us-west-2.rds.amazonaws.com, port: 5432, user: scm, password: scm, name: scm }
        ACTIVITYMONITOR { type: postgresql, host: platform-pod1-test-rds.c57la45jl.us-west-2.rds.amazonaws.com, port: 5432, user: amon, password: amon_password, name: amon }
        REPORTSMANAGER { type: postgresql, host: platform-pod1-test-rds.c57la45jl.us-west-2.rds.amazonaws.com, port: 5432, user: rman, password: rman_password, name: rman }
        NAVIGATOR { type: postgresql, host: platform-pod1-test-rds.c57la45jl.us-west-2.rds.amazonaws.com, port: 5432, user: nav, password: nav_password, name: nav }
        NAVIGATORMETASERVER { type: postgresql, host: platform-pod1-test-rds.c57la45jl.us-west-2.rds.amazonaws.com, port: 5432, user: navms, password: navms_password, name: navms }
    }
    repository: "https://archive.cloudera.com/cm6/6.1/redhat7/yum/"
    repositoryKeyUrl: "https://archive.cloudera.com/cm6/6.1/redhat7/yum/RPM-GPG-KEY-cloudera"
}

cluster {
    products {
        CDH: 6
    }
    parcelRepositories: ["http://archive.cloudera.com/cdh6/6.1/parcels/"]
    services: [HDFS, YARN, IMPALA, ZOOKEEPER, HBASE, HIVE, HUE, OOZIE, SPARK_ON_YARN, KAFKA, KUDU]

    configs {
        HDFS {
            dfs_ha_fencing_methods: "shell(true)"
            dfs_replication: 3
            dfs_replication_min: 1
            core_site_safety_valve: """
                <property>
                    <name>ipc.client.connect.max.retries.on.timeouts</name>
                    <value>15</value>
                </property>
            """
        }
        YARN {
            yarn_service_config_safety_valve: """
                <property>
                    <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
                    <value>3600</value>
                </property>
            """
        }
        KAFKA {
            "num.partitions": 6
            "min.insync.replicas": 2
            "default.replication.factor": 3
            "unclean.leader.election.enable": false
            "zookeeper.session.timeout.ms": 20000
        }
        KUDU {
            "superuser_acl": "*"
        }
    }

    databases {
        HIVE { type: postgresql, host: platform-pod1-test-rds.c57la45jzutl.us-west-2.rds.amazonaws.com, port: 5432, user: hive, password: hive_password, name: metastore }
        HUE { type: postgresql, host: platform-pod1-test-rds.c57la45jzutl.us-west-2.rds.amazonaws.com, port: 5432, user: hue, password: hue, name: hue }
        OOZIE { type: postgresql, host: platform-pod1-test-rds.c57la45jzutl.us-west-2.rds.amazonaws.com, port: 5432, user: oozie, password: oozie, name: oozie }
    }

    master1 {
        count: 1
        instance: ${instances.master1} {
            tags {
                Name: platform-pod1-test-cdh-master1
                group: master
            }
        }
        roles {
            ZOOKEEPER: [SERVER]
            HDFS: [NAMENODE, FAILOVERCONTROLLER, JOURNALNODE]
            YARN: [RESOURCEMANAGER]
            HBASE: [MASTER]
            SPARK_ON_YARN: [GATEWAY]
            HIVE: [HIVESERVER2, HIVEMETASTORE]
            KUDU: [KUDU_MASTER]
        }
        configs {
            HDFS {
                NAMENODE {
                    dfs_federation_namenode_nameservice: hanameservice
                    autofailover_enabled: true
                    dfs_namenode_quorum_journal_name: hanameservice
                }
            }
            HBASE {
                MASTER { hbase_master_java_heapsize: 1024000000 }
            }
            YARN {
                RESOURCEMANAGER {
                    yarn_scheduler_maximum_allocation_mb: 3072
                    yarn_scheduler_maximum_allocation_vcores: 2
                    yarn_resourcemanager_am_max_retries: 4
                }
            }
            HIVE {
                HIVESERVER2 { hiveserver2_java_heapsize: 268435456 }
                HIVEMETASTORE { hive_metastore_java_heapsize: 268435456 }
            }
            KUDU {
                KUDU_MASTER {
                    fs_wal_dir: "/data0/kudu/masterwal"
                    fs_data_dirs: "/data1/kudu/master"
                }
            }
        }
    }

    master2 {
        count: 1
        instance: ${instances.master2} {
            tags {
                Name: platform-pod1-test-cdh-master2
                group: master
            }
        }
        roles {
            ZOOKEEPER: [SERVER]
            HDFS: [NAMENODE, FAILOVERCONTROLLER, JOURNALNODE]
            YARN: [JOBHISTORY, GATEWAY]
            HBASE: [MASTER]
            IMPALA: [CATALOGSERVER, STATESTORE]
            KUDU: [KUDU_MASTER]
        }
        configs {
            HDFS {
                NAMENODE {
                    dfs_federation_namenode_nameservice: hanameservice
                    autofailover_enabled: true
                    dfs_namenode_quorum_journal_name: hanameservice
                }
            }
            YARN {
                GATEWAY { mapred_submit_replication: 2 }
            }
            HBASE {
                MASTER { hbase_master_java_heapsize: 1073741824 }
            }
            KUDU {
                KUDU_MASTER {
                    fs_wal_dir: "/data0/kudu/masterwal"
                    fs_data_dirs: "/data1/kudu/master"
                }
            }
        }
    }

    master3 {
        count: 1
        instance: ${instances.master3} {
            tags {
                Name: platform-pod1-test-cdh-master3
                group: master
            }
        }
        roles {
            ZOOKEEPER: [SERVER]
            HDFS: [JOURNALNODE, HTTPFS]
            YARN: [RESOURCEMANAGER]
            SPARK_ON_YARN: [SPARK_YARN_HISTORY_SERVER, GATEWAY]
            HBASE: [HBASETHRIFTSERVER]
            OOZIE: [OOZIE_SERVER]
            HUE: [HUE_SERVER]
            KUDU: [KUDU_MASTER]
        }
        configs {
            YARN {
                RESOURCEMANAGER {
                    yarn_scheduler_maximum_allocation_mb: 3072
                    yarn_scheduler_maximum_allocation_vcores: 2
                    yarn_resourcemanager_am_max_retries: 4
                }
            }
            HBASE {
                HBASETHRIFTSERVER { hbase_thriftserver_java_heapsize: 268435456 }
            }
            KUDU {
                KUDU_MASTER {
                    fs_wal_dir: "/data0/kudu/masterwal"
                    fs_data_dirs: "/data1/kudu/master"
                }
            }
        }
    }

    slave1 {
        count: 1
        minCount: 1
        instance: ${instances.slave1} {
            tags {
                Name: platform-pod1-test-cdh-slave1
                group: slave
            }
        }
        roles {
            HDFS: [DATANODE]
            HBASE: [REGIONSERVER]
            YARN: [NODEMANAGER]
            KAFKA: [KAFKA_BROKER]
            KUDU: [KUDU_TSERVER]
            IMPALA: [IMPALAD]
        }
        configs {
            YARN {
                NODEMANAGER {
                    yarn_nodemanager_resource_memory_mb: 10240
                    yarn_nodemanager_resource_cpu_vcores: 6
                    yarn_nodemanager_delete_debug_delay_sec: 5184000
                }
            }
            HBASE {
                REGIONSERVER { hbase_regionserver_java_heapsize: 2147483648 }
            }
            KAFKA {
                KAFKA_BROKER {
                    broker_max_heap_size: 1024
                    "log.dirs": "/data0/kafka/data"
                }
            }
            KUDU {
                KUDU_TSERVER {
                    fs_wal_dir: "/data0/kudu/tabletwal"
                    fs_data_dirs: "/data1/kudu/tablet"
                }
            }
            IMPALA {
                IMPALAD { impalad_memory_limit: 1610612736 }
            }
        }
    }

    slave2 {
        count: 1
        minCount: 1
        instance: ${instances.slave2} {
            tags {
                Name: platform-pod1-test-cdh-slave2
                group: slave
            }
        }
        roles {
            HDFS: [DATANODE]
            HBASE: [REGIONSERVER]
            YARN: [NODEMANAGER]
            KAFKA: [KAFKA_BROKER]
            KUDU: [KUDU_TSERVER]
            IMPALA: [IMPALAD]
        }
        configs {
            YARN {
                NODEMANAGER {
                    yarn_nodemanager_resource_memory_mb: 10240
                    yarn_nodemanager_resource_cpu_vcores: 6
                    yarn_nodemanager_delete_debug_delay_sec: 5184000
                }
            }
            HBASE {
                REGIONSERVER { hbase_regionserver_java_heapsize: 2147483648 }
            }
            KAFKA {
                KAFKA_BROKER {
                    broker_max_heap_size: 1024
                    "log.dirs": "/data0/kafka/data"
                }
            }
            KUDU {
                KUDU_TSERVER {
                    fs_wal_dir: "/data0/kudu/tabletwal"
                    fs_data_dirs: "/data1/kudu/tablet"
                }
            }
            IMPALA {
                IMPALAD { impalad_memory_limit: 1610612736 }
            }
        }
    }

    slave3 {
        count: 1
        minCount: 1
        instance: ${instances.slave3} {
            tags {
                Name: platform-pod1-test-cdh-slave3
                group: slave
            }
        }
        roles {
            HDFS: [DATANODE]
            HBASE: [REGIONSERVER]
            YARN: [NODEMANAGER]
            KAFKA: [KAFKA_BROKER]
            KUDU: [KUDU_TSERVER]
            IMPALA: [IMPALAD]
        }
        configs {
            YARN {
                NODEMANAGER {
                    yarn_nodemanager_resource_memory_mb: 10240
                    yarn_nodemanager_resource_cpu_vcores: 6
                    yarn_nodemanager_delete_debug_delay_sec: 5184000
                }
            }
            HBASE {
                REGIONSERVER { hbase_regionserver_java_heapsize: 2147483648 }
            }
            KAFKA {
                KAFKA_BROKER {
                    broker_max_heap_size: 1024
                    "log.dirs": "/data0/kafka/data"
                }
            }
            KUDU {
                KUDU_TSERVER {
                    fs_wal_dir: "/data0/kudu/tabletwal"
                    fs_data_dirs: "/data1/kudu/tablet"
                }
            }
            IMPALA {
                IMPALAD { impalad_memory_limit: 1610612736 }
            }
        }
    }
}
```
I got the following exception during First Run:
Server log (excerpt):
```
... :177)
	at com.cloudera.cmf.command.flow.SeqCmdWork.doWork(SeqCmdWork.java:107)
	at com.cloudera.cmf.command.flow.CmdStep.doWork(CmdStep.java:177)
	at com.cloudera.cmf.command.flow.SeqFlowCmd.run(SeqFlowCmd.java:117)
	... 32 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException
	at com.cloudera.cmf.service.AbstractRoleHandler.generateConfigFiles(AbstractRoleHandler.java:1034)
	at com.cloudera.cmf.service.AbstractRoleHandler.generateConfigFiles(AbstractRoleHandler.java:987)
	at com.cloudera.cmf.service.AbstractClientConfigHandler.buildClientConfigFiles(AbstractClientConfigHandler.java:124)
	at com.cloudera.cmf.service.AbstractClientConfigHandler.buildClientConfigFiles(AbstractClientConfigHandler.java:110)
	at com.cloudera.cmf.service.components.ConfigHelper.getClientConfigs(ConfigHelper.java:147)
	at com.cloudera.cmf.service.AbstractRoleHandler.generateClientConfigsForDependencies(AbstractRoleHandler.java:1073)
	at com.cloudera.cmf.service.AbstractRoleHandler.generateConfigFiles(AbstractRoleHandler.java:1013)
	... 43 more
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
	at com.cloudera.cmf.service.AbstractRoleHandler.generateConfigFiles(AbstractRoleHandler.java:1034)
	at com.cloudera.cmf.service.AbstractRoleHandler.generateConfigFiles(AbstractRoleHandler.java:987)
	at com.cloudera.cmf.service.AbstractClientConfigHandler.buildClientConfigFiles(AbstractClientConfigHandler.java:124)
	at com.cloudera.cmf.service.AbstractClientConfigHandler.buildClientConfigFiles(AbstractClientConfigHandler.java:110)
	at com.cloudera.cmf.service.components.ConfigHelper.getClientConfigs(ConfigHelper.java:147)
	at com.cloudera.cmf.service.AbstractRoleHandler.generateClientConfigsForDependencies(AbstractRoleHandler.java:1073)
	at com.cloudera.cmf.service.AbstractRoleHandler.generateConfigFiles(AbstractRoleHandler.java:1013)
	... 49 more
Caused by: java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
	at com.google.common.collect.Lists$TransformingSequentialList.<init>(Lists.java:527)
	at com.google.common.collect.Lists.transform(Lists.java:510)
	at com.cloudera.cmf.service.config.PathListEvaluators$PathListFileURIEvaluator.formatURIs(PathListEvaluators.java:110)
	at com.cloudera.cmf.service.config.NameserviceConfigsEvaluator.evaluateConfig(NameserviceConfigsEvaluator.java:93)
	at com.cloudera.cmf.service.config.AbstractGenericConfigEvaluator.evaluateConfig(AbstractGenericConfigEvaluator.java:45)
	at com.cloudera.cmf.service.config.HAorFederationConditionalEvaluator.evaluateConfig(HAorFederationConditionalEvaluator.java:98)
	at com.cloudera.cmf.service.config.AbstractGenericConfigEvaluator.evaluateConfig(AbstractGenericConfigEvaluator.java:45)
	at com.cloudera.cmf.service.config.AbstractConfigFileGenerator.generate(AbstractConfigFileGenerator.java:85)
	at com.cloudera.cmf.service.config.XMLConfigFileGenerator.generateConfigFileImpl(XMLConfigFileGenerator.java:148)
	at com.cloudera.cmf.service.config.AbstractConfigFileGenerator.generateConfigFile(AbstractConfigFileGenerator.java:136)
	at com.cloudera.cmf.service.HandlerUtil.generateConfigFiles(HandlerUtil.java:199)
	at com.cloudera.cmf.service.AbstractRoleHandler.generateConfigFiles(AbstractRoleHandler.java:1009)
	... 55 more
2019-03-13 09:02:28,919 WARN CommandPusher:com.cloudera.cmf.command.flow.CmdStep: Unexpected exception during command work
com.cloudera.cmf.command.CmdExecException: com.cloudera.cmf.command.CmdExecException: java.lang.RuntimeException: java.lang.RuntimeException:
```
Cloudera Manager also shows "Configuration Invalid" for the HDFS, YARN, and Kudu directory settings. I had to manually copy the values from the specific role groups into "Default Group", and everything worked fine after I resumed the First Run. Any idea why those values are not populated into "Default Group"? Thanks!
Created 03-19-2019 08:30 AM
Hi WZ,
I have recently been reviewing the Director reference configuration files, and I had the same problem with "Default Master Group" for the Kudu configuration properties. I also got around it by copying the values to that master group. This does look like a problem with how Director is manipulating role configuration groups in Cloudera Manager.
I did not have the same issue with the namenode or node manager configuration properties, but the reference configuration does not set any of those. In your cluster, did you have the same "Default Master Group" problem with those roles as with the Kudu master?
Bill
Created on 03-19-2019 12:25 PM - edited 03-19-2019 12:26 PM
Hi Bill, Thanks for your reply!
-- In your cluster, did you have the same "Default Master Group" problem with those roles as with the Kudu master?
Actually, it's the opposite: YARN and HDFS complain about missing configs for the specific role groups, and that is what causes the NullPointerException and the First Run failure. As a workaround, I added directory configs for the NameNode and NodeManager roles to the Director cluster config, and the First Run now completes successfully. Kudu still shows a warning, but it does not break the First Run.
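For anyone hitting the same NPE, the workaround amounted to setting the directory configs explicitly in the Director cluster config so that CM no longer needs to resolve them from "Default Group". A rough sketch of the extra role configs (the property names are the standard CM configs for these roles; the paths here are example values, not the ones from my cluster):

```
# Sketch only: merge into the existing master1/slave1 group definitions in cluster.conf.
# Paths are illustrative; use your own data mount layout.
master1 {
    configs {
        HDFS {
            NAMENODE {
                dfs_name_dir_list: "/data0/hdfs/nn"   # NameNode metadata directories
            }
        }
    }
}
slave1 {
    configs {
        HDFS {
            DATANODE {
                dfs_data_dir_list: "/data0/hdfs/dn"   # DataNode block directories
            }
        }
        YARN {
            NODEMANAGER {
                yarn_nodemanager_local_dirs: "/data0/yarn/nm"   # NodeManager container dirs
            }
        }
    }
}
```

With these set per group, the generated client configs no longer hit the null path list during First Run. The same configs would go into slave2/slave3 (and master2 for its NameNode) as well.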
Created 10-31-2019 07:26 PM