Install using CM of Datanodes with different number of JBOD disks.
- Labels: Cloudera Manager, HDFS
Hi experts, in the CDH install screens there is a DataNode configuration value:
DataNode Data Directory
dfs.data.dir, dfs.datanode.data.dir
It says to use a comma-delimited list of directories on the local file system where the DataNode stores HDFS block data. Typical values are /data/N/dfs/dn for N = 1, 2, 3, ..., where each disk is a JBOD file mount. How do we specify this value if the DataNodes have different numbers of JBOD disks, say 20 disks in one and 10 disks in another? During install this appears to be a single global variable, dfs.data.dir, so how does it allocate the 20 data directories on DataNodes that have only 10 JBOD hard disks? There is no hostname in this variable to indicate a different number of disks on different hosts. Also, if new DataNodes with a different number of disks are added in the future, how is this specified when adding them?
I posted this question earlier but didn't get a reply, so I'd appreciate any info. Thanks!
Created ‎07-09-2018 11:54 AM
When creating your cluster, Cloudera Manager should automatically detect the data directories on each host. It then uses Role Configuration Groups to set distinct configurations for the 10-disk nodes and the 20-disk nodes, and divides the DataNode roles between those groups accordingly.
dfs.data.dir isn't a global setting. It is a role-level config, so it is usually set in the Role Config Group that a role belongs to.
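To make the grouping idea concrete, here is a minimal Python sketch of how hosts with different disk counts could map to separate groups, each with its own directory list. It does not call Cloudera Manager at all. The host names and the helper names `data_dirs` and `group_by_disk_count` are hypothetical, and the `/data/N/dfs/dn` pattern is the typical value quoted in the question.

```python
from collections import defaultdict


def data_dirs(disk_count):
    """Build the comma-delimited directory list for a host with
    `disk_count` JBOD mounts, using the /data/N/dfs/dn pattern."""
    return ",".join(f"/data/{n}/dfs/dn" for n in range(1, disk_count + 1))


def group_by_disk_count(hosts):
    """Map each distinct disk count to the hosts that share it.
    Each entry stands in for one DataNode role config group."""
    groups = defaultdict(list)
    for host, disks in hosts.items():
        groups[disks].append(host)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical inventory: host name -> number of JBOD disks.
    hosts = {
        "dn01.example.com": 20,
        "dn02.example.com": 20,
        "dn03.example.com": 10,
    }
    for disks, members in sorted(group_by_disk_count(hosts).items()):
        print(f"DataNode group for {disks}-disk hosts: {members}")
        print(f"  dfs.datanode.data.dir = {data_dirs(disks)}")
```

Each group carries its own value for the property, so a 10-disk host never gets the 20-directory list.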
You can read more about configuration management here:
https://www.cloudera.com/documentation/enterprise/latest/topics/cm_intro_primer.html#concept_fgj_tny...
When you add new DataNodes, I suggest creating a host template and applying it to the new nodes. That lets them join the correct DataNode group, along with the groups for any other roles you may be running on those nodes (like a YARN NodeManager). You can read about host templates here:
https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_host_templates.html
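As a rough mental model only, a host template can be thought of as a named set of role groups that is applied to a host in one step. The Python sketch below illustrates that idea. It is not the Cloudera Manager API, and the template contents, group names, host name, and the `apply_host_template` helper are all hypothetical.

```python
def apply_host_template(template, host):
    """Return the (role type, role group, host) assignments that applying
    `template` to `host` would produce in this simplified model."""
    return [(role_type, group, host) for role_type, group in template.items()]


# Hypothetical template for new 10-disk worker nodes.
TEMPLATE_10_DISK_WORKER = {
    "DATANODE": "DataNode Group (10 disks)",
    "NODEMANAGER": "NodeManager Default Group",
}

if __name__ == "__main__":
    for assignment in apply_host_template(TEMPLATE_10_DISK_WORKER, "dn04.example.com"):
        print(assignment)
```

Under this model, a new node with a different disk count would get its own template pointing at a DataNode group with the matching directory list.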
Thanks,
Darren
Created ‎07-09-2018 12:39 PM
