Infrastructure Architecture for HDFS/Hadoop
Labels: Apache Ambari, Apache Hadoop
Created 08-03-2017 06:59 AM
We have a 6-node Hadoop cluster (1 edge node, 2 masters (primary and secondary) and 3 slave nodes) running on Azure VMs. We have attached 1 TB data disks, mounted at /grid/master, /grid/data1, /grid/data2 and /grid/data3 on the master, slave1, slave2 and slave3 respectively.
Our replication factor is 3. In Ambari we have specified /grid/data1, /grid/data2 and /grid/data3 as the DataNode directories and /grid/master1/hadoop/hdfs/namenode as the NameNode directory. But since the other three mount points, i.e. /grid/data2, /grid/data3 and /grid/master, do not exist on slave1, the Hadoop services have started creating those folders on the local filesystem of slave node 1. The same is happening on the other two slave nodes, and this is filling up our local filesystems very fast.
Is there any way to deal with this scenario? Are there specific properties in Ambari that need to be checked to prevent this from happening? And since some data (replicated or otherwise) has already landed on the local filesystems of the different nodes, can we handle this safely, backing it up without losing any data? Does the replication factor need to be changed to 1?
Could someone suggest an approach for handling this situation safely? Any help would be much appreciated.
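To illustrate the symptom, here is a quick way to check on a slave which of the configured directories are backed by a dedicated disk and which are silently filling the OS disk (a sketch using the paths from this post; adjust to your layout):

# Show which filesystem each configured DataNode directory actually lives on.
# Directories that are not separate mounts will report the root filesystem ("/").
df -h /grid/data1 /grid/data2 /grid/data3 /grid/master

# Show how much space the unintended directories are consuming on the OS disk.
du -sh /grid/data2 /grid/data3 /grid/master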
Thanks
Rahul
Created 08-03-2017 04:07 PM
It sounds as if your hosts have different configurations. To manage this with Ambari, you should look into Ambari Config Groups for hosts. You can group similar hosts together and apply configuration to them without forcing those configs onto the outlier hosts. Here are links to some relevant HCC threads:
https://community.hortonworks.com/questions/58846/amabri-config-groups.html
https://community.hortonworks.com/questions/73983/datanode-config-groups.html
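As a rough sketch of what that would look like here (the directory values below are assumptions based on the layout described in this thread, not taken from your cluster), each config group would override the DataNode directories so a host only lists the mount it actually has, and the effective value can then be verified per host:

# Intended per-host override of dfs.datanode.data.dir via Ambari config groups (hypothetical values):
#   slave1 -> /grid/data1/hadoop/hdfs/data
#   slave2 -> /grid/data2/hadoop/hdfs/data
#   slave3 -> /grid/data3/hadoop/hdfs/data

# After Ambari pushes the config and the DataNodes restart, verify on each slave:
hdfs getconf -confKey dfs.datanode.data.dir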
Created 08-03-2017 05:27 PM
Hi @Sonu Sahi
Thanks for your reply.
Are you suggesting that we create separate HDFS config groups for master, slave1, slave2 and slave3, with each config group containing only its own node's entry, i.e. the slave1 config group would have dfs.datanode.data.dir set to just /grid/data1, the slave2 group to /grid/data2, and so on?
That would make sure that HDFS data on slave1 goes only into /grid/data1 and no data goes into /grid/data2 or /grid/data3 on slave1, and likewise for the other two slave nodes. And do we need to change the replication factor as well?
Please correct me if I have understood this incorrectly.
One more thing: if the above is the solution to our problem, what about the data that already exists under /grid/master, /grid/data2 and /grid/data3 on slave1? How do we manage that data?
Thanks
Created 08-04-2017 12:53 AM
I have a few questions.
Q1. Did you install the DataNode service on slave1?
Q2. Could you let me know the values of the DataNode directories under "Ambari > HDFS > Configs > Settings > DataNode"?
Q3. Did you check the disk mount list on the slaves?
Created 08-04-2017 05:34 AM
@Peter Kim (attachment: disk-mount-point.png)
1) The DataNode service has been installed on slave1, slave2 and slave3.
2) The DataNode directories are /grid/data1/hadoop/hdfs/data, /grid/data2/hadoop/hdfs/data, /grid/data3/hadoop/hdfs/data.
3) Yes, I checked the disk mount list on the slaves. I have attached a screenshot for slave1; the disk is mounted at /grid/data1 as shown in the snapshot.
Please let me know if anything else is required.
Thanks
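In case the screenshot is not visible, this is roughly how such a mount check can be done on a slave (standard Linux commands, not the exact output from this cluster):

# List block devices and their mount points (what the attached screenshot shows for slave1).
lsblk

# Or check specific paths; on slave1 only /grid/data1 is a real mount point.
findmnt /grid/data1
findmnt /grid/data2   # no output means the directory just sits on the root filesystem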
Created 08-04-2017 07:55 AM
That's weird. Where are the partitions /grid/data2 and /grid/data3 on slave1?
Created 08-11-2017 02:03 AM
I think you've got the point: the DataNode directories setting (dfs.datanode.data.dir) in Ambari is a global setting, so it assumes every host has all of these directories (/grid/data1, /grid/data2, /grid/data3). In your case you need to create config groups to suit your environment.
There are two ways to deal with the data that already exists under those directories. In either case, first increase the dfs.datanode.balance.bandwidthPerSec value (bytes/sec) in the HDFS settings in the Ambari UI, in line with your network speed; this will help speed up the process.
The safe way is to decommission the DataNodes, reconfigure your config group settings, and then recommission the nodes one by one:
https://community.hortonworks.com/articles/69364/decommission-and-reconfigure-data-node-disks.html
The unsafe way is to change the setting and remove the directories directly, relying on your replication setting, and then wait for re-replication to complete by checking that the "Under replicated blocks" value reported by hdfs dfsadmin -report drops to 0.
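A minimal sketch of that sequence using standard HDFS commands (the bandwidth value is only an example, and in practice the decommission itself is driven from the Ambari UI as described in the linked article):

# 1) Raise the DataNode balancing bandwidth at runtime (bytes/sec; the runtime
#    counterpart of the dfs.datanode.balance.bandwidthPerSec setting mentioned above).
hdfs dfsadmin -setBalancerBandwidth 104857600

# 2) Safe path: decommission one DataNode at a time. At the HDFS level this amounts to
#    listing the host in the dfs.hosts.exclude file and refreshing the NameNode;
#    Ambari's Decommission action does this for you.
hdfs dfsadmin -refreshNodes

# 3) After reconfiguring the config group and recommissioning the node, wait until
#    no blocks remain under-replicated before moving on to the next node.
hdfs dfsadmin -report | grep -i "under replicated"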