
Infrastructure Architecture for HDFS/Hadoop

Rising Star

We have a 6-node Hadoop cluster (1 edge node, 2 masters (primary and secondary), and 3 slave nodes) running on Azure VMs. We have attached a 1 TB disk to the master and to each of the slave nodes, mounted at /grid/master, /grid/data1, /grid/data2, and /grid/data3 on the master, slave1, slave2, and slave3 respectively.

Our replication factor is 3. In Ambari we have specified /grid/data1, /grid/data2, and /grid/data3 as the DataNode directories, and /grid/master1/hadoop/hdfs/namenode as the NameNode directory. But since the other 3 mount points, i.e. /grid/data2, /grid/data3, and /grid/master, do not exist on slave1, the Hadoop services have started creating these 3 folders on the local filesystem of slave node 1. The same is happening on the other 2 slave nodes. This is filling up our local filesystems very fast.
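As a quick check of where the data is actually landing, here is a minimal sketch using the mount points named above (run on slave1):

# Directories that are not their own mount points will report the root
# filesystem device (e.g. /dev/sda1 on /) instead of the dedicated 1 TB disk.
df -h /grid/data1 /grid/data2 /grid/data3 /grid/master

# How much DataNode data has already landed on the root filesystem
du -sh /grid/data2/hadoop/hdfs/data /grid/data3/hadoop/hdfs/data /grid/master 2>/dev/null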

Is there any way to deal with this scenario? Are there any specific properties in Ambari which need to be checked to prevent this from happening? And since some of the data (replicated or otherwise) already sits on the local filesystems of the different nodes, can we tackle this safely by backing it up without losing any data? Does the replication factor need to be changed to 1?

Could someone suggest any approach for handling these situations safely? Any help would be much appreciated.

Thanks

Rahul


6 REPLIES

Guru

Hi @rahul gulati

It sounds as if your hosts have different layouts. To manage this with Ambari, you should look into Ambari Config Groups for hosts: you can group similar hosts together and apply configuration to them without forcing those configs onto the outlier hosts. Here are links to the documentation and to some relevant HCC threads, followed by a sketch of creating a config group from the command line.

https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-operations/content/using_host_con...

https://community.hortonworks.com/questions/58846/amabri-config-groups.html

https://community.hortonworks.com/questions/73983/datanode-config-groups.html
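If you prefer scripting over the Ambari web UI, config groups can also be created through the Ambari REST API. The following is only a sketch: the Ambari URL, credentials, cluster name (mycluster), host name, and the hdfs-site override value are placeholders to adapt to your environment.

# Hypothetical example: create an HDFS config group for slave1 only,
# overriding dfs.datanode.data.dir for that single host.
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  http://ambari-server:8080/api/v1/clusters/mycluster/config_groups \
  -d '[{
    "ConfigGroup": {
      "cluster_name": "mycluster",
      "group_name": "hdfs-slave1",
      "tag": "HDFS",
      "description": "DataNode dirs for slave1",
      "hosts": [{ "host_name": "slave1.example.com" }],
      "desired_configs": [{
        "type": "hdfs-site",
        "tag": "slave1_v1",
        "properties": { "dfs.datanode.data.dir": "/grid/data1/hadoop/hdfs/data" }
      }]
    }
  }]'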

Rising Star

Hi @Sonu Sahi

Thanks for your reply.

Are you suggesting that we should create 4 HDFS config groups, one each for master, slave1, slave2, and slave3, and set dfs.datanode.data.dir to /grid/data1, /grid/data2, and /grid/data3 for slave1, slave2, and slave3 respectively? In other words, each config group would contain only the directory that exists on its own node, i.e. the slave1 config group would just have /grid/data1 as its DataNode directory, and so on?

This would make sure that HDFS data on slave1 goes only into /grid/data1 and that no data goes into /grid/data2 or /grid/data3 on slave1, and likewise for the other 2 slave nodes. And do we need to change the replication factor as well?

Please correct me if I have understood the above incorrectly.
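As a sanity check after the config group change, a minimal sketch (assuming the HDFS client configuration on slave1 is the one Ambari pushed for that group):

# On slave1, after Ambari has pushed the new config and the DataNode restarted:
hdfs getconf -confKey dfs.datanode.data.dir
# Expected output for slave1's group: /grid/data1/hadoop/hdfs/data only.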

One more thing: if the above is the solution to our problem, then what about the data that already exists under /grid/master, /grid/data2, and /grid/data3 on slave1? How do we manage that data?

Thanks

Rising Star

I have a few questions.

Q1. Did you install the DataNode service on slave1?

Q2. Could you let me know the value of the DataNode directories under "Ambari > HDFS > Configs > Settings > DataNode"?

Q3. Did you check the disk mount list on the slaves?
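All three can be answered from the shell on each slave; a minimal sketch, assuming a standard HDP layout with client configs under /etc/hadoop/conf:

# Q1: is a DataNode process running on this host?
ps -ef | grep -i "[d]atanode"

# Q2: which directories is the DataNode configured to use?
grep -A 1 "dfs.datanode.data.dir" /etc/hadoop/conf/hdfs-site.xml

# Q3: which of the /grid paths are real mount points?
mount | grep /grid
df -h | grep -E "/grid|Filesystem"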

Rising Star

@Peter Kim (screenshot attached: disk-mount-point.png)

1) DataNode services have been installed on slave1, slave2, and slave3.

2) The DataNode directories are /grid/data1/hadoop/hdfs/data, /grid/data2/hadoop/hdfs/data, /grid/data3/hadoop/hdfs/data.

3) Yes, I checked the disk mount list on the slaves. I have attached a screenshot for slave1; we have a disk mounted on /grid/data1, as shown in the screenshot.

Please let me know if anything else is required.

Thanks

Rising Star

That's weird. Where are these partitions, /grid/data2 and /grid/data3, on slave1?

Explorer

I think you've got the point: dfs.datanode.data.dir in Ambari is a global setting, so it assumes every host has these directories (/grid/data1, /grid/data2, /grid/data3). In your case you need to create config groups to suit your environment.

And there are two ways to deal with the existing data under those directories. But first, increase the dfs.datanode.balance.bandwidthPerSec value (bytes/sec) in the HDFS settings via the Ambari UI, in line with your network speed; this will help speed up the process.
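As a side note, the same bandwidth cap can also be raised at runtime with dfsadmin, without a DataNode restart; a minimal sketch assuming a 100 MB/s target:

# Temporarily raise the per-DataNode balancing bandwidth to ~100 MB/s
# (value is in bytes per second; pick one that fits your Azure VM network limits)
hdfs dfsadmin -setBalancerBandwidth 104857600

# The persistent equivalent is dfs.datanode.balance.bandwidthPerSec in hdfs-site,
# set through Ambari as suggested above.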

The safe way is to decommission the DataNodes, reconfigure your config group settings, and then recommission the nodes one by one:

https://community.hortonworks.com/articles/69364/decommission-and-reconfigure-data-node-disks.html
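For reference, outside of Ambari's Decommission/Recommission host actions, the underlying HDFS mechanism looks roughly like this. This is only a sketch: it assumes dfs.hosts.exclude points at /etc/hadoop/conf/dfs.exclude and that the commands run as the hdfs superuser; Ambari normally manages this file for you.

# 1. Add the DataNode to the exclude file referenced by dfs.hosts.exclude
echo "slave1.example.com" >> /etc/hadoop/conf/dfs.exclude

# 2. Tell the NameNode to re-read its host lists; decommissioning starts
hdfs dfsadmin -refreshNodes

# 3. Wait until the node shows "Decommissioned" before touching its disks
hdfs dfsadmin -report | grep -A 5 "slave1.example.com"

# 4. After fixing the config group, remove the host from dfs.exclude and
#    run -refreshNodes again to recommission it.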

The unsafe way is to change the settings and remove the directories directly, relying on your replication factor, then wait for re-replication to complete by checking that the "Under replicated blocks" value from the hdfs dfsadmin -report command drops to 0.
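A minimal sketch of that check, polling the "Under replicated blocks" line of the report:

# Poll until HDFS reports no under-replicated blocks
while true; do
  under=$(hdfs dfsadmin -report | grep "Under replicated blocks" | awk -F: '{print $2}' | tr -d ' ')
  echo "Under replicated blocks: $under"
  [ "$under" = "0" ] && break
  sleep 60
done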