Created 04-18-2017 09:31 AM
Hi,
To clarify the question, I will illustrate the case.
Let's name the datanodes dnode1, dnode2, dnode3, dnode4, dnode5, dnode6, dnode7, dnode8, dnode9.
I don't want a block to be replicated only among dnode1, dnode2, and dnode3, because I have to turn off those three nodes at once for maintenance. Is there any replication setting in HDFS that lets me specify replication targets instead of random nodes? Something like a replication group definition?
Created 04-18-2017 02:46 PM
Hi @Sedat Kestepe, take a look at rack awareness.
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/RackAwareness.html
Here's how you can configure racks using Ambari:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Ambari_Users_Guide/content/ch03s11.html
HDFS avoids placing all replicas of a block in the same rack, so that a rack failure does not cause data loss. You may be able to use this to achieve what you want: if you declare dnode1, dnode2, and dnode3 as one "rack", the default block placement policy will keep at least one replica of every block outside that group.
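For illustration, here is a minimal sketch of a topology script, assuming the dnode1–dnode9 names from your question; the rack labels and host-to-rack mapping are hypothetical, not a definitive setup. HDFS invokes the script configured under net.topology.script.file.name in core-site.xml with one or more hostnames or IPs as arguments and expects one rack path per argument on stdout:

```python
#!/usr/bin/env python3
# Hypothetical HDFS topology script. Point net.topology.script.file.name
# in core-site.xml at this file (and make it executable). HDFS passes
# hostnames/IPs as arguments and reads one rack path per host from stdout.

import sys

# Assumed mapping: nodes that are serviced together share a "rack",
# so HDFS will not place all replicas of a block on them.
RACKS = {
    "dnode1": "/maintenance-group-1",
    "dnode2": "/maintenance-group-1",
    "dnode3": "/maintenance-group-1",
    "dnode4": "/maintenance-group-2",
    "dnode5": "/maintenance-group-2",
    "dnode6": "/maintenance-group-2",
    "dnode7": "/maintenance-group-3",
    "dnode8": "/maintenance-group-3",
    "dnode9": "/maintenance-group-3",
}

DEFAULT_RACK = "/default-rack"

for host in sys.argv[1:]:
    print(RACKS.get(host, DEFAULT_RACK))
```

With the default replication factor of 3 and this mapping, the standard placement policy writes one replica to one rack and the other two to a different rack, so shutting down one maintenance group never takes all copies of a block offline. If you manage the cluster with Ambari, set the rack assignments through the Ambari UI (per the link above) rather than maintaining the script by hand.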
Created 04-19-2017 01:24 PM
Thank you 🙂