We will be setting up an HDP 3 cluster, and the idea is to split 1xRAID1 and 2xHDD across 2 virtual machines. The 1xRAID1 would hold the OS and local user files, while each of the 2xHDDs would be used for HDFS /grid/.
My question: is 2xHDD enough to ensure file redundancy? I've read that Hadoop 3 does not require 3 replicas as Hadoop 2 did. However, in case of a drive failure, would the files on the remaining HDD still be enough for a full recovery?
Or would you advise partitioning part of the 1xRAID1 as a third /grid/ drive for HDFS? That seems to introduce a lot of overhead due to the RAID.
What is the minimum number of HDDs required for redundant HDFS operation?