Thanks for this article. A follow-up question on the ephemeral storage consideration for HDFS. We use Hortonworks cluster nodes (m4.4xlarge). Is it recommended to have a cluster of 10 data nodes, of which 5 are ephemeral and 5 are EBS-backed instances? Assume the 5 ephemeral nodes are backed up to S3. What would the pros and cons be, especially regarding data loss, in the event of: 1. a crash of a few ephemeral or EBS-backed instances, 2. an AWS outage in this AZ, 3. a region outage (DR plan in another region with S3 cross-region replication?).