Would there be any impact (for example, on YARN resource allocation or Impala query performance) from running a cluster with fewer large instances versus many smaller instances for the DataNodes?
For example, would running the cluster with 10 d2.4xlarge instances instead of 20 d2.2xlarge instances make any difference? The d2.4xlarge configuration is more or less twice the d2.2xlarge configuration.
20 d2.2xlarge instances cost $22,106.40/month
10 d2.4xlarge instances cost $21,154.80/month
I can see cost savings here, as well as the ease of managing fewer DataNodes. Kindly let me know your thoughts.
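
For reference, the size of the saving can be worked out from the two monthly figures quoted above. This is only a rough sketch: the variable names are made up for illustration, and the prices are the ones from this post, not current AWS pricing.

```python
from decimal import Decimal

# Monthly totals as quoted in this post (not looked up from AWS pricing).
cost_20_d2_2xlarge = Decimal("22106.40")
cost_10_d2_4xlarge = Decimal("21154.80")

# Absolute and relative saving from choosing the fewer, larger instances.
monthly_savings = cost_20_d2_2xlarge - cost_10_d2_4xlarge
savings_pct = monthly_savings / cost_20_d2_2xlarge * 100

# Cost per node for each layout.
per_node_2xlarge = cost_20_d2_2xlarge / 20
per_node_4xlarge = cost_10_d2_4xlarge / 10

print(f"Monthly savings: ${monthly_savings}")
print(f"Savings: {savings_pct:.1f}% of the 20-node cost")
print(f"Per node: ${per_node_2xlarge} (d2.2xlarge) vs ${per_node_4xlarge} (d2.4xlarge)")
```

So the difference is $951.60 per month, a little over 4% of the 20-node cost.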