
How does Impala handle heterogeneous hardware?

Suppose we have 8 DataNodes with 128 GB of RAM each, of which 64 GB is allocated to Impala, and we add 8 more nodes with 256 GB each, of which we intend to allocate 128 GB to Impala.


My concern is, will the coordinator be smart enough to know the mem_limit of each node it sends fragments to? Are there any other known issues that come up with such a configuration?


NOTE: Current CDH version is CDH 5.4.4 (Impala v2.2)

Re: How does Impala handle heterogeneous hardware?

Hi! Good question.


Today, Impala is not aware of the heterogeneity and will split the work evenly among all available nodes, regardless of how much CPU or memory those nodes have.
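
To make the consequence concrete for the cluster in the question, here is a rough back-of-the-envelope sketch in Python. It is not Impala code and does not call any Impala API; the function names are made up for illustration, and it assumes a simplified model in which a query's memory footprint is divided equally across nodes along with the work. Under that assumption, the nodes with the smaller limit are the ones that run out first.

```python
# Hypothetical model, NOT Impala code: assumes a query's memory use is
# divided equally across all nodes, just like the work itself.

def nominal_capacity_gb(mem_limits_gb):
    """Sum of the per-node Impala memory limits across the cluster."""
    return sum(mem_limits_gb)

def effective_capacity_gb(mem_limits_gb):
    """Largest total footprint that fits when every node gets an equal share.

    With an even split, each node holds total / n, so the node with the
    smallest limit is the binding constraint.
    """
    return len(mem_limits_gb) * min(mem_limits_gb)

# 8 original nodes with 64 GB for Impala, 8 new nodes with 128 GB for Impala.
mem_limits = [64] * 8 + [128] * 8

print(nominal_capacity_gb(mem_limits))    # 1536
print(effective_capacity_gb(mem_limits))  # 1024
```

In this simplified picture, the extra 64 GB on each of the new nodes goes unused by any single evenly split query, although it may still help when several queries run concurrently.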