Suppose we have 8 DataNodes with 128 GB of RAM each, of which 64 GB is allocated to Impala. We then add 8 more nodes with 256 GB of RAM each and intend to allocate 128 GB to Impala on those.
My concern is, will the coordinator be smart enough to know the mem_limit of each node it sends fragments to? Are there any other known issues that come up with such a configuration?
NOTE: Current CDH version is CDH 5.4.4 (Impala v2.2)
Hi! Good question.
Today, Impala is not aware of the heterogeneity and will split the work evenly among all available nodes, regardless of how much CPU or memory each node has.
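To make the consequence concrete, here is a rough back-of-the-envelope sketch in Python. It is not Impala code and does not model the real scheduler: the function name `even_split_headroom` and the 1280 GB total memory demand are made up for illustration, and it assumes a query's memory need divides perfectly evenly across nodes. Under those assumptions it shows how an even split can exceed the limit on the smaller nodes while leaving the larger ones underused.

```python
def even_split_headroom(total_query_mem_gb, node_limits_gb):
    """Return (limit, share, headroom) per node for an even split.

    Hypothetical helper: assumes the query's total memory demand is
    divided equally across all nodes, ignoring each node's mem_limit.
    """
    share = total_query_mem_gb / len(node_limits_gb)
    return [(limit, share, limit - share) for limit in node_limits_gb]


# Cluster from the question: 8 nodes with 64 GB for Impala, 8 with 128 GB.
limits = [64] * 8 + [128] * 8

# Illustrative figure only: a query needing 1280 GB across the cluster.
report = even_split_headroom(1280, limits)

# Nodes whose even share is larger than their memory limit.
over_limit = [row for row in report if row[2] < 0]

print(f"per-node share: {report[0][1]} GB")
print(f"nodes over their limit: {len(over_limit)} of {len(limits)}")
```

With these made-up numbers each node is handed the same share, so the 64 GB nodes become the bottleneck even though the cluster as a whole has enough memory. In practice this suggests sizing queries against the smallest mem_limit in the cluster.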