05-04-2017 05:24 AM
According to this documentation, when a query is run from a DataNode via impala-shell, the Impala daemon on that node acts as the coordinator for that query, and in theory all nodes running Impala daemons work in parallel and transmit partial results back to it.
In our cluster, though, this does not seem to be working properly: CPU usage stays at only 2% and queries take a long time to complete.
Also, since the Llama role is deprecated as of CDH 5.10, what is the right way to manage Impala resources? Changing the CPU shares in the configuration seems to have no effect.
05-04-2017 07:18 AM
Go to Cloudera Manager -> Host (select each host, one at a time) -> Resource (menu) and check the CPU and memory allocation for each service. You can customize these values, but make sure the allocations for the different services on a host do not overlap.
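As a rough illustration of the "no overlap" check, here is a minimal Python sketch. It assumes you have copied the per-service CPU percentages for each host out of Cloudera Manager by hand. The host names, service names, numbers, and the `check_allocation` helper are all hypothetical and not part of any Cloudera API.

```python
def check_allocation(host_allocations):
    """Return the hosts whose per-service percentages add up to more than 100."""
    oversubscribed = {}
    for host, services in host_allocations.items():
        total = sum(services.values())
        if total > 100:
            oversubscribed[host] = total
    return oversubscribed


# Hypothetical figures, typed in by hand from the Resource page of each host.
allocations = {
    "node1": {"impala": 50, "yarn": 30, "hdfs": 20},
    "node2": {"impala": 60, "yarn": 30, "hdfs": 20},
}

print(check_allocation(allocations))
```

Any host that shows up in the result has services claiming more than the whole machine between them, so one of the allocations there would need to be lowered.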
05-04-2017 11:56 PM