Is there a way to configure Hadoop services, specifically Impala, to be dedicated to specific CPU or RAM, so that they do not consume resources needed by other Hadoop services and bring the rest of the system down? Based on a stress test that I wasn't part of, there is a belief that Impala can bring the Hadoop cluster down and/or starve other services. I am hoping to find an approach for minimizing the impact of poor Impala requests on the system resources that support other services.
Until a solution is found, the platform team won't approve Impala's use, although it's recommended for our use cases: ad hoc BI queries, drilling, and analytics.
This can be achieved in two ways:
In Cloudera Manager, go to Resource Management and define a resource pool. It will enforce limits on Impala's CPU or memory usage.
Alternatively, you can use Linux mechanisms (cgroups or processor binding) to limit the resources available to the impala user, which in turn limits what Impala can consume.
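To make the cgroups and processor-binding option more concrete, here is a rough Python sketch of what it amounts to on a host that uses cgroup v2. Treat it as an illustration under assumptions, not a tested recipe for your cluster: the cgroup path /sys/fs/cgroup/impala, the 8-core and 64 GiB figures, and the helper names are all made up for the example. Only the control files (cpu.max, memory.max, cgroup.procs) and os.sched_setaffinity are standard Linux/Python interfaces. On cgroup v1 hosts the file names differ.

```python
import os
from pathlib import Path


def cgroup_v2_limits(cpu_cores, memory_bytes, period_us=100_000):
    """Build cgroup v2 control-file contents that cap CPU time and memory.

    cpu.max takes "<quota_us> <period_us>"; memory.max takes a byte count.
    """
    if cpu_cores <= 0 or memory_bytes <= 0:
        raise ValueError("cpu_cores and memory_bytes must be positive")
    quota_us = int(cpu_cores * period_us)
    return {
        "cpu.max": f"{quota_us} {period_us}",
        "memory.max": str(memory_bytes),
    }


def apply_limits(cgroup_dir, limits, pids=()):
    """Write the limits into an existing cgroup directory and move PIDs into it.

    Needs root (or delegated cgroup ownership) on a real system.
    """
    cgroup_dir = Path(cgroup_dir)
    for name, value in limits.items():
        (cgroup_dir / name).write_text(value + "\n")
    # cgroup.procs accepts one PID per write.
    for pid in pids:
        (cgroup_dir / "cgroup.procs").write_text(f"{pid}\n")


def pin_to_cpus(pid, cpus):
    """Processor binding: restrict a process to the given CPU ids (Linux only)."""
    os.sched_setaffinity(pid, set(cpus))


# Hypothetical usage (path, PID and sizes are assumptions):
#   limits = cgroup_v2_limits(cpu_cores=8, memory_bytes=64 * 1024**3)
#   apply_limits("/sys/fs/cgroup/impala", limits, pids=[impalad_pid])
#   pin_to_cpus(impalad_pid, range(0, 8))
```

The cgroup cap is usually the safer of the two, since it limits total CPU time and memory without tying Impala to particular cores. Pinning is mainly useful if you want other services to have cores that Impala never touches.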