I want to control the number of containers that run in parallel for a single query, so that I can run many queries in parallel in one YARN queue.
YARN queue size: 200 GB
Approx. Mappers / Containers: 50
I am setting the container size to 10 GB with hive.tez.container.size=10240;
Once the first query is triggered, it consumes the whole queue (200 GB) and runs 20 containers in parallel. The next query cannot start because no YARN memory is left in the queue.
I need help identifying the parameters that control the number of containers running in parallel, so that I can limit it to 10 per query. That way, at any point a query would consume at most 100 GB (10 containers x 10 GB per container) of the 200 GB YARN queue.
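
To make the arithmetic above concrete, here is a minimal sketch of the memory budget I am aiming for. The function and variable names are hypothetical (they are not Hive, Tez, or YARN settings), and the sketch assumes every container is exactly 10 GB and ignores the memory taken by the Tez application master.

```python
# Illustrative arithmetic only. All names below are hypothetical helpers,
# not Hive/Tez/YARN configuration keys.

QUEUE_MEMORY_GB = 200        # total memory of the YARN queue
CONTAINER_SIZE_GB = 10       # hive.tez.container.size=10240 (value is in MB)
DESIRED_MAX_CONTAINERS = 10  # the per-query cap I would like to enforce


def containers_that_fit(queue_gb: int, container_gb: int) -> int:
    """Number of whole containers the queue can hold at once."""
    return queue_gb // container_gb


def query_memory_gb(containers: int, container_gb: int) -> int:
    """Memory one query uses when running the given number of containers."""
    return containers * container_gb


# Today: one query grabs every container the queue can hold.
uncapped_containers = containers_that_fit(QUEUE_MEMORY_GB, CONTAINER_SIZE_GB)

# Goal: cap each query, so several queries can share the queue.
capped_memory_gb = query_memory_gb(DESIRED_MAX_CONTAINERS, CONTAINER_SIZE_GB)
concurrent_queries = QUEUE_MEMORY_GB // capped_memory_gb

print(uncapped_containers, capped_memory_gb, concurrent_queries)
```

With the numbers from this question, an uncapped query takes all 20 containers, while a cap of 10 containers would keep each query at 100 GB and leave room for a second query in the same queue.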