Is it possible to configure YARN on a data node to use half of the CPU cores and half of the total memory, and then, when necessary (e.g. when we are very busy), modify the YARN configuration to expand to all available resources (CPU cores and memory)? And when we are not that busy, shrink back to half of the resources?
One approach would be to script rolling restarts through the API, applying a different configuration change-set at each scheduled switch. If the cost (a small drop in available nodes that lasts for the duration of the batched restarts) isn't an issue, this could work.
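As a rough sketch of what the two change-sets might contain, the snippet below builds the NodeManager settings for a "quiet" and a "busy" schedule. The property names are the standard YARN ones (yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores); the helper name and the host size are made up for illustration, and the step that pushes the values and triggers the rolling restart is left out because it depends on your management API.

```python
# Standard YARN NodeManager resource properties.
MEMORY_KEY = "yarn.nodemanager.resource.memory-mb"
VCORES_KEY = "yarn.nodemanager.resource.cpu-vcores"


def nodemanager_change_set(total_memory_mb, total_cores, fraction):
    """Hypothetical helper: return the NodeManager properties that give
    YARN the requested fraction of a host's memory and cores."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return {
        MEMORY_KEY: str(int(total_memory_mb * fraction)),
        # Never advertise fewer than one vcore.
        VCORES_KEY: str(max(1, int(total_cores * fraction))),
    }


# Illustrative host size only: 128 GiB of memory and 32 cores
# that you are willing to hand to YARN at most.
quiet = nodemanager_change_set(131072, 32, 0.5)
busy = nodemanager_change_set(131072, 32, 1.0)

print(quiet)
print(busy)
```

Each dictionary would be applied to the NodeManager configuration before the corresponding scheduled rolling restart.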
If data locality isn't important, then you can also achieve this by marking a set of nodes as entirely unavailable via the NodeManager state support during quiet periods, leaving only the rest running at full capacity.
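A minimal sketch of the node-level variant, assuming plain YARN where the ResourceManager reads an exclude file (set through yarn.resourcemanager.nodes.exclude-path) and picks up changes with "yarn rmadmin -refreshNodes". The function names and host names are hypothetical, and a managed cluster would normally do the equivalent through its own decommission and recommission commands instead.

```python
import math


def split_nodes(hosts, active_fraction):
    """Hypothetical helper: split hosts into (active, unavailable),
    keeping the requested share of nodes active."""
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    ordered = sorted(hosts)
    # Always keep at least one node active.
    keep = max(1, math.ceil(len(ordered) * active_fraction))
    return ordered[:keep], ordered[keep:]


def exclude_file_body(unavailable):
    """Render an exclude file: one hostname per line."""
    return "".join(host + "\n" for host in unavailable)


# Made-up host names for illustration.
hosts = ["dn1.example.com", "dn2.example.com", "dn3.example.com", "dn4.example.com"]

active, unavailable = split_nodes(hosts, 0.5)   # quiet period
print(exclude_file_body(unavailable), end="")

active_busy, unavailable_busy = split_nodes(hosts, 1.0)  # busy period
```

In the busy period the exclude list is empty, so every node takes work again once the ResourceManager refreshes its node list.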
This sounds like a scenario best served by the cloud (workload-driven cluster runtimes), as offered by Cloudera Altus (or the upcoming CDP) and/or Director.