TL;DR: how do I properly set hive.tez.container.size for a job whose steps have wildly different memory needs?
I have an 8-data-node HDP 2.6 cluster; all data nodes are identical, with 32 GB of RAM each.
yarn.scheduler.maximum-allocation-mb is set to the total server RAM minus what is used by other services (OS, NodeManager, ...), i.e. 20 GB in my case, and
yarn.scheduler.minimum-allocation-mb is set to 1 GB.
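For reference, this is what the relevant part of my yarn-site.xml looks like (values are the ones described above):

```xml
<!-- yarn-site.xml: scheduler allocation bounds for containers -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>20480</value> <!-- 32 GB total minus OS, NodeManager, etc. -->
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>  <!-- 1 GB -->
</property>
```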
I am running only one Hive MERGE statement, once per day, which spawns about 100k mappers.
If I set hive.tez.container.size to 1 GB, many mappers can run in parallel (faster query), but I end up with one of these errors:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 3, vertexId=vertex_1510697553800_0993_2_03, diagnostics=[Task failed, taskId=task_1510697553800_0993_2_03_000150, diagnostics=[TaskAttempt 0 failed, info=[Container container_e102_1510697553800_0993_01_000042 finished with diagnostics set to [Container failed, exitCode=-104. Container [pid=32295,containerID=container_e102_1510697553800_0993_01_000042] is running beyond physical memory limits. Current usage: 5.4 GB of 5.3 GB physical memory used; 7.4 GB of 11.0 GB virtual memory used. Killing container.
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 3, vertexId=vertex_1511269090751_0011_2_03, diagnostics=[Exception in VertexManager, vertex:vertex_1511269090751_0011_2_03 [Reducer 3],org.apache.tez.dag.api.TezUncheckedException: Atleast 1 bipartite source should exist.
If I set hive.tez.container.size to a bigger value, far fewer containers run in parallel (longer query time), but the query eventually succeeds.
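For example, bumping the size at the session level before the MERGE works, but the value (4 GB here) is purely what I found by trial and error, and the heap setting has to be adjusted to match the container:

```sql
-- Session-level overrides before running the MERGE (values illustrative):
SET hive.tez.container.size=4096;   -- container size in MB
SET hive.tez.java.opts=-Xmx3276m;   -- JVM heap, roughly 80% of the container
```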
The thing is, I do not know in advance how big the data will be, so even if I find a good hive.tez.container.size by trial and error, it might not be good enough tomorrow, and eventually my server memory might be too small. Furthermore, sizing for the worst-case scenario feels like a waste of resources.
Is there any way to get a sort of dynamic Tez container size, so the query is both fast and succeeds?