I have a query that always fails with the following error:
Container exited with a non-zero exit code 1
]], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173) [...]
The query itself is quite a small MERGE (other, much bigger queries work flawlessly):
MERGE INTO summary dst USING (
dst.id1 = src.id1
AND dst.id2 = src.id2
AND dst.id3 = src.id3
THEN UPDATE SET
name = src.name
The source table has 1.7M rows (50 MB on disk); the destination has 75M rows (1.5 GB on disk).
Both are ACID tables stored as ORC.
In the image, Map 1 is the vertex with the issue, and I cannot understand why it has only one task. Naively, I would think that with more tasks, each would carry a smaller load and the query would work better, but I have not managed to make that happen.
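For context, the number of map tasks in Tez is usually driven by split grouping, so the settings below are the kind of knob that would normally apply here. This is only a sketch under an assumption: that Map 1 ends up with a single task because Tez groups all of its input splits into one. The values are illustrative, not tuned recommendations, and they may have no effect if the input is a single small ORC file that cannot be split further.

```sql
-- Assumption: Map 1 gets one task because all input splits are grouped together.
-- Lowering the grouping bounds asks Tez to build smaller (hence more) grouped splits.
SET tez.grouping.min-size=4194304;    -- 4 MB lower bound per grouped split (illustrative)
SET tez.grouping.max-size=16777216;   -- 16 MB upper bound per grouped split (illustrative)
```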
Note that I have already maxed out all memory parameters; I cannot go any higher on those: