Our Hive execution engine was switched from MR to Tez as part of the latest upgrade to CDP 7, and we are seeing a significant performance drop. As a test, we ran the same query against a single static hour partition, and it took 5 hours to finish, whereas on MR the query used to complete within 4-5 hours across all 24 hourly partitions of the same ORC data set.
INSERT OVERWRITE TABLE `user_tables`.`dummy_table` PARTITION(date_partition, hour_partition)
SELECT `(date_partition|hour_partition)?+.+`, to_date(srt.date_time) as date_partition, SUBSTR(srt.date_time, 12, 2) AS hour_partition
FROM `user_tables`.`source_dummy_table` srt
WHERE srt.date_partition BETWEEN "2022-04-05" AND date_add("2022-04-05", 4)
AND upper(srt.prop1) = "XYZ"
AND to_date(srt.date_time) BETWEEN "2022-04-05" AND "2022-04-05";
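For comparison, the static single-hour test mentioned above can be sketched roughly as follows. This is only an illustration of the shape of that run: the hour value "00" and the extra hour filter are assumptions, and the `SET` line is included because the regex column specification in the SELECT list requires it.

```sql
-- Required for the regex column specification `(...)?+.+` in the SELECT list.
SET hive.support.quoted.identifiers=none;

-- Static-partition variant: both partition values are fixed in the PARTITION
-- clause, so the SELECT list no longer emits the partition columns.
-- The hour "00" is an assumed example value.
INSERT OVERWRITE TABLE `user_tables`.`dummy_table`
PARTITION (date_partition = "2022-04-05", hour_partition = "00")
SELECT `(date_partition|hour_partition)?+.+`
FROM `user_tables`.`source_dummy_table` srt
WHERE srt.date_partition BETWEEN "2022-04-05" AND date_add("2022-04-05", 4)
AND upper(srt.prop1) = "XYZ"
AND to_date(srt.date_time) BETWEEN "2022-04-05" AND "2022-04-05"
AND SUBSTR(srt.date_time, 12, 2) = "00";
```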
The DAG status for the query above:
VERTICES  MODE       STATUS  TOTAL   COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
1         container  KILLED  136366  9112       0        127254   3       1258
----------------------------------------------------------------------------------------------
VERTICES: 00/01 [=>>-------------------------] 6% ELAPSED TIME: 39377.91 s
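To put these counters in perspective, a rough extrapolation can be run as a plain query. It assumes, purely for illustration, that tasks would keep completing at the average rate observed so far, which a killed DAG does not guarantee. The column aliases are made up for this sketch.

```sql
-- Back-of-the-envelope estimate from the DAG counters above.
-- Assumption: the remaining tasks complete at the same average rate
-- as the 9112 tasks that finished in 39377.91 s.
SELECT
  round(100.0 * 9112 / 136366, 1)               AS pct_tasks_completed,
  round(39377.91 / 9112, 2)                     AS avg_elapsed_sec_per_completed_task,
  round(127254 * (39377.91 / 9112) / 3600, 1)   AS est_remaining_hours;
```

By this estimate, under 7% of the tasks had completed after almost 11 hours, and the pending tasks would need roughly 150 more hours at the same rate.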
We use dynamic partitioning because it used to work fine on MR. Which memory parameters can be tuned for Tez to make this work? The current timeline is unrealistic, and this is a relatively powerful cluster: 34 nodes with plenty of cores and memory (3 TB).
The following settings have already been added, as suggested:
hive.exec.compress.intermediate=true
hive.intermediate.compression.codec=org.apache.hadoop.io.compress.SnappyCodec
hive.intermediate.compression.type=BLOCK
hive.exec.parallel=true
hive.enforce.sorting=true
hive.exec.orc.split.strategy=BI
tez.grouping.max-size=67108864
tez.grouping.min-size=67108864
hive.merge.tezfiles=true
hive.merge.smallfiles.avgsize=67108864
hive.merge.size.per.task=134217728
tez.am.resource.memory.mb=16384
hive.tez.container.size=16384
Any help or suggestions are appreciated.