The exchange operators seem to be the bottleneck:
Operator     #Hosts  Avg Time  Max Time  #Rows    Est. #Rows  Peak Mem  Est. Peak Mem  Detail
16:EXCHANGE  1       26m14s    26m14s    22.55K   0           0         0              KUDU(KuduPartition(shift_timekey))
13:EXCHANGE  1       26m13s    26m13s    21.46K   -1          0         0              HASH(t.pnl_id,d.oper_code,d.factory)
12:EXCHANGE  1       24m17s    24m17s    108.37M  -1          0         0              BROADCAST
Each of them takes more than 20 minutes.
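More importantly, the profile says "WARNING: The following tables are missing relevant table and/or column statistics. rptpid.i_f_r_mes_hist_pnl_mod, rptpid.i_f_t_mes_hist_gop_inout_fab". You should run COMPUTE STATS on both tables. Accurate statistics matter a lot for Impala performance: without them the planner cannot estimate row counts (note the -1 row estimates in the summary above), so it may pick a poor join order or distribution strategy, which could be why 108.37M rows end up being broadcast in exchange 12.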
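As a minimal sketch, the statements below compute statistics for the two tables named in the warning and then check that they were recorded. The table names are copied from the profile; everything else is standard Impala SQL, and I have not run it against your cluster.

```sql
-- Gather table and column statistics for both tables named in the warning.
COMPUTE STATS rptpid.i_f_r_mes_hist_pnl_mod;
COMPUTE STATS rptpid.i_f_t_mes_hist_gop_inout_fab;

-- Verify: #Rows should no longer be -1 once stats are present.
SHOW TABLE STATS rptpid.i_f_r_mes_hist_pnl_mod;
SHOW TABLE STATS rptpid.i_f_t_mes_hist_gop_inout_fab;

-- Verify: per-column distinct-value counts should no longer be -1.
SHOW COLUMN STATS rptpid.i_f_r_mes_hist_pnl_mod;
SHOW COLUMN STATS rptpid.i_f_t_mes_hist_gop_inout_fab;
```

After that, rerun the query and compare the new profile: the missing-statistics warning should be gone, and the estimated row counts for the exchanges should no longer be -1.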