I have a MERGE statement that I am trying to make faster. Inside the USING clause of the statement, a row_number() function is used for deduplication.
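To illustrate, here is a minimal sketch of the statement's shape. The table and column names (target_table, staging_table, id, payload, updated_at) are hypothetical placeholders, not my real schema, and I am assuming the deduplication keeps the most recent row per key:

```sql
-- Hypothetical names throughout; only the overall structure matters.
MERGE INTO target_table AS t
USING (
  SELECT id, payload, updated_at
  FROM (
    SELECT id, payload, updated_at,
           -- Number the rows per key, newest first (assumed ordering).
           row_number() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
    FROM staging_table
  ) ranked
  WHERE rn = 1  -- keep a single row per id
) AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET payload = s.payload, updated_at = s.updated_at
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.payload, s.updated_at);
```

The row_number() window in the inner query is the part that ends up in the reduce stage as a PTF operator.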
In the logs I see:
INFO physical.Vectorizer (:()) - Reduce vectorized: false
INFO physical.Vectorizer (:()) - Reduce notVectorizedReason: PTF operator: ROW_NUMBER not in supported functions [avg, count, dense_rank, first_value, last_value, max, min, rank, row_number, sum]
This log message looks contradictory: ROW_NUMBER is reported as not supported, yet row_number appears in the list of supported functions.
For my own peace of mind, I tried writing row_number in both uppercase and lowercase, but (as expected) it made no difference.
Is there anything I can do to get vectorisation working together with row_number?
This is with Hive 3.1.0 from HDP 3.1.4.