What does the term 'Vectorization' mean in the context of update/delete operations in Hive?

Expert Contributor
 
1 ACCEPTED SOLUTION

Master Guru

Can you paste the actual message? Normally, vectorization in Hive means processing rows in batches (1024 rows per batch by default) and running each operation over the whole batch at once. On modern CPUs this is much more efficient than processing row by row, because it makes better use of the CPU cache and lets the compiler optimize the tight inner loops.

https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution

I'm not sure about updates and deletes; they might use vectorization for some functions there too.
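
To make the batching idea concrete, here is a rough sketch in Python. It is only an illustration of the concept, not Hive's actual implementation (which is written in Java); the function names, the `price` column, and the threshold are all made up for the example. It computes the same filtered sum twice, once row by row and once over column batches of 1024 values.

```python
# Conceptual sketch only: Hive's real vectorized operators are Java classes.
# All names here (filter_sum_row_by_row, filter_sum_vectorized, "price")
# are hypothetical and exist just for this illustration.

BATCH_SIZE = 1024  # batch size mentioned in the answer above


def filter_sum_row_by_row(rows, threshold):
    """Row-at-a-time: every row is a separate object that is inspected alone."""
    total = 0
    for row in rows:
        if row["price"] > threshold:
            total += row["price"]
    return total


def filter_sum_vectorized(price_column, threshold, batch_size=BATCH_SIZE):
    """Batch-at-a-time: each operator runs over a whole column batch."""
    total = 0
    for start in range(0, len(price_column), batch_size):
        batch = price_column[start:start + batch_size]
        # Filter step: record which positions in the batch pass the predicate.
        selected = [i for i, value in enumerate(batch) if value > threshold]
        # Aggregate step: tight loop over the selected positions only.
        for i in selected:
            total += batch[i]
    return total


rows = [{"price": p} for p in range(3000)]
price_column = [row["price"] for row in rows]

# Both strategies must produce the same answer; only the access pattern differs.
assert filter_sum_row_by_row(rows, 1000) == filter_sum_vectorized(price_column, 1000)
```

In plain Python the batched version is not actually faster. The point is the shape of the work: each step is a simple loop over a contiguous array of one column, which is the pattern that a JVM and CPU can execute efficiently.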

3 REPLIES

Expert Contributor

I wanted to understand the concept of vectorization. I got it from the link above. Thanks.

Master Guru

I agree with @Benjamin Leonhardi. Please provide the log file, because vectorization applies to operations such as scans, filters, aggregates, and joins.