What does the term 'Vectorization' mean in the context of update/delete operations in Hive?

Rising Star
 
1 ACCEPTED SOLUTION

Can you paste the actual message? Normally, vectorization means that Hive groups rows into batches (1024 rows by default) and runs operations on a whole batch at once. This is much more efficient than processing row by row on modern CPUs because of better cache use and compiler optimizations.
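To make the batching idea concrete, here is a minimal conceptual sketch in Python. It is not Hive's implementation (Hive's vectorized operators are written in Java), and the function names `filter_row_by_row`, `filter_in_batches`, and `count_batches` are hypothetical, chosen only for illustration. It just shows the difference in shape between handling one row per step and handling one batch of a single column per step.

```python
from typing import List

BATCH_SIZE = 1024  # batch size mentioned in the answer above


def filter_row_by_row(prices: List[float], threshold: float) -> List[int]:
    """Row-at-a-time: evaluate the predicate once per row."""
    selected = []
    for i, price in enumerate(prices):
        if price > threshold:
            selected.append(i)
    return selected


def filter_in_batches(prices: List[float], threshold: float,
                      batch_size: int = BATCH_SIZE) -> List[int]:
    """Batch-at-a-time: slice one column into batches and run a tight
    loop over each batch, collecting the indices of matching rows."""
    selected = []
    for start in range(0, len(prices), batch_size):
        batch = prices[start:start + batch_size]  # one batch of one column
        selected.extend(start + j for j, p in enumerate(batch) if p > threshold)
    return selected


def count_batches(num_rows: int, batch_size: int = BATCH_SIZE) -> int:
    """Number of batches needed to cover num_rows (ceiling division)."""
    return (num_rows + batch_size - 1) // batch_size
```

Both functions return the same row indices; only the processing shape differs. Pure Python will not show the speedup. In an engine like Hive, the gain comes from the inner loop running over arrays of primitive values, which is friendlier to CPU caches and to compiler optimizations.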

https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution

I'm not sure about updates and deletes; they might use vectorization for some functions there too.

3 REPLIES

Rising Star

I wanted to understand the concept of vectorization. I got it from the link above. Thanks.

Super Guru

I agree with @Benjamin Leonhardi. Please provide the log file, because vectorization applies to operations such as scans, filters, aggregations, and joins.
