Hive query execution taking longer time

Explorer

My Hive query runs for a long time with 2 mappers and 2 reducers. One of the reducers is taking 1000 tasks and the other has only 1 task. Can you explain why the tasks are not evenly split between the reducers?

1 ACCEPTED SOLUTION

Expert Contributor

Query execution time depends on multiple factors:

1. Mainly the Hive query design: the joins and the columns being pulled.

2. The YARN/Tez container size allocated, which depends on where you are running the query.

3. The queue your job runs in; check whether that queue has free capacity.

To answer your question about why one of the reducers is taking 1000 tasks, please check the value defined for hive.exec.reducers.max.

If you want to experiment with the number of reducers, try changing the value of hive.exec.reducers.bytes.per.reducer (preferably assign a smaller value, as the number of reducers is inversely proportional to this setting).
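
To make the inverse relationship concrete, here is a rough sketch of the arithmetic, assuming the usual estimate of one reducer per hive.exec.reducers.bytes.per.reducer bytes of input, capped at hive.exec.reducers.max. The function name, the 10 GB input size and the cap of 1009 are illustrative assumptions, not values read from your cluster, and the execution engine may adjust the final count further.

```python
def estimate_reducers(input_bytes: int, bytes_per_reducer: int, max_reducers: int) -> int:
    """Hypothetical helper: approximate reducer count for a given input size.

    Assumes one reducer per `bytes_per_reducer` bytes of input (rounded up),
    never fewer than 1 and never more than `max_reducers`.
    """
    if bytes_per_reducer <= 0 or max_reducers <= 0:
        raise ValueError("bytes_per_reducer and max_reducers must be positive")
    # Integer ceiling division: how many reducers the input size asks for.
    wanted = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Apply the lower bound of 1 and the upper cap.
    return max(1, min(max_reducers, wanted))


MB = 1024 ** 2
GB = 1024 ** 3

if __name__ == "__main__":
    input_size = 10 * GB  # assumed input size, for illustration only
    for per_reducer in (1 * GB, 256 * MB, 64 * MB):
        count = estimate_reducers(input_size, per_reducer, max_reducers=1009)
        print(f"bytes.per.reducer = {per_reducer // MB} MB -> about {count} reducers")
```

Under these assumptions, shrinking the bytes-per-reducer value raises the estimated reducer count until the cap is reached. In a Hive session, running SET hive.exec.reducers.max; without a value prints the setting currently in effect, which is a quick way to check it before changing anything.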
