
Hive query execution taking longer time

Contributor

My Hive query has been running for a long time with 2 mappers and 2 reducers. One of the reducers is taking 1000 tasks and the other reducer has only 1 task. Can you explain why the tasks are not evenly split between the reducers?

1 ACCEPTED SOLUTION

Super Collaborator

Query execution time depends on several factors:

1. Mainly the design of the Hive query: the joins and the columns being pulled.

2. The YARN/Tez container size allocated, depending on which engine you are running on.

3. The queue the job runs in; check whether the queue has free capacity.

To answer your question about why one of the reducers is taking 1000 tasks:

Please check the value defined for hive.exec.reducers.max.

If you want to experiment with the number of reducers, try changing the value of hive.exec.reducers.bytes.per.reducer (preferably to a smaller value, as this setting is inversely proportional to the number of reducers).
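
To make the inverse relationship concrete, here is a rough Python sketch of how the reducer count is commonly estimated from those two settings: the input size divided by hive.exec.reducers.bytes.per.reducer, capped at hive.exec.reducers.max. The function name and the numbers are only illustrative assumptions, and the real planner may pick a different count (especially on Tez), so treat this as a back-of-the-envelope check rather than Hive's actual code.

```python
def estimate_reducers(input_bytes: int, bytes_per_reducer: int, max_reducers: int) -> int:
    """Rough estimate only: one reducer per `bytes_per_reducer` of input,
    capped at `max_reducers`. Hypothetical helper, not part of Hive."""
    if bytes_per_reducer <= 0 or max_reducers <= 0:
        raise ValueError("bytes_per_reducer and max_reducers must be positive")
    # Integer ceiling division, so a partial chunk still gets its own reducer.
    wanted = -(-input_bytes // bytes_per_reducer)
    # Always at least one reducer, never more than the configured cap.
    return max(1, min(max_reducers, wanted))


# Illustrative numbers (assumptions): 10 GB of input and a cap of 1009 reducers.
ten_gb = 10_000_000_000
print(estimate_reducers(ten_gb, 256_000_000, 1009))  # larger bytes.per.reducer, fewer reducers
print(estimate_reducers(ten_gb, 64_000_000, 1009))   # smaller bytes.per.reducer, more reducers
```

Both settings can be changed for a session with a SET statement before running the query, for example SET hive.exec.reducers.bytes.per.reducer=<bytes>; where the byte value is whatever suits your data volume.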

