Support Questions

rdevprasad1 · ‎07-27-2017

My Hive query is running for a long time with 2 mappers and 2 reducers. One of the reducers is taking 1000 tasks and the other reducer has only 1 task. Can you explain why the tasks are not evenly split between the reducers?

sreeviswa_athic · ‎07-28-2017

Query execution time depends on several factors:

1. Mainly the design of the Hive query: the joins and the columns being pulled.

2. The YARN/Tez container size allocated, depending on which engine you are running on.

3. The queue the job runs in. Check whether the queue has free capacity.

To answer your question about why one of the reducers is taking 1000 tasks, please check the value defined for hive.exec.reducers.max.
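
As a quick way to do that check, here is a minimal sketch, assuming you are in a Hive CLI or Beeline session: issuing SET with only the property name prints the value currently in effect.

```sql
-- Print the current values (no "=value" means "show", not "change").
SET hive.exec.reducers.max;
SET hive.exec.reducers.bytes.per.reducer;
```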

If you want to experiment with the number of reducers, try changing the value of hive.exec.reducers.bytes.per.reducer (preferably to a smaller value, since this value is inversely proportional to the number of reducers).
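
A sketch of that change for a single session. The 128000000 figure is only an illustrative assumption, not a recommendation, and the formula in the comment is a rough approximation of how Hive estimates the reducer count.

```sql
-- Illustrative value only (about 128 MB per reducer); pick one that suits your data volume.
-- Hive's estimate is roughly: min(hive.exec.reducers.max, input bytes / bytes.per.reducer),
-- so a smaller value here asks for more reducers, up to the hive.exec.reducers.max cap.
SET hive.exec.reducers.bytes.per.reducer=128000000;
```

The setting applies only to the current session, so rerun the query afterwards and compare the reducer task counts.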
