
Hive Tez query is taking longer to run than expected


Hello everyone, we have a couple of queries, simple CREATE TABLE or SELECT statements, that are taking a long time to run.

We set the following parameters:

set hive.compute.query.using.stats=true;

set hive.stats.fetch.column.stats=true;

set hive.stats.fetch.partition.stats=true;

set hive.vectorized.execution.enabled=true;

set hive.vectorized.execution.reduce.enabled=true;

There is still no significant change. First of all, what should I look for, and where should I look in the Tez UI, to figure out what is taking time? When a user says a query is taking longer than expected, what should I check?

Also, I don't understand the EXPLAIN plan.

Please help me understand which parameters or fields to look for.



Can someone please assist here?

Master Guru


Good articles on tuning Hive performance: Hive_performance_tune, Tez_Performance_Tune, ExplainPlan.

This is too broad a question to answer fully, but here are my thoughts:

1. Check whether your Hive job has actually started running in the ResourceManager (and is not waiting in the queue for resources, i.e. sitting in the ACCEPTED state, etc.).

2. Check in HDFS how many files are in the directory the table points to; too many small files will result in poor performance.

3. Try running the Hive console in debug mode to see where the job is spending its time.

4. Check whether there is any skew in the data, and if so, create the table declaring the skewed columns in the table definition (for example, with a SKEWED BY clause).
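
As a rough sketch of how some of these checks might look from the Hive prompt: the table orders, its columns, and the skewed values 1001 and 1002 are made-up names for illustration only, and the statements assume a non-partitioned table. Exact output differs between Hive versions, so treat this as a starting point rather than a recipe.

```sql
-- Hypothetical table and column names throughout; replace with your own.

-- Point 2: shows the table location and, when statistics exist,
-- parameters such as numFiles, numRows, and totalSize.
DESCRIBE FORMATTED orders;

-- The hive.stats.* and hive.compute.query.using.stats settings only help
-- when statistics have actually been gathered for the table.
ANALYZE TABLE orders COMPUTE STATISTICS;
ANALYZE TABLE orders COMPUTE STATISTICS FOR COLUMNS;

-- Print the plan without running the query.
EXPLAIN
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;

-- Point 4: look for a few key values that hold most of the rows.
SELECT customer_id, COUNT(*) AS row_count
FROM orders
GROUP BY customer_id
ORDER BY row_count DESC
LIMIT 20;

-- If a few values dominate, declare them when creating the table.
-- The values 1001 and 1002 are placeholders, not real data.
CREATE TABLE orders_skewed (
  order_id BIGINT,
  customer_id BIGINT,
  amount DOUBLE
)
SKEWED BY (customer_id) ON (1001, 1002)
STORED AS ORC;
```

In the EXPLAIN output on Tez, the parts I would read first are the Vertices and Edges sections (for example Map 1 and Reducer 2, and how they connect), the "Statistics: Num rows" lines on each operator, and whether a vertex reports "Execution mode: vectorized". Row estimates that are far from the real counts usually mean the statistics are missing or stale.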