Hello everyone, we have a couple of queries, simple CREATE TABLE or SELECT statements, that are taking a long time.
We set the parameters below:
set hive.vectorized.execution.enabled =true;
set hive.vectorized.execution.reduce.enabled =true;
Still, there is no significant change. First of all, what should I look for, or where should I look in the Tez UI, to figure out what is taking the time? When a user comes to me saying a query is taking longer than expected, what should I check?
Also, I don't understand the explain plan.
Please help me understand which parameters or fields to look for.
This is too broad a question to answer completely, but here are my thoughts:
1. Check whether your Hive job has actually started running in the ResourceManager (and is not sitting in the queue waiting for resources, i.e. in the ACCEPTED state, etc.).
2. Check in HDFS how many files are in the directory the table points to; too many small files will result in poor performance.
3. Try running the Hive console in debug mode (for example, hive --hiveconf hive.root.logger=DEBUG,console) to see where the job is spending its time.
4. Check whether there is any skew in the data, and if so, create the table with the skewed columns declared in the table definition (the SKEWED BY clause).
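
As a rough sketch of points 2 and 4, the statements below could be run from the Hive console. The table and column names (sales, sales_skewed, customer_id, amount) and the skewed values 101 and 205 are made up for illustration and are not from your environment; substitute your own.

```sql
-- Hypothetical names throughout: sales, sales_skewed, customer_id, amount.

-- Point 2: when table statistics are available, numFiles and totalSize
-- are listed under "Table Parameters" in the output.
DESCRIBE FORMATTED sales;

-- Point 4: list the most frequent key values to spot skew.
SELECT customer_id, COUNT(*) AS row_count
FROM sales
GROUP BY customer_id
ORDER BY row_count DESC
LIMIT 20;

-- Point 4: declare the heavy values found above as skewed.
-- 101 and 205 are placeholder values, not real data.
CREATE TABLE sales_skewed (
  customer_id INT,
  amount      DOUBLE
)
SKEWED BY (customer_id) ON (101, 205)
STORED AS DIRECTORIES;
```

If a handful of values account for most of the rows in the second query, those are the candidates for the SKEWED BY clause. A high numFiles with a small totalSize would suggest the small-files problem from point 2.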