Hive Tez query is taking longer to run than expected

New Contributor

Hello everyone, we have a couple of queries, simple CREATE TABLE or SELECT statements, that are taking a long time to run.

We set the following parameters:

set hive.compute.query.using.stats=true;

set hive.stats.fetch.column.stats=true;

set hive.stats.fetch.partition.stats=true;

set hive.vectorized.execution.enabled=true;

set hive.vectorized.execution.reduce.enabled=true;

Still, there is no significant change. First of all, what should I look for, and where in the Tez UI should I look, to figure out what is taking the time? When a user comes to me saying a query is taking longer than expected, what should I check?

Also, I don't understand the EXPLAIN plan.

Please help me understand which parameters or fields to look for.


2 REPLIES

Re: Hive Tez query is taking longer to run than expected

New Contributor

Can someone please assist here?

Re: Hive Tez query is taking longer to run than expected

Super Guru

@Bharath

Good articles on tuning Hive performance: Hive_performance_tune, Tez_Performance_Tune, ExplainPlan.

This is too broad a question to answer fully, but here are my thoughts:

1. Check whether your Hive job has actually started running in the Resource Manager (and is not waiting in the queue for resources, i.e. in the ACCEPTED state).

2. Check in HDFS how many files there are in the directory the table points to; too many small files will result in poor performance.

3. Try running the Hive console in debug mode to see where the job is spending its time.

4. Check whether there is any skew in the data, and if so create the table with those skewed columns declared in the table definition.
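
To make points 2 to 4 a bit more concrete, here is a rough HiveQL sketch, not a definitive recipe. Every name in it is an assumption made up for illustration: the database `sales_db`, the table `orders`, the columns `order_id`, `customer_id` and `amount`, the skewed values, and the HDFS path are placeholders that do not come from the question. Exact behaviour (for example whether ANALYZE needs a PARTITION clause, or whether `dfs` commands are allowed in your client) depends on your Hive version and configuration.

```sql
-- All database, table, column and path names below are hypothetical placeholders.

-- Point 2: count directories, files and bytes under the table location.
-- A very large file count relative to the total size suggests many small files.
dfs -count /path/to/warehouse/sales_db.db/orders;

-- The stats-related settings from the question only help once statistics
-- have actually been collected for the table and its columns.
ANALYZE TABLE sales_db.orders COMPUTE STATISTICS;
ANALYZE TABLE sales_db.orders COMPUTE STATISTICS FOR COLUMNS;

-- Print the plan for a slow query so its stages and operators can be
-- compared with the vertices shown for the same query in the Tez UI.
EXPLAIN
SELECT customer_id, COUNT(*) AS cnt
FROM sales_db.orders
GROUP BY customer_id;

-- Point 3: debug logging is switched on when the CLI is started, from a shell:
--   hive --hiveconf hive.root.logger=DEBUG,console

-- Point 4: look for skew by listing the most frequent key values.
-- A handful of keys holding most of the rows is the typical sign.
SELECT customer_id, COUNT(*) AS cnt
FROM sales_db.orders
GROUP BY customer_id
ORDER BY cnt DESC
LIMIT 20;

-- If a few values dominate, they can be declared when the table is created.
-- The values 1 and 2 are assumed here purely as an example.
CREATE TABLE sales_db.orders_skewed (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DOUBLE
)
SKEWED BY (customer_id) ON (1, 2);
```

Running the skew check before changing any table definition is the cheaper first step: if the counts are evenly spread, skew is probably not the cause and the file count or queue wait time is the better place to look.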