Performance issue with HIVE

We are using HDP 2.6 and are new to the technology. We have not applied any advanced performance tuning settings. We are running the queries below:

1.

create table customer_partitioned
(id int, name string, email string, state string)
partitioned by (signup date)
clustered by (id) into 2 buckets stored as orc
tblproperties("transactional"="true");  -- 5 seconds

Insert into table customer_partitioned PARTITION(signup)
values (1,'Prathamesh','ph@gmail.com','MAH','2018-05-01');
Insert into table customer_partitioned PARTITION(signup)
values (1,'Anirudh','ad@gmail.com','GOA','2018-05-02');
Insert into table customer_partitioned PARTITION(signup)
values (1,'Sagar','ss@gmail.com','KER','2018-05-03');
Insert into table customer_partitioned PARTITION(signup)
values (1,'Sumeet','sp@gmail.com','PUN','2018-05-04');
Insert into table customer_partitioned PARTITION(signup)
values (1,'Rohit','rs@gmail.com','UP','2018-05-05');

-- It took 7 minutes to insert 5 rows.

2.

UPDATE customer_partitioned SET state = 'RAJ' WHERE customer_partitioned.name = 'Prathamesh';  -- 2 mins

3.

DELETE FROM customer_partitioned WHERE customer_partitioned.name = 'Rohit';  -- 7 mins

Do we need to configure any parameters to make these run faster?

Re: Performance issue with HIVE

Mentor

@Anirudh D

Have a look at the various options available to speed up the execution of your Hive queries: "5 Ways to Make Your Hive Queries Run Faster".

Hope that helps

Re: Performance issue with HIVE

Mentor

@Anirudh D

Any updates? Did your execution speed improve?

Re: Performance issue with HIVE

@Geoffrey Shelton Okot Those parameters were already set appropriately in the global Hive configuration, so the timings I reported were already obtained with those parameters in effect.

Re: Performance issue with HIVE

Mentor

@Anirudh D

By default, Hive uses the MapReduce (mr) execution engine. Did you try using Tez?

set hive.execution.engine=tez;
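
To confirm which engine a session is actually using before and after the switch, here is a minimal sketch. It assumes a Hive CLI or Beeline session and that Tez is installed on the cluster; issuing `set` with a property name and no value prints the current setting.

```sql
-- Print the execution engine currently in effect for this session.
set hive.execution.engine;

-- Switch this session to Tez, then print the value again to confirm the change.
set hive.execution.engine=tez;
set hive.execution.engine;
```

The setting applies only to the current session, so the global configuration is left untouched while you compare timings.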

Re: Performance issue with HIVE

@Geoffrey Shelton Okot

Please find attached the Hive config settings (hive-parameters.zip).

Also, for a volume as small as 5 rows, why would we need to configure all these parameters?

Re: Performance issue with HIVE

Mentor

@Anirudh D

If you are doing complex joins, vectorized query execution improves the performance of operations like scans, aggregations, filters, and joins. For the 5-row test, compare the timings with and without the parameters.

For most of those parameters, you will need to test them and time the executions to find the settings that give the desired performance.
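
One way to run that comparison, as a rough sketch: toggle a single parameter per run and note the elapsed time the client reports after each query. The example uses `hive.vectorized.execution.enabled`, the Hive property that switches vectorization on and off. The SELECT is only an illustrative query against the table from the question, not one taken from the original post.

```sql
-- Run 1: vectorized execution off.
set hive.vectorized.execution.enabled=false;
select state, count(*) from customer_partitioned group by state;

-- Run 2: vectorized execution on, same query, everything else unchanged.
set hive.vectorized.execution.enabled=true;
select state, count(*) from customer_partitioned group by state;
```

Changing one setting at a time keeps the comparison fair. On a 5-row table the difference may well be negligible, because job startup cost is likely to dominate.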

Hope that helps