Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How I can measure reponse time for a query execution using hive as example?

Highlighted

How I can measure reponse time for a query execution using hive as example?

New Contributor

I'm a beginner with HDP and I'm trying to measure the response time when query execution, Are there a build In matrices where I can see how good was the execution of my query?

I'm running my query using Hive View on the Ambari UI.

Thanks

5 REPLIES 5

Re: How I can measure reponse time for a query execution using hive as example?

Super Guru

@Mohammed Syam

1.Click on Tez View underneath Hive View button will display the execution time for job.

46426-tez-view.png

(or)

2.You can view all information about logs and computation time in Resource manager UI

1.Click on Yarn in Ambari UI
2.Go to Quick Links 
3.Click on Resource Manager UI

Search for your application ID then you can get all logs and execution times.

3.You can also get all logs from command line using below command

yarn logs -applicationId <Application ID>

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_yarn-resource-management/content/ch_yarn...

If the Answer helped to resolve your issue, Click on Accept button below to accept the answer, That would be great help to Community users to find solution quickly for these kind of errors.

Re: How I can measure reponse time for a query execution using hive as example?

New Contributor

Thank you @Shu

So I am running a select statment as example but I can't see it in the TEZ view.

I can see only the insert statmtnet I perfoemed earlier, Am I missing something here?


Re: How I can measure reponse time for a query execution using hive as example?

Super Guru

@Mohammed Syam

Select statement is just selecting the data(schema on read) from the HDFS it doesn't perform any map reduce job underneath.

But when you do insert statement it will trigger an Map only job with an application id, when ever an application id is created that means for those jobs we can view from TEZ view (or) from Resource Manager.

Example:-

Insert a record into Table:-

I'm having a text_table with 5 columns in it.

I'm trying to insert a record into the table and an app id application_1508861912312_4126, so when i go to TEZ view (or) Resource Manager search for the applicationid and we can view the application, time taken to complete the application.

hive# insert into text_table values('2','2017-11-10','bar','09:00:00','2017-12-23 09:00:00.0');
INFO  : Tez session hasn't been created yet. Opening session
INFO  : Dag name: insert into text_table values...09:00:00.0')(Stage-1)
INFO  : Status: Running (Executing on YARN cluster with App id application_1508861912312_4126)
INFO  : Map 1: -/-
INFO  : Map 1: 0/1
INFO  : Map 1: 0(+1)/1
INFO  : Map 1: 1/1
INFO  : Loading data to table default.text_table from hdfs:/user/yashu/text_table/.hive-staging_hive_2017-12-23_12-31-19_205_322427679820504794-22055/-ext-10000
INFO  : Table default.text_table stats: [numFiles=4, numRows=4, totalSize=184, rawDataSize=180]
No rows affected (11.679 seconds)

Tezview log:-

46427-tez-view.png

Select from Table:-

when we run select statement there is no application id got created and this job status we cannot view either from tez view (or) resource manager

hive# select * from text_table;
+-----+-------------+-------+-----------+------------------------+--+
| id  |     dt      | name  |    ts     |          dtts          |
+-----+-------------+-------+-----------+------------------------+--+
| 1   | 2017-10-10  | hcc   | 12:00:00  | 2017-10-10 12:00:00.0  |
| 1   | 2017-10-11  | foo   | 12:00:00  | 2017-10-11 12:00:00.0  |
| 1   | 2017-10-12  | foo   | 12:00:00  | 2017-10-12 12:00:00.0  |
| 2   | 2017-11-10  | bar   | 09:00:00  | 2017-12-23 09:00:00.0  |
+-----+-------------+-------+-----------+------------------------+--+

Please refer to below link for more details

https://community.hortonworks.com/questions/141606/hive-queries-use-only-mappers-or-only-reducers.ht...

Re: How I can measure reponse time for a query execution using hive as example?

New Contributor

Thank you @Shu

So if I want to compare the response time for retrieving data (select) between two SQL ON Hadoop engines, Is there a possible way that I can measure that?, How I can compare?

Re: How I can measure reponse time for a query execution using hive as example?

Super Guru

@Mohammed Syam

If you want to compare the response time of select queries

1.By using Ambari Hive View won't display the execution time of the query.

Hive View Execution:-

46432-hive-view.png

Logs Tab Hive View:-

46433-logs-hive-view.png

If you click on Logs tab also there is no logs for the query.

It is impossible compare the time taken for select query to execute, because for select queries there is no map reduce job is going to be initialized, if application id is not created then we cannot compare them programatically.

2.Another way to get at least the execution times for select query is running from hive from command line(not from ambari hive view)

Open Hive from command line and execute the select query.

Once select query finishes at bottom of the results hive shows how many rows has been selected with the response time will be displayed.

Example:-

hive# select * from text_table;
+----------------+----------------+------------------+----------------+------------------------+--+
| text_table.id  | text_table.dt  | text_table.name  | text_table.ts  |    text_table.dtts     |
+----------------+----------------+------------------+----------------+------------------------+--+
| 1              | 2017-10-10     | hcc              | 12:00:00       | 2017-10-10 12:00:00.0  |
| 1              | 2017-10-11     | foo              | 12:00:00       | 2017-10-11 12:00:00.0  |
| 1              | 2017-10-12     | foo              | 12:00:00       | 2017-10-12 12:00:00.0  |
| 2              | 2017-11-10     | bar              | 09:00:00       | 2017-12-23 09:00:00.0  |
+----------------+----------------+------------------+----------------+------------------------+--+
4 rows selected (0.105 seconds)

Scroll down to the last line in the above snippet and you can view that 4 rows and execution time is 0.105 seconds.

If you are running from Hive from command line hive displays how many rows are displayed and what is the execution time for the query as a last line in results.

If you are using Ambari Hive View won't display these stats also.

Only way to compare results of select query is Execute your select from Hive command lines and compare them manually.