Created 12-23-2017 04:39 PM
I'm a beginner with HDP and I'm trying to measure the response time when query execution, Are there a build In matrices where I can see how good was the execution of my query?
I'm running my query using Hive View on the Ambari UI.
Thanks
Created on 12-23-2017 04:56 PM - edited 08-18-2019 02:53 AM
1.Click on Tez View underneath Hive View button will display the execution time for job.
(or)
2.You can view all information about logs and computation time in Resource manager UI
1.Click on Yarn in Ambari UI 2.Go to Quick Links 3.Click on Resource Manager UI
Search for your application ID then you can get all logs and execution times.
3.You can also get all logs from command line using below command
yarn logs -applicationId <Application ID>
If the Answer helped to resolve your issue, Click on Accept button below to accept the answer, That would be great help to Community users to find solution quickly for these kind of errors.
Created 12-23-2017 05:19 PM
So I am running a select statment as example but I can't see it in the TEZ view.
I can see only the insert statmtnet I perfoemed earlier, Am I missing something here?
Created on 12-23-2017 05:45 PM - edited 08-18-2019 02:53 AM
Select statement is just selecting the data(schema on read) from the HDFS it doesn't perform any map reduce job underneath.
But when you do insert statement it will trigger an Map only job with an application id, when ever an application id is created that means for those jobs we can view from TEZ view (or) from Resource Manager.
Example:-
Insert a record into Table:-
I'm having a text_table with 5 columns in it.
I'm trying to insert a record into the table and an app id application_1508861912312_4126, so when i go to TEZ view (or) Resource Manager search for the applicationid and we can view the application, time taken to complete the application.
hive# insert into text_table values('2','2017-11-10','bar','09:00:00','2017-12-23 09:00:00.0'); INFO : Tez session hasn't been created yet. Opening session INFO : Dag name: insert into text_table values...09:00:00.0')(Stage-1) INFO : Status: Running (Executing on YARN cluster with App id application_1508861912312_4126) INFO : Map 1: -/- INFO : Map 1: 0/1 INFO : Map 1: 0(+1)/1 INFO : Map 1: 1/1 INFO : Loading data to table default.text_table from hdfs:/user/yashu/text_table/.hive-staging_hive_2017-12-23_12-31-19_205_322427679820504794-22055/-ext-10000 INFO : Table default.text_table stats: [numFiles=4, numRows=4, totalSize=184, rawDataSize=180] No rows affected (11.679 seconds)
Tezview log:-
Select from Table:-
when we run select statement there is no application id got created and this job status we cannot view either from tez view (or) resource manager
hive# select * from text_table; +-----+-------------+-------+-----------+------------------------+--+ | id | dt | name | ts | dtts | +-----+-------------+-------+-----------+------------------------+--+ | 1 | 2017-10-10 | hcc | 12:00:00 | 2017-10-10 12:00:00.0 | | 1 | 2017-10-11 | foo | 12:00:00 | 2017-10-11 12:00:00.0 | | 1 | 2017-10-12 | foo | 12:00:00 | 2017-10-12 12:00:00.0 | | 2 | 2017-11-10 | bar | 09:00:00 | 2017-12-23 09:00:00.0 | +-----+-------------+-------+-----------+------------------------+--+
Please refer to below link for more details
Created 12-24-2017 03:23 PM
Thank you @Shu
So if I want to compare the response time for retrieving data (select) between two SQL ON Hadoop engines, Is there a possible way that I can measure that?, How I can compare?
Created on 12-24-2017 06:06 PM - edited 08-18-2019 02:53 AM
If you want to compare the response time of select queries
1.By using Ambari Hive View won't display the execution time of the query.
Hive View Execution:-
Logs Tab Hive View:-
If you click on Logs tab also there is no logs for the query.
It is impossible compare the time taken for select query to execute, because for select queries there is no map reduce job is going to be initialized, if application id is not created then we cannot compare them programatically.
2.Another way to get at least the execution times for select query is running from hive from command line(not from ambari hive view)
Open Hive from command line and execute the select query.
Once select query finishes at bottom of the results hive shows how many rows has been selected with the response time will be displayed.
Example:-
hive# select * from text_table; +----------------+----------------+------------------+----------------+------------------------+--+ | text_table.id | text_table.dt | text_table.name | text_table.ts | text_table.dtts | +----------------+----------------+------------------+----------------+------------------------+--+ | 1 | 2017-10-10 | hcc | 12:00:00 | 2017-10-10 12:00:00.0 | | 1 | 2017-10-11 | foo | 12:00:00 | 2017-10-11 12:00:00.0 | | 1 | 2017-10-12 | foo | 12:00:00 | 2017-10-12 12:00:00.0 | | 2 | 2017-11-10 | bar | 09:00:00 | 2017-12-23 09:00:00.0 | +----------------+----------------+------------------+----------------+------------------------+--+ 4 rows selected (0.105 seconds)
Scroll down to the last line in the above snippet and you can view that 4 rows and execution time is 0.105 seconds.
If you are running from Hive from command line hive displays how many rows are displayed and what is the execution time for the query as a last line in results.
If you are using Ambari Hive View won't display these stats also.
Only way to compare results of select query is Execute your select from Hive command lines and compare them manually.