Created on 01-27-2017 11:58 PM
Tez View comes pre-deployed in Ambari as part of Ambari User Views. It provides textual and graphical analysis of Hive queries when Hive uses Tez as the execution engine. Hortonworks Data Platform (HDP) offers two execution engines for Hive:
1) Tez
2) MapReduce
Tez is the default execution engine for Hive, so Tez View provides insight into Hive queries out of the box. This article covers some of the Tez View features that are useful for analyzing and debugging a Hive query.
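As a minimal sketch, the engine for a session can be checked or changed from the Hive prompt through the `hive.execution.engine` property, where `tez` and `mr` correspond to the two engines listed above. Whether a session-level override is permitted depends on how the cluster is configured, so treat this as an illustration rather than a required step.

```sql
-- Print the execution engine configured for the current session
set hive.execution.engine;

-- Explicitly select Tez for this session (already the default on HDP)
set hive.execution.engine=tez;

-- Alternative: run on MapReduce instead; such queries produce no Tez DAG
-- set hive.execution.engine=mr;
```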
The videos below use Ambari 2.4.1.0 with Hortonworks Data Platform 2.5.0. Ambari is pre-loaded with Tez View 0.7.0.2.5.0.0-22.
It all starts with executing a Hive query. This article uses TPC-DS query 98 as the example.
select i_item_desc
      ,i_category
      ,i_class
      ,i_current_price
      ,i_item_id
      ,sum(ss_ext_sales_price) as itemrevenue
      ,sum(ss_ext_sales_price)*100/sum(sum(ss_ext_sales_price)) over (partition by i_class) as revenueratio
from store_sales
    ,item
    ,date_dim
where store_sales.ss_item_sk = item.i_item_sk
  and i_category in ('Jewelry', 'Sports', 'Books')
  and store_sales.ss_sold_date_sk = date_dim.d_date_sk
  and d_date between cast('2001-01-12' as date) and (cast('2001-02-11' as date))
group by i_item_id
        ,i_item_desc
        ,i_category
        ,i_class
        ,i_current_price
order by i_category
        ,i_class
        ,i_item_id
        ,i_item_desc
Query 98 is executed in the Hive View of Ambari:
When the query is executed, Tez creates vertices (mappers and reducers) to produce the results.
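The vertices can also be previewed before running anything by prefixing a query with `explain`. The sketch below uses a simplified, hypothetical join over two of the same TPC-DS tables rather than the full query 98. With Tez as the engine, the plan should list the map and reduce vertices and how they connect, though the exact layout of the output depends on the Hive version and its explain settings.

```sql
-- Simplified stand-in for query 98 (illustration only, not the article's query):
-- join store_sales to item and aggregate revenue per category.
explain
select i_category
      ,sum(ss_ext_sales_price) as itemrevenue
from store_sales
join item
  on store_sales.ss_item_sk = item.i_item_sk
group by i_category;
```

Comparing this plan with the Graphical View in Tez View is one way to relate each vertex back to the part of the SQL it executes.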
Next, we open the Tez View of Ambari. Tez creates DAGs (Directed Acyclic Graphs) for both Hive and Pig jobs. In our case, we choose the DAG for query 98 from the DAG Name column.
The DAG Details tab opens by default and shows statistics such as Application ID (the YARN application in which the Tez job ran), Submitter (the user who executed the query), Status (Failed, Succeeded, or Running), Progress (% complete), Start Time, End Time, and Duration.
Next, we select the Graphical View tab, which draws the DAG. Each green vertex stands for one or more Hive tables. The mappers connected to the tables extract rows from them. Reducers perform table joins and other SQL operations.
Hover over a vertex to view the Tez class for each task. To view the details of a vertex, select it.
Lastly, select the Vertex Swimlane tab, which represents the total runtime of each vertex (mappers and reducers).
As demonstrated above, Tez View can be helpful when analyzing or debugging Hive queries.
Created on 02-03-2017 03:05 AM
Detailed documentation about Tez View and debugging Hive Views is available: