Tez View comes pre-deployed in Ambari as part of Ambari User Views. The view provides textual and graphical analysis of Hive queries when Hive uses Tez as its execution engine. Hortonworks Data Platform (HDP) offers two execution engines for Hive: MapReduce and Tez.
Tez is the default execution engine of Hive, so Tez View provides insight into Hive queries out of the box. This article covers some of the Tez View features that are useful for analyzing or debugging a Hive query.
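To confirm which engine a session is using, Hive's `set` command can print or override the `hive.execution.engine` property. The following is a minimal sketch, assuming a Hive CLI, Beeline, or Hive View session on a cluster where Tez is installed:

```sql
-- Print the current value of the execution engine property.
set hive.execution.engine;

-- Switch this session to Tez. On HDP this is already the default,
-- so the statement is normally only needed after a change to mr.
set hive.execution.engine=tez;
```

Queries run with the engine set to `mr` execute as MapReduce jobs rather than Tez DAGs, so they would not be expected to show up in Tez View.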
The videos below are using Ambari 22.214.171.124 with Hortonworks Data Platform 2.5.0. Ambari is pre-loaded with Tez View 0.7.0.2.5.0.0-22.
Executing a Hive Query
It all starts with executing a Hive query. This article uses TPC-DS query 98 as its example; the lines below are taken from that query:
    ,sum(ss_ext_sales_price) as itemrevenue
    (partition by i_class) as revenueratio
    store_sales.ss_item_sk = item.i_item_sk
    and i_category in ('Jewelry', 'Sports', 'Books')
    and store_sales.ss_sold_date_sk = date_dim.d_date_sk
    and d_date between cast('2001-01-12' as date)
                   and (cast('2001-02-11' as date))
Query 98 is executed from the Hive View in Ambari.
When the query runs, the Tez execution engine creates vertices (mappers and reducers) that process the data and produce the results.
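The vertices Tez plans for a query can also be previewed without running it, by prefixing the statement with Hive's `EXPLAIN` keyword. The sketch below uses a simplified join over the `store_sales` and `item` tables, with column names taken from query 98; it is an illustration under those assumptions, not the full benchmark query.

```sql
-- Show the execution plan without running the query.
-- With Tez as the engine, the plan lists the planned vertices
-- (mappers and reducers) and the edges that connect them.
EXPLAIN
SELECT i.i_class,
       SUM(ss.ss_ext_sales_price) AS itemrevenue
FROM store_sales ss
JOIN item i
  ON ss.ss_item_sk = i.i_item_sk
WHERE i.i_category IN ('Jewelry', 'Sports', 'Books')
GROUP BY i.i_class;
```

Comparing this plan with the graph that Tez View draws for the finished query is one way to relate each vertex back to the part of the SQL it executes.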
Analyzing a Hive Query using Tez View
Next, we open Tez View in Ambari. It lists the DAGs (Directed Acyclic Graphs) that Tez creates for both Hive and Pig jobs. In our case, we choose the DAG for query 98 from the DAG Name column.
The DAG Details tab, which opens by default, shows statistics such as the Application ID (the YARN application in which the Tez job ran), the Submitter (who executed the query), Status (Failed, Succeeded, or Running), Progress (percent complete), Start Time, End Time, and Duration.
Next, we select the Graphical View tab, which draws the DAG. Each green vertex stands for one or more Hive tables. The mappers connected to those tables extract the rows from them. The reducers carry out table joins and other SQL operations.
Hover over a vertex to view the Tez class behind each task. To view the details of a vertex, select it.
Lastly, select the Vertex Swimlane tab, which shows the total runtime of each vertex (mappers and reducers).
As demonstrated above, Tez View can be helpful when analyzing or debugging Hive queries.