Tez View comes pre-deployed in Ambari as part of Ambari User Views. The view provides useful textual and graphical analysis of Hive queries when Hive uses Tez as its execution engine. Hortonworks Data Platform (HDP) offers two execution engines for Hive:

1) Tez

2) MapReduce

Tez is the default execution engine for Hive. Therefore, out of the box, Tez View provides essential insight into Hive queries. This article covers some of the useful Tez View features for analyzing and debugging a Hive query.
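
To confirm which engine a session is using, the `hive.execution.engine` property can be printed or overridden from the Hive prompt. This is a minimal sketch, assuming a session in which you are allowed to change the property:

```sql
-- Print the current value of the property.
set hive.execution.engine;

-- Select Tez explicitly for this session so the query produces a Tez DAG.
set hive.execution.engine=tez;

-- For comparison, MapReduce can be selected with:
-- set hive.execution.engine=mr;
```

Queries run on the MapReduce engine do not produce a Tez DAG, so they would not be expected to appear in Tez View.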

Prerequisites

The videos below use Ambari 2.4.1.0 with Hortonworks Data Platform 2.5.0. Ambari is pre-loaded with Tez View 0.7.0.2.5.0.0-22.

Executing a Hive Query

It all starts with executing a Hive query. This article uses TPC-DS query 98 for the analysis.

select i_item_desc
      ,i_category
      ,i_class
      ,i_current_price
      ,i_item_id
      ,sum(ss_ext_sales_price) as itemrevenue
      ,sum(ss_ext_sales_price)*100/sum(sum(ss_ext_sales_price)) over
          (partition by i_class) as revenueratio
from
      store_sales
      ,item
      ,date_dim
where
      store_sales.ss_item_sk = item.i_item_sk
      and i_category in ('Jewelry', 'Sports', 'Books')
      and store_sales.ss_sold_date_sk = date_dim.d_date_sk
      and d_date between cast('2001-01-12' as date)
                     and (cast('2001-02-11' as date))
group by
      i_item_id
      ,i_item_desc
      ,i_category
      ,i_class
      ,i_current_price
order by
      i_category
      ,i_class
      ,i_item_id
      ,i_item_desc

Query 98 is executed in the Hive View of Ambari:

When the query is executed, Tez, the execution engine, creates vertices (mappers and reducers) to produce the results.
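
The vertices Tez will create can also be inspected before running a query by prefixing it with `explain`. The sketch below uses a shortened, hypothetical query against the same `store_sales` and `item` tables rather than the full query 98. The exact vertex names and counts depend on the plan Hive chooses.

```sql
-- Ask Hive for the execution plan instead of running the query.
-- This shortened query is illustrative only; it is not query 98.
explain
select i_category
      ,sum(ss_ext_sales_price) as itemrevenue
from
      store_sales
      ,item
where
      store_sales.ss_item_sk = item.i_item_sk
group by
      i_category
```

When Tez is the execution engine, the plan output typically includes a Tez stage that lists its vertices (entries named along the lines of Map 1 or Reducer 2) and the edges between them. These correspond to the vertices shown in Tez View.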

Analyzing a Hive Query using Tez View

Then we open the Tez View in Ambari. Tez creates DAGs (Directed Acyclic Graphs) for both Hive and Pig jobs. In our case, we choose the DAG for query 98 from the DAG Name column.

The DAG Details tab, which opens by default, shows many statistics: Application ID (the YARN application in which the Tez job ran), Submitter (who executed the query), Status (Failed, Succeeded, Running), a Progress bar (% of completion), Start Time, End Time, and Duration.

Next we select the Graphical View tab, which draws the DAG. Each green vertex stands for one or more Hive tables. The mappers connected to the tables extract the rows from them. Reducers represent table joins and the other SQL operations in the query.

Hover over a vertex to view the Tez class at each task. To view the details of a vertex, simply select it.

Lastly, select the Vertex Swimlane tab, which represents the total runtime of each vertex (mappers and reducers).

As demonstrated above, Tez View can be helpful when analyzing or debugging Hive queries.

Comments
Detailed documentation about Tez View and debugging Hive Views is available:

Last update: 01-27-2017 11:58 PM