Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Hive/Tez data not loading to table when using union all + lateral view

Highlighted

Hive/Tez data not loading to table when using union all + lateral view

New Contributor

When using union all with lateral view explode, the result data is not loaded to Hive table.

This issue only happens with Tez, not MapReduce.


Not working for Tez

hive> create table t1 as

> SELECT

> key, value

> FROM (

> SELECT MAP('A', '1', 'B', '2') as event

>

> UNION ALL

>

> SELECT MAP('A', '3', 'B', '4') as event

> ) s

> LATERAL VIEW EXPLODE( s.event ) t AS key, value;

Query ID = jinyang_li_20190516053833_43773273-2986-44a3-a827-4a7be1c3c42b

Total jobs = 1

Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1556926264520_11548)


----------------------------------------------------------------------------------------------

VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED

----------------------------------------------------------------------------------------------

Map 1 .......... container SUCCEEDED 1 1 0 0 0 0

Map 3 .......... container SUCCEEDED 1 1 0 0 0 0

----------------------------------------------------------------------------------------------

VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 3.92 s

----------------------------------------------------------------------------------------------

OK

Time taken: 5.905 seconds

hive> select * from t1;

OK

Time taken: 1.495 seconds




Works for MR

hive> set hive.execution.engine=mr;

Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.

hive> create table t2 as

> SELECT

> key, value

> FROM (

> SELECT MAP('A', '1', 'B', '2') as event

>

> UNION ALL

>

> SELECT MAP('A', '3', 'B', '4') as event

> ) s

> LATERAL VIEW EXPLODE( s.event ) t AS key, value;

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.

Query ID = jinyang_li_20190516054003_ea895a58-5ec9-4790-aa0e-c9e1b420bbc1

Total jobs = 3

Launching Job 1 out of 3

Number of reduce tasks is set to 0 since there's no reduce operator

Starting Job = job_1556926264520_11549, Tracking URL = http://ip-172-21-182-163.ec2.internal:20888/proxy/application_1556926264520_11549/

Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1556926264520_11549

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0

2019-05-16 05:40:10,139 Stage-1 map = 0%, reduce = 0%

2019-05-16 05:40:15,411 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.36 sec

MapReduce Total cumulative CPU time: 2 seconds 360 msec

Ended Job = job_1556926264520_11549

Stage-4 is selected by condition resolver.

Stage-3 is filtered out by condition resolver.

Stage-5 is filtered out by condition resolver.

Moving data to directory hdfs://ip-172-21-182-163.ec2.internal:8020/user/hive/warehouse/.hive-staging_hive_2019-05-16_05-40-03_842_1554057741424461900-1/-ext-10001

Moving data to directory hdfs://ip-172-21-182-163.ec2.internal:8020/user/hive/warehouse/t2

MapReduce Jobs Launched:

Stage-Stage-1: Map: 1 Cumulative CPU: 2.36 sec HDFS Read: 335 HDFS Write: 82 SUCCESS

Total MapReduce CPU Time Spent: 2 seconds 360 msec

OK

Time taken: 13.535 seconds

hive> select * from t2;

OK

A 1

B 2

A 3

B 4

Time taken: 0.115 seconds, Fetched: 4 row(s)

1 REPLY 1

Re: Hive/Tez data not loading to table when using union all + lateral view

New Contributor

The query plan shows no Move operator, which should be responsible for loading data to table location.


hive> explain

> create table t3 as

> select

> key, value

> from (

> SELECT MAP('A', '1', 'B', '2') as event

> UNION ALL

> SELECT MAP('A', '3', 'B', '4') as event

> ) s

> LATERAL VIEW EXPLODE( s.event ) t AS key, value;

OK

Plan not optimized by CBO.


Vertex dependency in root stage

Map 1 <- Union 2 (CONTAINS)

Map 3 <- Union 2 (CONTAINS)


Stage-3

Create Table Operator:

name:default.t3

Stage-1

Union 2

<-Map 1 [CONTAINS]

File Output Operator [FS_12]

table:{"name:":"default.t3"}

Select Operator [SEL_11] (rows=4 width=8)

Output:["_col0","_col1"]

Lateral View Join Operator [LVJ_10] (rows=4 width=810)

Output:["_col1","_col2"]

Select Operator [SEL_7] (rows=2 width=1080)

Lateral View Forward [LVF_6] (rows=2 width=540)

Select Operator [SEL_1] (rows=1 width=540)

Output:["_col0"]

TableScan [TS_0] (rows=1 width=1)

File Output Operator [FS_12]

table:{"name:":"default.t3"}

Select Operator [SEL_11] (rows=4 width=8)

Output:["_col0","_col1"]

Lateral View Join Operator [LVJ_10] (rows=4 width=810)

Output:["_col1","_col2"]

UDTF Operator [UDTF_9] (rows=2 width=540)

function name:explode

Select Operator [SEL_8] (rows=2 width=540)

Output:["_col0"]

Please refer to the previous Lateral View Forward [LVF_6]

<-Map 3 [CONTAINS]

File Output Operator [FS_12]

table:{"name:":"default.t3"}

Select Operator [SEL_11] (rows=4 width=8)

Output:["_col0","_col1"]

Lateral View Join Operator [LVJ_10] (rows=4 width=810)

Output:["_col1","_col2"]

Select Operator [SEL_7] (rows=2 width=1080)

Lateral View Forward [LVF_6] (rows=2 width=540)

Select Operator [SEL_3] (rows=1 width=540)

Output:["_col0"]

TableScan [TS_2] (rows=1 width=1)

File Output Operator [FS_12]

table:{"name:":"default.t3"}

Select Operator [SEL_11] (rows=4 width=8)

Output:["_col0","_col1"]

Lateral View Join Operator [LVJ_10] (rows=4 width=810)

Output:["_col1","_col2"]

UDTF Operator [UDTF_9] (rows=2 width=540)

function name:explode

Select Operator [SEL_8] (rows=2 width=540)

Output:["_col0"]

Please refer to the previous Lateral View Forward [LVF_6]

Don't have an account?
Coming from Hortonworks? Activate your account here