Support Questions

Find answers, ask questions, and share your expertise

Hive Explode / Lateral View clarification

avatar
Expert Contributor

a)I am trying to understand below query in link:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView

SELECT pageid, adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;

I know about explode but i did not understand the output of above query.Could anybody elaborate LATERAL view on this.

In manual:-

A lateral view first applies the UDTF to each row of base table and then joins resulting output rows to the input rows to form a virtual table having the supplied table alias.

A lateral view first applies the UDTF to each row of base table means it will return

1

2

3

3

4

5

and then joins resulting output rows to the input rows to form a virtual table having the supplied table alias.? --Not clear

4 REPLIES 4

avatar
Expert Contributor

Any Input on this?

avatar
Super Collaborator

avatar
Expert Contributor

Thanks for input but still having clarification on below point.

A lateral view first applies the UDTF to each row of base table and then joins resulting output rows to the input rows to form a virtual table having the supplied table alias

A lateral view first applies the UDTF to each row of base table means it will return

1

2

3

3

4

5

and then joins resulting output rows to the input rows to form a virtual table having the supplied table alias.? --Not clear How it will Join? How 1 will match with [1,2,3]

source:-

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView

9306-lateral-view.jpg

avatar
Contributor

The right way to think about LATERAL VIEW is that it allows a table-generating function (UDTF) to be treated as a table source, so that it can be used like any other table, including selects, joins and more.

LATERAL VIEW is often used with explode, but explode is just one UDTF of many, a full list is available in the documentation.

To take an example:

select tf1.*, tf2.*
  from (select 0) t
  lateral view explode(map('A',10,'B',20,'C',30)) tf1
  lateral view explode(map('A',10,'B',20,'C',30)) tf2;

This results in:

tf1.key tf1.value tf2.key tf2.value
A 10 A 10
A 10 B 20
A 10 C 30
B 20 A 10

(5 rows were truncated)

The thing to see here is that this query is a cross product join between the tables tf1 and tf2. The LATERAL VIEW syntax allowed me to treat them as tables.

The original question used "AS" syntax, which automatically maps the generated table's columns to column aliases. In my view it is much more powerful to leave them as tables and use their fully qualified table correlation identifiers.

These tables can be used in joins as well:

select tf1.*, tf2.*
  from (select 0) t
  lateral view explode(map('A',10,'B',20,'C',30)) tf1
  lateral view explode(map('A',10,'B',20,'C',30)) tf2 where tf1.key = tf2.key;

Now we get:

tf1.key tf1.value tf2.key tf2.value
A 10 A 10
B 20 B 20
C 30 C 30