Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Hive Explode with array_index with no Cartesian product

Solved Go to solution

Hive Explode with array_index with no Cartesian product

New Contributor

Hi i have a Hive table


Hi i have a Hive table 
select a,b,c,d from riskfactor_table 
In the above table B, C and D columns are array columns. Below is my Hive DDL 
Create external table riskfactor_table 
(a string, 
b array<string>, 
c array<double>, 
d array<double> ) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '~'  
stored as textfile location 'user/riskfactor/data';  


Here is my table data:
ID400S,["jms","jndi","jaxb","jaxn"],[100,200,300,400],[1,2,3,4]
ID200N,["one","two","three"],[212,352,418],[6,10,8]
If i want to split array columns how can i split?
If i use explode function i can split array values for only one column
select explode(b) as b from riskfactor_table;
Output:
jms
jndi
jaxb
jxn
one
two
three
But i want all the columns to be populated using one select statement below-
Query - select a,b,c,d from risk_factor;
Output:
row1- ID400S jms 100 1
row2- ID400S jndi 200 2
row3- ID400S jaxb 300 3
row4- ID400S jaxn 400 4
How can i populate all the data?

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Hive Explode with array_index with no Cartesian product

@Sai Satish you can achieve it in this way

create table aTable(a int,b array<String>);insert into table aTable select 1,array('a','b') from dummyTable;hive> select * from aTable;
OK
1["a","b"]
1["a","b"]
select a,expl_tbl from aTable LATERAL VIEW explode(b) exploded_table as expl_tbl;
Query ID = hive_20160524053317_27cb538e-43e1-436f-8744-ec53a0c9d3b2
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1463989024283_0004)
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 .......... SUCCEEDED 1 1 0 0 0 0
--------------------------------------------------------------------------------
VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 3.25 s
--------------------------------------------------------------------------------
OK
1a
1b
1a
1b
Time taken: 4.765 seconds, Fetched: 4 row(s)

4 REPLIES 4

Re: Hive Explode with array_index with no Cartesian product

@Sai Satish you can achieve it in this way

create table aTable(a int,b array<String>);insert into table aTable select 1,array('a','b') from dummyTable;hive> select * from aTable;
OK
1["a","b"]
1["a","b"]
select a,expl_tbl from aTable LATERAL VIEW explode(b) exploded_table as expl_tbl;
Query ID = hive_20160524053317_27cb538e-43e1-436f-8744-ec53a0c9d3b2
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1463989024283_0004)
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 .......... SUCCEEDED 1 1 0 0 0 0
--------------------------------------------------------------------------------
VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 3.25 s
--------------------------------------------------------------------------------
OK
1a
1b
1a
1b
Time taken: 4.765 seconds, Fetched: 4 row(s)

Re: Hive Explode with array_index with no Cartesian product

New Contributor

Hi,

Thank you for your reply.

I am looking for multiple array columns solution.What if my table contains more than one array column if i use Lateral view explode in my Hive query it results Cartesian product.

If i have

10 array values in my Column1,

10 array values in Column2,

10 array values in Column3

then the result will be 1000 rows

which is wrong.I want the

result output also 10 rows.

I was checking in stack overflow (http://stackoverflow.com/questions/20667473/hive-explode-lateral-view-multiple-arrays)

i got some info using array_index and brickhouse jar we can do it but when i run the similar query in my Hive its not working. I am using Hive version is - 1.2.1

Any help ?

Re: Hive Explode with array_index with no Cartesian product

New Contributor

thanks for reply

Re: Hive Explode with array_index with no Cartesian product

New Contributor

@akshay singh @Sai Satish I am also facing same issue. Were you able to solve this?