Created 05-24-2016 12:26 AM
Hi i have a Hive table
Hi i have a Hive table select a,b,c,d from riskfactor_table In the above table B, C and D columns are array columns. Below is my Hive DDL Create external table riskfactor_table (a string, b array<string>, c array<double>, d array<double> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '~' stored as textfile location 'user/riskfactor/data'; Here is my table data: ID400S,["jms","jndi","jaxb","jaxn"],[100,200,300,400],[1,2,3,4] ID200N,["one","two","three"],[212,352,418],[6,10,8] If i want to split array columns how can i split? If i use explode function i can split array values for only one column select explode(b) as b from riskfactor_table; Output: jms jndi jaxb jxn one two three But i want all the columns to be populated using one select statement below- Query - select a,b,c,d from risk_factor; Output: row1- ID400S jms 100 1 row2- ID400S jndi 200 2 row3- ID400S jaxb 300 3 row4- ID400S jaxn 400 4 How can i populate all the data?
Created 05-24-2016 05:36 AM
@Sai Satish you can achieve it in this way
create table aTable(a int,b array<String>);insert into table aTable select 1,array('a','b') from dummyTable;hive> select * from aTable; OK 1["a","b"] 1["a","b"] select a,expl_tbl from aTable LATERAL VIEW explode(b) exploded_table as expl_tbl; Query ID = hive_20160524053317_27cb538e-43e1-436f-8744-ec53a0c9d3b2 Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1463989024283_0004) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 3.25 s -------------------------------------------------------------------------------- OK 1a 1b 1a 1b Time taken: 4.765 seconds, Fetched: 4 row(s)
Created 05-24-2016 05:36 AM
@Sai Satish you can achieve it in this way
create table aTable(a int,b array<String>);insert into table aTable select 1,array('a','b') from dummyTable;hive> select * from aTable; OK 1["a","b"] 1["a","b"] select a,expl_tbl from aTable LATERAL VIEW explode(b) exploded_table as expl_tbl; Query ID = hive_20160524053317_27cb538e-43e1-436f-8744-ec53a0c9d3b2 Total jobs = 1 Launching Job 1 out of 1 Status: Running (Executing on YARN cluster with App id application_1463989024283_0004) -------------------------------------------------------------------------------- VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED -------------------------------------------------------------------------------- Map 1 .......... SUCCEEDED 1 1 0 0 0 0 -------------------------------------------------------------------------------- VERTICES: 01/01 [==========================>>] 100% ELAPSED TIME: 3.25 s -------------------------------------------------------------------------------- OK 1a 1b 1a 1b Time taken: 4.765 seconds, Fetched: 4 row(s)
Created 05-24-2016 03:05 PM
Hi,
Thank you for your reply.
I am looking for multiple array columns solution.What if my table contains more than one array column if i use Lateral view explode in my Hive query it results Cartesian product.
If i have
10 array values in my Column1,
10 array values in Column2,
10 array values in Column3
then the result will be 1000 rows
which is wrong.I want the
result output also 10 rows.
I was checking in stack overflow (http://stackoverflow.com/questions/20667473/hive-explode-lateral-view-multiple-arrays)
i got some info using array_index and brickhouse jar we can do it but when i run the similar query in my Hive its not working. I am using Hive version is - 1.2.1
Any help ?
Created 12-09-2016 11:54 AM
thanks for reply
Created 07-05-2018 12:31 PM
@akshay singh @Sai Satish I am also facing same issue. Were you able to solve this?