Created 05-22-2017 01:10 PM
I am new to hive and I need some help to transpose concatenated data in columns to rows. My hive table consists of subset of columns which contains concatenated values delimited by pipe or comma. I want to transpose it to rows using explode lateral view.
Input table
column1, column2, column3, column4
id1, Delta|Alpha|Beta, Source|Varied|Volume, AppData
Output table
column1, column2, column3, column4
id1, Delta, Source, AppData
id1, Alpha, Varied, AppData
id1, Beta, Volume, AppData
Created 05-23-2017 05:29 PM
@Dhanya Kumar Heballi Shivamurthy,
assuming you have both arrays with the same size (per record):
select c1, c21, c31, c4 from ( select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume','\\|') c3, 'AppData' c4 ) foo LATERAL VIEW posexplode(c2) n1 as c22, c21 LATERAL VIEW posexplode(c3) n2 as c32, c31 where c22=c32;
If array lengths can be different, then you need to add more conditions:
select c1, c222 c2, c333 c3, c4 from ( select c1, c22, c32, -- keep indices case when c32 < size(c2) then c21 else null end c222, case when c22 < size(c3) then c31 else null end c333, c4 from ( select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume|Owner','\\|') c3, 'AppData' c4 ) foo LATERAL VIEW posexplode(c2) n1 as c22, c21 LATERAL VIEW posexplode(c3) n2 as c32, c31 ) bar where c22=c32 or (c222 is null and c22=0) or (c333 is null and c32=0);
Result:
+------+--------+---------+----------+--+ | c1 | c2 | c3 | c4 | +------+--------+---------+----------+--+ | 100 | Delta | Source | AppData | | 100 | Alpha | Varied | AppData | | 100 | Beta | Volume | AppData | | 100 | Owner | NULL | AppData | +------+--------+---------+----------+--+
Created 05-23-2017 05:29 PM
@Dhanya Kumar Heballi Shivamurthy,
assuming you have both arrays with the same size (per record):
select c1, c21, c31, c4 from ( select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume','\\|') c3, 'AppData' c4 ) foo LATERAL VIEW posexplode(c2) n1 as c22, c21 LATERAL VIEW posexplode(c3) n2 as c32, c31 where c22=c32;
If array lengths can be different, then you need to add more conditions:
select c1, c222 c2, c333 c3, c4 from ( select c1, c22, c32, -- keep indices case when c32 < size(c2) then c21 else null end c222, case when c22 < size(c3) then c31 else null end c333, c4 from ( select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume|Owner','\\|') c3, 'AppData' c4 ) foo LATERAL VIEW posexplode(c2) n1 as c22, c21 LATERAL VIEW posexplode(c3) n2 as c32, c31 ) bar where c22=c32 or (c222 is null and c22=0) or (c333 is null and c32=0);
Result:
+------+--------+---------+----------+--+ | c1 | c2 | c3 | c4 | +------+--------+---------+----------+--+ | 100 | Delta | Source | AppData | | 100 | Alpha | Varied | AppData | | 100 | Beta | Volume | AppData | | 100 | Owner | NULL | AppData | +------+--------+---------+----------+--+
Created 05-25-2017 10:23 AM
@Ed Berezitsky Thank you, this worked.
Created 05-25-2017 01:27 PM
@Dhanya Kumar Heballi Shivamurthy, please accept the answer to close the thread.
Created 08-28-2018 08:35 AM
1)does a cross -join happen between c21 and c32?
2)why did you use select 100?
3)What is owner?