Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Hive transpose concatenated data in columns to rows

avatar
New Contributor

I am new to hive and I need some help to transpose concatenated data in columns to rows. My hive table consists of subset of columns which contains concatenated values delimited by pipe or comma. I want to transpose it to rows using explode lateral view.

Input table

column1, column2, column3, column4

id1, Delta|Alpha|Beta, Source|Varied|Volume, AppData

Output table

column1, column2, column3, column4

id1, Delta, Source, AppData

id1, Alpha, Varied, AppData

id1, Beta, Volume, AppData

1 ACCEPTED SOLUTION

avatar
Super Collaborator

@Dhanya Kumar Heballi Shivamurthy,

assuming you have both arrays with the same size (per record):

select 
  c1, 
  c21,
  c31,
  c4 
from (
    select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1  as c22, c21
LATERAL VIEW posexplode(c3) n2  as c32, c31
where c22=c32;

If array lengths can be different, then you need to add more conditions:

select c1, c222 c2, c333 c3, c4 
from (
select 
  c1, 
  c22, c32, -- keep indices
  case when c32 < size(c2) then c21 else null end c222,
  case when c22 < size(c3) then c31 else null end c333,
  c4 
from (
    select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume|Owner','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1  as c22, c21
LATERAL VIEW posexplode(c3) n2  as c32, c31
) bar
where c22=c32 or (c222 is null and c22=0) or (c333 is null and c32=0);

Result:

+------+--------+---------+----------+--+
|  c1  |   c2   |   c3    |    c4    |
+------+--------+---------+----------+--+
| 100  | Delta  | Source  | AppData  |
| 100  | Alpha  | Varied  | AppData  |
| 100  | Beta   | Volume  | AppData  |
| 100  | Owner  | NULL    | AppData  |
+------+--------+---------+----------+--+

View solution in original post

4 REPLIES 4

avatar
Super Collaborator

@Dhanya Kumar Heballi Shivamurthy,

assuming you have both arrays with the same size (per record):

select 
  c1, 
  c21,
  c31,
  c4 
from (
    select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1  as c22, c21
LATERAL VIEW posexplode(c3) n2  as c32, c31
where c22=c32;

If array lengths can be different, then you need to add more conditions:

select c1, c222 c2, c333 c3, c4 
from (
select 
  c1, 
  c22, c32, -- keep indices
  case when c32 < size(c2) then c21 else null end c222,
  case when c22 < size(c3) then c31 else null end c333,
  c4 
from (
    select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume|Owner','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1  as c22, c21
LATERAL VIEW posexplode(c3) n2  as c32, c31
) bar
where c22=c32 or (c222 is null and c22=0) or (c333 is null and c32=0);

Result:

+------+--------+---------+----------+--+
|  c1  |   c2   |   c3    |    c4    |
+------+--------+---------+----------+--+
| 100  | Delta  | Source  | AppData  |
| 100  | Alpha  | Varied  | AppData  |
| 100  | Beta   | Volume  | AppData  |
| 100  | Owner  | NULL    | AppData  |
+------+--------+---------+----------+--+

avatar
New Contributor

@Ed Berezitsky Thank you, this worked.

avatar
Super Collaborator

@Dhanya Kumar Heballi Shivamurthy, please accept the answer to close the thread.

avatar
New Contributor

1)does a cross -join happen between c21 and c32?

2)why did you use select 100?

3)What is owner?