Support Questions

Find answers, ask questions, and share your expertise

Hive transpose concatenated data in columns to rows

avatar
New Contributor

I am new to hive and I need some help to transpose concatenated data in columns to rows. My hive table consists of subset of columns which contains concatenated values delimited by pipe or comma. I want to transpose it to rows using explode lateral view.

Input table

column1, column2, column3, column4

id1, Delta|Alpha|Beta, Source|Varied|Volume, AppData

Output table

column1, column2, column3, column4

id1, Delta, Source, AppData

id1, Alpha, Varied, AppData

id1, Beta, Volume, AppData

1 ACCEPTED SOLUTION

avatar
Super Collaborator

@Dhanya Kumar Heballi Shivamurthy,

assuming you have both arrays with the same size (per record):

select 
  c1, 
  c21,
  c31,
  c4 
from (
    select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1  as c22, c21
LATERAL VIEW posexplode(c3) n2  as c32, c31
where c22=c32;

If array lengths can be different, then you need to add more conditions:

select c1, c222 c2, c333 c3, c4 
from (
select 
  c1, 
  c22, c32, -- keep indices
  case when c32 < size(c2) then c21 else null end c222,
  case when c22 < size(c3) then c31 else null end c333,
  c4 
from (
    select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume|Owner','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1  as c22, c21
LATERAL VIEW posexplode(c3) n2  as c32, c31
) bar
where c22=c32 or (c222 is null and c22=0) or (c333 is null and c32=0);

Result:

+------+--------+---------+----------+--+
|  c1  |   c2   |   c3    |    c4    |
+------+--------+---------+----------+--+
| 100  | Delta  | Source  | AppData  |
| 100  | Alpha  | Varied  | AppData  |
| 100  | Beta   | Volume  | AppData  |
| 100  | Owner  | NULL    | AppData  |
+------+--------+---------+----------+--+

View solution in original post

4 REPLIES 4

avatar
Super Collaborator

@Dhanya Kumar Heballi Shivamurthy,

assuming you have both arrays with the same size (per record):

select 
  c1, 
  c21,
  c31,
  c4 
from (
    select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1  as c22, c21
LATERAL VIEW posexplode(c3) n2  as c32, c31
where c22=c32;

If array lengths can be different, then you need to add more conditions:

select c1, c222 c2, c333 c3, c4 
from (
select 
  c1, 
  c22, c32, -- keep indices
  case when c32 < size(c2) then c21 else null end c222,
  case when c22 < size(c3) then c31 else null end c333,
  c4 
from (
    select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume|Owner','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1  as c22, c21
LATERAL VIEW posexplode(c3) n2  as c32, c31
) bar
where c22=c32 or (c222 is null and c22=0) or (c333 is null and c32=0);

Result:

+------+--------+---------+----------+--+
|  c1  |   c2   |   c3    |    c4    |
+------+--------+---------+----------+--+
| 100  | Delta  | Source  | AppData  |
| 100  | Alpha  | Varied  | AppData  |
| 100  | Beta   | Volume  | AppData  |
| 100  | Owner  | NULL    | AppData  |
+------+--------+---------+----------+--+

avatar
New Contributor

@Ed Berezitsky Thank you, this worked.

avatar
Super Collaborator

@Dhanya Kumar Heballi Shivamurthy, please accept the answer to close the thread.

avatar
New Contributor

1)does a cross -join happen between c21 and c32?

2)why did you use select 100?

3)What is owner?