Member since
09-25-2018
1
Post
0
Kudos Received
0
Solutions
09-26-2018
08:14 AM
2 Kudos
@Srikanth t
The easiest approach is to use lateral views.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
It allows you to split an array into multiple line.
1. Let's create an array from the items in your column "items" select key, split(items, ',') as valArray
from test
result
+------+---------------------------------------+--+
| key | _c1 |
+------+---------------------------------------+--+
| 22 | ["1001 abc"," 1002 pqr"," 1003 tuv"] |
| 33 | ["1004 def"," 1005 xyz"] |
+------+---------------------------------------+--+
2. Now let's use lateral view to split these items into lines (using "trim" to clean up the space) select key, trim(uniqueVal)
from(
select key, split(items, ',') as valArray
from test ) a lateral view explode(a.valArray) exploded as uniqueVal ;
+------+-----------+--+
| key | _c1 |
+------+-----------+--+
| 22 | 1001 abc |
| 22 | 1002 pqr |
| 22 | 1003 tuv |
| 33 | 1004 def |
| 33 | 1005 xyz |
+------+-----------+--+
3. Finally let's use split again to get separate values. select key, split(trim(uniqueVal), ' ')[0], split(trim(uniqueVal), ' ')[1]
from(
select key, split(items, ',') as valArray
from test
) a lateral view explode(a.valArray) exploded as uniqueVal ;
+------+-------+------+--+
| key | _c1 | _c2 |
+------+-------+------+--+
| 22 | 1001 | abc |
| 22 | 1002 | pqr |
| 22 | 1003 | tuv |
| 33 | 1004 | def |
| 33 | 1005 | xyz |
+------+-------+------+--+ Note : I used the following to create the table create table test (
key string,
value string )
STORED AS ORC ;
INSERT INTO test (key, value )
VALUES (22, '1001 abc, 1002 pqr, 1003 tuv'),
(33, '1004 def, 1005 xyz');
... View more