Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How to merge two rows having same values into single row in HIVE?

avatar

I am having a table called 'test_analysis' in that the values will be like following,

IDNAMEDOBLOCATION
1bob08/10/1985NEW JERSEY
1bob08/10/1985NEW YORK
1bob08/10/1985NORTH CAROLINA
2John26/11/1990OKLAHOMA

i want output like,

IDNAMEDOBLOCATION
1bob08/1/1985NEW JERSEY,NEW YORK,NORTH CAROLINA
2John26/11/1990OKLAHOMA

Please help me to form a hive query to get expected output.

5 REPLIES 5

avatar
New Contributor

Hi. collect_list should be able to achieve what you want:

select id
    ,name
    ,dob
    ,concat_ws(',',collect_list(location)) as location
from test_analysis
group by id
    ,name
    ,dob

I found the solution here: https://stackoverflow.com/questions/44770037/hive-pivot-and-sum

Output should look like this:

1       Bob     08/10/1985      New Jersey,New York,North Carolina
2       John    26/11/1990      Oklahoma

Hope this helps.

avatar
Explorer

Any solution for the below scenario

I have a scenario like below. can anyone please provide some info.

I have the below table. i tried doing this using join with lag and lead function but its not giving the expected result. I'm just not getting any clue how to implement the solution for this.

name date amount
abc 04/06/2018 100
abc 04/06/2018 200
abc 04/13/2018 300
now the output i needed is below.

name date amount
abc 04/06/2018 100
abc 04/06/2018 200
abc 04/13/2018 100
abc 04/13/2018 200
abc 04/13/2018 300

so here 300 is the new value and 100,200 from mon will also shown for tue. Do we have any way to do this in hive. Any help will be greatly appreciated.

avatar
Rising Star

avatar

Hi @johny gate

Below query works but its kind of dirty. Hope it Helps!!

select * from a
left join 
(select*,lag(col3)over (partition by col1 order by col2) as lag_val from a) tblb 
on tbl b.col1=a.col1 and a.col2=tblb.lag_val

avatar
New Contributor

Hello folks,

Please help me with the following query:

There are two tables T1 and T2 find the sum of price if customer buys all the product how much he has to pay after discount.

Table : T1

================================

ProductID | ProductName | Price

--------------------------------

1 | p1 | 1000

2 | p2 | 2000

3 | p3 | 3000

4 | p4 | 4000

5 | p5 | 5000

Table : T2

=======================

ProductID | Disocunt %

-----------------------

1 | 10

2 | 15

3 | 10

4 | 15

5 | 20

,

Hello everyone,

Please help me with the following query in Hive.

There are two tables T1 and T2 find the sum of price if customer buys all the product how much he has to pay after discount.

Table : T1

================================

ProductID | ProductName | Price

--------------------------------

1 | p1 | 1000

2 | p2 | 2000

3 | p3 | 3000

4 | p4 | 4000

5 | p5 | 5000

Table : T2

=======================
ProductID | Disocunt %

-----------------------

1 | 10

2 | 15

3 | 10

4 | 15

5 | 20