Member since
09-14-2016
36
Posts
3
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
2262 | 02-22-2017 10:35 PM |
06-21-2017
06:02 AM
@Sagar Morakhia I realize what I'm after is a row_num on A table. I actually want all from the join tables. but as more joins are added the main join key (e.g. A.key) keeps increasing. whats the format for joining to using unique A.key to others?? I'll try out some variations. Thanks for your help on this..
... View more
06-20-2017
02:45 PM
@Sagar Morakhia That doesn't seem to work. based on doc it shows below, but that also requires a group by. I might be missing something.
Select COUNT(B.b),B.key,ROW_NUMBER() OVER (partition by key) AS row_num from B)where row_num=1
... View more
06-19-2017
04:50 PM
@Sagar Morakhia Yeah that makes sense. The difference being if I want to count some other column in B (e.g. B.b). thats where group by is the only option. Is there a way to get the highest count via row_num vs doing where row_num = 1? most of the time you want to join to a many table, but do something with the repeating column. In our case we want to count by key..
... View more
06-18-2017
01:56 AM
@Sagar Morakhia Thanks for the response. Although that gives me the first matching row from B, what if I want other columns from B. for example B.b column is duplicated per A/B.key. Although with row_num = 1 - I only get the first one right? whats diff from just doing a GROUP BY on key? Is windowing faster? So in this regard windowing is helpful because I can control matching from the outer table (A) to all other child tables? e.g. the C join won't be based on the big table of A&B but instead on A.key. But this is still achievable with subquery and group by? correct me if i'm wrong...
... View more
06-16-2017
09:16 PM
Thanks for the response. I decided to do inner select inside the left join to aggregate the records as needed. Its not clear how the windowing function helps yet (it will help if you can descirbe what you get at each stage using that function) - I will try it out and get back here.
... View more
06-15-2017
04:14 PM
Hi all, We have multiple tables that need to be combined into a single table using left joins. There are many one to many relationships. Naturally - after the first join the subsequent join will produce duplicate rows. The end result is a massive table with mostly duplicates. I understand these can be removed easily in 2 ways. 1. doing a insert overwrite and selecting distinct rows. 2. group by on all final columns. Which of these is the optimal option?
Is there a pattern in Hive that will allow adding in additional tables and removing duplicates per table (instead of all in the end)... Thanks in advance.
... View more
Labels:
- Labels:
-
Apache Hive
02-22-2017
10:35 PM
Unfortunately I'm not able to upgrade. Somewhat of a hack, but i put together an 'expect' script. In case someone else is stuck in similar situation. 🙂 #!/usr/bin/env expect
spawn ambari-server setup-security
sleep 0.5
set timeout 3
expect {
"*Choose one of the following options:*" { send "2\r" }
timeout
}
expect {
-re "Please provide master key for locking the credential store:" { send "dev\r" }
timeout
}
expect {
-re "Re-enter master key:" { send "dev\r" }
timeout
}
expect {
"Do you want to persist master key.*" { send "y\r" }
}
expect eof
This is obviously not great, but does work given no other option ...
... View more
02-22-2017
04:20 AM
Thanks for the response. we are actually stuck on 2.2.x
... View more
02-21-2017
11:10 PM
Has anyone been able to automate ambari-server commands ? e.g. ssl / encryption setup etc.
... View more
02-21-2017
04:48 AM
Nevermind...its only on server 🙂 it does seem to work, but doesn't show it in "op"
... View more