Joining 3 pair-RDDs
- Labels: Apache Spark
Created on ‎08-14-2015 07:29 AM - edited ‎09-16-2022 02:37 AM
Hi,
Using Spark, how can I join 3 pair-RDDs?
I'm able to:
- populate 2 RDDs (A and B)
- identify a common key and create 2 pair-RDDs (A and B)
- perform a join on this key and get a 3rd RDD (C)
- populate a new RDD (D)
- identify a common key and create 2 pair-RDDs again (C and D)
- perform a join on this key and get a 5th RDD (E)
So, to get an RDD joining the 3 files, I have to perform 2 joins. Is there a way to join all 3 in a single step?
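The steps above can be sketched roughly as follows. This is only an illustration: the object name, the local master setting, and the sample data are assumptions, and the three small in-memory RDDs stand in for the 3 files.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// On Spark versions before 1.3, also: import org.apache.spark.SparkContext._

object ChainedJoins {
  def main(args: Array[String]): Unit = {
    // "local[*]" and the app name are placeholders for illustration
    val conf = new SparkConf().setAppName("chained-joins").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical pair-RDDs keyed on the common key
      val a = sc.parallelize(Seq((1, "a1"), (2, "a2")))
      val b = sc.parallelize(Seq((1, "b1"), (2, "b2")))
      val d = sc.parallelize(Seq((1, "d1"), (3, "d3")))

      // First join: RDD[(Int, (String, String))]
      val c = a.join(b)
      // Second join: RDD[(Int, ((String, String), String))]
      val e = c.join(d)
      // Flatten the nested tuple into (key, (a, b, d))
      val flat = e.mapValues { case ((av, bv), dv) => (av, bv, dv) }

      // Only key 1 is present in all three inputs
      flat.collect().foreach(println) // prints (1,(a1,b1,d1))
    } finally {
      sc.stop()
    }
  }
}
```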
Thanks 🙂
Greg.
Created ‎09-01-2015 12:04 AM
Created ‎08-26-2015 03:04 AM
How about using cogroup?
Spark's cogroup can work on 3 RDDs at once.
Below is the Scala cogroup signature I have checked. It says that it can combine this RDD with two other RDDs, other1 and other2, at the same time.
def cogroup[W1, W2](other1: RDD[(K, W1)], other2: RDD[(K, W2)]): RDD[(K, (Seq[V], Seq[W1], Seq[W2]))]
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
I cannot try this on Spark myself as I do not have a setup at the office; otherwise I would love to.
After cogroup, you can apply mapValues and merge the three sequences.
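A minimal sketch of that idea, assuming the same kind of small in-memory pair-RDDs (the object name, master setting, and data are made up for illustration). Two caveats: in Spark 1.0 and later the grouped values are typed as Iterable rather than the Seq shown in the signature above, and this sketch uses flatMapValues instead of mapValues so that it emits one row per combination of values, the way join does.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// On Spark versions before 1.3, also: import org.apache.spark.SparkContext._

object CogroupThree {
  def main(args: Array[String]): Unit = {
    // "local[*]" and the app name are placeholders for illustration
    val conf = new SparkConf().setAppName("cogroup-three").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical pair-RDDs sharing the same key type
      val a = sc.parallelize(Seq((1, "a1"), (2, "a2")))
      val b = sc.parallelize(Seq((1, "b1"), (2, "b2")))
      val d = sc.parallelize(Seq((1, "d1"), (3, "d3")))

      // One cogroup over all three inputs:
      // RDD[(Int, (Iterable[String], Iterable[String], Iterable[String]))]
      val grouped = a.cogroup(b, d)

      // Inner-join semantics: every combination of values per key.
      // A key missing from any input has an empty group and yields no rows.
      val joined = grouped.flatMapValues { case (avs, bvs, dvs) =>
        for (av <- avs; bv <- bvs; dv <- dvs) yield (av, bv, dv)
      }

      joined.collect().foreach(println) // prints (1,(a1,b1,d1))
    } finally {
      sc.stop()
    }
  }
}
```

Keys 2 and 3 still appear in the cogroup output, just with one or more empty groups, which is what makes other join types possible from the same result.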
Thank You.
Created ‎08-26-2015 04:25 AM
Hello,
Thanks for your reply, this is a very interesting functionality you have pointed out!
I will have a look at this and check whether it also works for more complex joins (like outer joins).
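For reference, a rough sketch of how a left outer join might be built on cogroup: an empty group on the right-hand side is turned into a single None instead of dropping the key. The helper orNone, the object name, and the sample data are hypothetical and not part of the Spark API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// On Spark versions before 1.3, also: import org.apache.spark.SparkContext._

object CogroupLeftOuter {
  // Hypothetical helper: an empty group becomes a single None,
  // otherwise each value is wrapped in an Option
  def orNone[T](vs: Iterable[T]): Iterable[Option[T]] =
    if (vs.isEmpty) Iterable(None) else vs.map(v => Option(v))

  def main(args: Array[String]): Unit = {
    // "local[*]" and the app name are placeholders for illustration
    val conf = new SparkConf().setAppName("cogroup-left-outer").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val a = sc.parallelize(Seq((1, "a1"), (2, "a2")))
      val b = sc.parallelize(Seq((1, "b1"), (2, "b2")))
      val d = sc.parallelize(Seq((1, "d1"), (3, "d3")))

      // Keep every row of a; values from b and d are optional
      val leftOuter = a.cogroup(b, d).flatMapValues { case (avs, bvs, dvs) =>
        for (av <- avs; bv <- orNone(bvs); dv <- orNone(dvs)) yield (av, bv, dv)
      }

      // Expected rows (order may vary):
      // (1,(a1,Some(b1),Some(d1)))
      // (2,(a2,Some(b2),None))
      leftOuter.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Applying orNone to avs as well would give a full outer join, with key 3 included and the value from a wrapped in an Option.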
Greg.
