Support Questions
Find answers, ask questions, and share your expertise

How to map a .csv file using combineByKey and then reduce it using reduceByKey - Spark Scala

Hi experts,

Imagine that I have this example stored on HDFS in a .CSV file:

Stock_ID  Sales_ID
1         A
2         A
3         C
3         D
4         A

I want to map the rows using combineByKey to list the elements, and after that I want to reduce them so I can get the RDD that I expect. I only have this line:

val textFile = sc.textFile("/input/transactions.csv")

How can I map and reduce it using Scala in Spark?

Many thanks!!!
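A minimal sketch of the map step followed by reduceByKey, under two assumptions: the columns are comma-separated, and a local SparkContext with an in-memory copy of the sample rows stands in for the HDFS path from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReduceByKeyExample {
  def main(args: Array[String]): Unit = {
    // Local master so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("ReduceByKeyExample").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Stand-in for sc.textFile("/input/transactions.csv"); assumes comma-separated columns.
    val textFile = sc.parallelize(Seq("1,A", "2,A", "3,C", "3,D", "4,A"))

    // Map each row to a (Stock_ID, Sales_ID) pair.
    val pairs = textFile.map { line =>
      val cols = line.split(",")
      (cols(0), cols(1))
    }

    // reduceByKey concatenates the Sales_IDs that share a Stock_ID.
    // Note: the merge order of values for a key is not guaranteed.
    val reduced = pairs.reduceByKey((a, b) => a + "," + b)

    reduced.collect().sortBy(_._1).foreach(println)
    sc.stop()
  }
}
```

This collapses the two rows for Stock_ID 3 into a single "C,D"-style value while the other keys keep their single Sales_ID.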


What RDD do you expect?

Like this: {Stock_ID} -> {Sales_ID}:
{1} -> A
{2} -> A
{3} -> C,D
{4} -> A
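One way to get exactly that RDD is combineByKey, sketched below. As above, a local master and an in-memory copy of the sample rows are assumptions standing in for the HDFS file.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CombineByKeyExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("CombineByKeyExample").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Stand-in for the /input/transactions.csv rows; assumes comma-separated columns.
    val rows = sc.parallelize(Seq("1,A", "2,A", "3,C", "3,D", "4,A"))
    val pairs = rows.map { line =>
      val cols = line.split(",")
      (cols(0), cols(1))
    }

    // combineByKey builds a List of Sales_IDs per Stock_ID:
    //   createCombiner: start a list from the first value seen for a key
    //   mergeValue:     add another value seen in the same partition
    //   mergeCombiners: merge lists built on different partitions
    val grouped = pairs.combineByKey(
      (v: String) => List(v),
      (acc: List[String], v: String) => v :: acc,
      (a: List[String], b: List[String]) => a ++ b
    )

    // {1} -> A, {2} -> A, {3} -> C,D, {4} -> A (list order is not guaranteed,
    // so the values are sorted before printing)
    grouped.mapValues(_.sorted.mkString(",")).collect().sortBy(_._1).foreach(println)
    sc.stop()
  }
}
```

reduceByKey on concatenated strings gives the same printed result here, but combineByKey keeps the per-key values as a real List[String], which is more useful if you need to process the Sales_IDs further.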