Load MySQL table into RDD

Expert Contributor

How can I load a complete table into an RDD using Spark?

1 ACCEPTED SOLUTION

Super Collaborator

Spark provides a JdbcRDD class (org.apache.spark.rdd.JdbcRDD). Its constructor is:

new JdbcRDD(sc: SparkContext, getConnection: () ⇒ Connection, sql: String, lowerBound: Long, upperBound: Long, numPartitions: Int, mapRow: (ResultSet) ⇒ T = JdbcRDD.resultSetToObjectArray)(implicit arg0: ClassTag[T])
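
A minimal sketch of how the JdbcRDD constructor quoted above might be used to read a whole table. The JDBC URL, credentials, table name (my_table), column names (id, name), and the bound values are all hypothetical placeholders, not details from the question. It also assumes the table has a numeric key column to partition on, and that the MySQL JDBC driver is on the classpath.

```scala
import java.sql.{DriverManager, ResultSet}

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.JdbcRDD

object LoadMySqlTable {
  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("LoadMySqlTable"))

    // Hypothetical connection details; replace with your own.
    val url = "jdbc:mysql://localhost:3306/mydb"
    val user = "user"
    val password = "password"

    val rows = new JdbcRDD(
      sc,
      // Called on the executors, once per partition, to open a connection.
      () => DriverManager.getConnection(url, user, password),
      // The query needs two ? placeholders; Spark fills them with
      // each partition's lower and upper bound.
      "SELECT id, name FROM my_table WHERE id >= ? AND id <= ?",
      lowerBound = 1L,        // assumed smallest id in the table
      upperBound = 1000000L,  // assumed largest id in the table
      numPartitions = 10,
      // Convert each JDBC row into a (Long, String) tuple.
      mapRow = (rs: ResultSet) => (rs.getLong("id"), rs.getString("name"))
    )

    println(s"Loaded ${rows.count()} rows")
    sc.stop()
  }
}
```

Both bounds are inclusive, so to load the complete table they should span the minimum and maximum values of the key column. Rows whose key falls outside that range are not read.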


2 REPLIES


I'm not aware of a direct connector to MySQL. You could use Sqoop to ingest the contents of your table into HDFS, then use the SparkContext's textFile() method to load it as an RDD.