Load MySQL table into RDD

Rising Star

How can I load a complete MySQL table into an RDD using Spark?

1 ACCEPTED SOLUTION

Expert Contributor

There is a JdbcRDD for this; its constructor signature is:

new JdbcRDD(sc: SparkContext, getConnection: () ⇒ Connection, sql: String, lowerBound: Long, upperBound: Long, numPartitions: Int, mapRow: (ResultSet) ⇒ T = JdbcRDD.resultSetToObjectArray)(implicit arg0: ClassTag[T])
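For example, something along these lines works from spark-shell (where sc already exists); the JDBC URL, credentials, table, columns, and id bounds below are placeholders, and the MySQL Connector/J jar has to be on the classpath:

import java.sql.{DriverManager, ResultSet}
import org.apache.spark.rdd.JdbcRDD

// The query must contain exactly two '?' placeholders; JdbcRDD binds them to the
// lower/upper bound of each partition's key range so the table is read in parallel.
val rows = new JdbcRDD(
  sc,
  () => {
    // Placeholder driver class, URL, and credentials.
    Class.forName("com.mysql.jdbc.Driver")
    DriverManager.getConnection("jdbc:mysql://dbhost:3306/mydb", "user", "password")
  },
  "SELECT id, name FROM my_table WHERE id >= ? AND id <= ?",
  1L,       // lowerBound of the id range
  100000L,  // upperBound of the id range
  4,        // numPartitions
  (rs: ResultSet) => (rs.getInt("id"), rs.getString("name"))
)

rows.take(10).foreach(println)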


2 REPLIES

I'm not aware of a direct Spark connector for MySQL. You could use Sqoop to ingest the contents of your table into HDFS and then use the SparkContext's textFile() method to load it as an RDD.
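As a rough sketch of that route (run from spark-shell, where sc is provided; the paths, connection details, and delimiter below are assumptions, not something from this thread):

// Assumes the table was first imported into HDFS with something like:
//   sqoop import --connect jdbc:mysql://dbhost:3306/mydb --table my_table --target-dir /data/my_table
// Sqoop writes comma-delimited text files by default.
val lines = sc.textFile("hdfs:///data/my_table")
val rows  = lines.map(_.split(","))
rows.take(5).foreach(fields => println(fields.mkString(" | ")))

This gives you an RDD of string arrays rather than typed rows, so parsing individual columns into proper types is up to you.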
