Created 02-08-2017 02:59 PM
How can I load a complete table into an RDD using Spark?
Created 02-08-2017 04:48 PM
There is a JdbcRDD class whose constructor takes a connection factory and a query:
new JdbcRDD(sc: SparkContext, getConnection: () ⇒ Connection, sql: String, lowerBound: Long, upperBound: Long, numPartitions: Int, mapRow: (ResultSet) ⇒ T = JdbcRDD.resultSetToObjectArray)(implicit arg0: ClassTag[T])
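A minimal sketch of how that constructor might be used to pull a whole table, assuming a MySQL JDBC driver on the classpath and a numeric key column. The host `dbhost`, database `mydb`, table `mytable`, columns `id` and `name`, the credentials, and the bounds are all hypothetical placeholders. JdbcRDD expects the query to contain exactly two `?` placeholders, which it fills with the inclusive lower and upper bound of each partition, so the bounds should span the table's smallest and largest key to cover every row.

```scala
import java.sql.{DriverManager, ResultSet}

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.JdbcRDD

object LoadTableWithJdbcRDD {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("LoadTableWithJdbcRDD"))

    // Hypothetical connection details; replace with your own.
    val url = "jdbc:mysql://dbhost:3306/mydb"
    val user = "user"
    val password = "password"

    val rows = new JdbcRDD(
      sc,
      // Called on each executor to open its own connection.
      () => DriverManager.getConnection(url, user, password),
      // The two '?' are replaced by each partition's inclusive bounds.
      "SELECT id, name FROM mytable WHERE id >= ? AND id <= ?",
      1L,       // lowerBound: assumed smallest id in the table
      1000000L, // upperBound: assumed largest id in the table
      4,        // numPartitions: the id range is split into 4 slices
      // Convert each JDBC row into a (Long, String) tuple.
      (rs: ResultSet) => (rs.getLong("id"), rs.getString("name"))
    )

    rows.take(5).foreach(println)
    sc.stop()
  }
}
```

If the key range is not known in advance, one option is to run a `SELECT MIN(id), MAX(id)` query on the driver first and pass the results as the bounds.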
Created 02-08-2017 04:06 PM
I'm not aware of a direct connector to MySQL. You could use Sqoop to ingest the contents of your table into HDFS, then use the SparkContext's textFile() method to load it as an RDD.
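A rough sketch of the Spark side of that approach, assuming Sqoop has already written the table as comma-delimited text files (its default text format) under an HDFS directory. The path `/user/hypothetical/mytable` is a made-up placeholder, and the naive split below does not handle field values that themselves contain commas.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LoadSqoopOutput {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("LoadSqoopOutput"))

    // Hypothetical HDFS directory produced by an earlier Sqoop import.
    val lines = sc.textFile("/user/hypothetical/mytable")

    // Split each line into its fields; limit -1 keeps trailing empty fields.
    val rows = lines.map(_.split(",", -1))

    println(s"Loaded ${rows.count()} rows")
    sc.stop()
  }
}
```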