Q: I am trying to create a DataFrame from a text file, which gives me the error: "value toDF is not a member of org.apache.spark.rdd.RDD"
The only solution I can find online is to import SQLContext.implicits._, which in turn throws "not found: value SQLContext"
I googled this new error but couldn't find anything. The odd part is that the same code works in the Spark shell, but fails when I build it with sbt package.
I am using Cloudera's QuickStart VM, with Spark 1.3.0 and Scala 2.10.4.
A: In this case, two things need to be done:
1- Import implicits:
Note that this should be done only after an instance of org.apache.spark.sql.SQLContext is created. It should be written as:

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._
2- Move case class outside of the method:
The case class that defines the schema of the DataFrame should be declared outside the method that needs it. You can read more about it here:
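Putting the two steps together, a minimal standalone app might look like the sketch below (the Person case class, file path, and app name are made up for illustration):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The case class must live at the top level, outside any method,
// so Spark can derive the DataFrame schema from it via reflection.
case class Person(name: String, age: Int)

object ToDFExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("toDF-example"))
    // Create the SQLContext first, then import its implicits;
    // the import is what brings toDF into scope on RDDs.
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._

    val people = sc.textFile("people.txt")          // hypothetical input file
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))
    val df = people.toDF()                          // works now
    df.show()
    sc.stop()
  }
}
```

If the case class were nested inside main, the schema inference would fail even with the implicits imported, which is why step 2 matters.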
In the Spark shell, these packages are imported by default at startup (the shell also creates sc and sqlContext for you).
But when building with sbt, the packages are not imported by Spark automatically; they need to be imported, and declared as dependencies, explicitly.
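For sbt package to compile the imports above, the project also needs the Spark SQL artifact on its classpath. A minimal build.sbt sketch, using the Spark and Scala versions from the question (project name is made up):

```scala
name := "my-spark-app"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // "provided" because the cluster supplies these jars at runtime
  "org.apache.spark" %% "spark-core" % "1.3.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.3.0" % "provided"
)
```

Forgetting spark-sql here is a common cause of the "not found: value SQLContext" error at compile time.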