Spark/Scala Error: value toDF is not a member of org.apache.spark.rdd.RDD

by Community Manager ‎03-29-2016 09:17 AM - edited ‎09-27-2016 09:19 AM

Q: I am trying to create a DataFrame from a text file, but I get the error: "value toDF is not a member of org.apache.spark.rdd.RDD"


The only solution I can find online is to import SQLContext.implicits._, which in turn throws "not found: value SQLContext".


I googled this new error but couldn't find anything. The funny part is that the piece of code I am using works in spark-shell, but fails when I try to build it using sbt package.

I am using Cloudera's QuickStart VM; my Spark version is 1.3.0 and my Scala version is 2.10.4.



A: In this case, two things need to be done:


1- Import implicits:

      Note that this must be done only after an instance of org.apache.spark.sql.SQLContext has been created. It should be written as:

      val sqlContext = new org.apache.spark.sql.SQLContext(sc)
      import sqlContext.implicits._



2- Move the case class outside of the method:

      The case class, by means of which you define the schema of the DataFrame, should be declared outside of the method that needs it (for example, at the top level of the file or inside a companion object).
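Putting both fixes together, a minimal sbt-built application might look like the sketch below (class, object, and file names are illustrative, not from the original post; versions as in the question, Spark 1.3 / Scala 2.10):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The case class lives at the top level, not inside main(),
// so that Spark can generate the type information it needs
// for schema inference.
case class Person(name: String, age: Int)

object ToDFExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ToDFExample")
    val sc = new SparkContext(conf)

    // Create the SQLContext first, then import its implicits --
    // this import is what brings toDF into scope on RDDs.
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._

    // people.txt is a hypothetical comma-separated file: name,age
    val people = sc.textFile("people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))
      .toDF()

    people.printSchema()
    sc.stop()
  }
}
```

Note that the import is on the sqlContext *instance*, not on the SQLContext class, which is why "import SQLContext.implicits._" fails with "not found: value SQLContext".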




by Aditya
on ‎08-18-2016 09:03 PM

One comment


When the Spark shell starts up, it creates a SparkContext (sc) and a SQLContext (sqlContext) for you and imports sqlContext.implicits._ by default.

But when building with sbt, Spark does not import these packages by default; they need to be imported explicitly.
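Concretely, the shell's startup is roughly equivalent to the following lines, which an sbt-built application must include itself (shown for Spark 1.x; sc and sqlContext are the names the shell provides, and conf stands for whatever SparkConf the application builds):

```scala
// What spark-shell (1.x) effectively does at startup,
// and what an sbt-built application must write explicitly:
val sc = new org.apache.spark.SparkContext(conf)          // shell: provided as sc
val sqlContext = new org.apache.spark.sql.SQLContext(sc)  // shell: provided as sqlContext
import sqlContext.implicits._                             // shell: imported automatically
```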

Disclaimer: The information contained in this article was generated by third parties and not by Cloudera or its personnel. Cloudera cannot guarantee its accuracy or efficacy. Cloudera disclaims all warranties of any kind, and users of this information assume all risk associated with it and with following the advice or directions contained herein. By visiting this page, you agree to be bound by the Terms and Conditions of Site Usage, including all disclaimers and limitations contained therein.