Community Articles
Find and share helpful community-sourced technical articles
Q: I am trying to create a DataFrame from a text file, but I get the error: "value toDF is not a member of org.apache.spark.rdd.RDD"

The only solution I can find online is to import SQLContext.implicits._, which in turn throws "not found: value SQLContext".

I googled this new error but couldn't find anything. The odd part is that the same piece of code works in spark-shell, but fails when I build it with sbt package.

I am using Cloudera's QuickStart VM with Spark 1.3.0 and Scala 2.10.4.

 

 

A: In this case, two things need to be done:

 

1- Import implicits:

      Note that this should be done only after an instance of org.apache.spark.sql.SQLContext is created. It should be written as:

      val sqlContext = new org.apache.spark.sql.SQLContext(sc)
      import sqlContext.implicits._

 

 

2- Move the case class outside of the method:

      The case class that defines the schema of the DataFrame should be declared outside of the method that uses it (i.e., at the top level of the file). This is a known Scala limitation; you can read more about it here:

      https://issues.scala-lang.org/browse/SI-6649
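Putting both fixes together, a standalone application might look like the minimal sketch below. The file path, the `Person` fields, and the object name are hypothetical; the `SQLContext`/`toDF` API shown is the Spark 1.3-era one from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The case class lives at the top level, NOT inside main(),
// so the compiler can generate the implicits that toDF needs.
case class Person(name: String, age: Int)

object MakeDataFrame {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MakeDataFrame")
    val sc = new SparkContext(conf)

    // Create the SQLContext first...
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    // ...and only then import its implicits, which bring toDF into scope.
    import sqlContext.implicits._

    // people.txt is a hypothetical file with lines like "Alice,30"
    val df = sc.textFile("people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))
      .toDF()

    df.show()
    sc.stop()
  }
}
```

Built with sbt package and submitted with spark-submit, this compiles because `Person` is top-level and `sqlContext.implicits._` is imported after `sqlContext` exists.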

 

 

 

Comments
One comment

 

In spark-shell, Spark imports the following packages by default at startup:

scala.Predef._

org.apache.spark.SparkContext._

sqlContext.implicits._

sqlContext.sql

org.apache.spark.sql.functions._

But when building with sbt, Spark does not import these packages by default; they need to be imported explicitly.
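For example, a standalone sbt-built application would declare the equivalent imports explicitly. This is a sketch of the relevant fragment only, assuming a SparkContext named sc has already been created inside your application's main method; note that sqlContext.implicits._ and sqlContext.sql can only be imported after the sqlContext value exists:

```scala
import org.apache.spark.SparkContext._   // RDD pair operations and implicits
import org.apache.spark.sql.functions._  // DataFrame column functions

// scala.Predef._ is imported automatically by the Scala compiler itself,
// so only the Spark-specific imports need to be added by hand.

// Inside main(), after creating the SparkContext sc:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._            // brings toDF into scope
import sqlContext.sql                    // enables sql("SELECT ...")
```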

Version history: Revision 5 of 5. Last update: 09-27-2016 09:19 AM.