
saveAsOrcFile is not a member of org.apache.spark.sql.DataFrame

Explorer

When I try Tutorial "A Lap around Apache Spark 1.3.1 with HDP 2.3" from sandbox, I encountered the problem:

scala> peopleSchemaRDD.saveAsOrcFile("people.orc")

<console>:41: error: value saveAsOrcFile is not a member of org.apache.spark.sql.DataFrame
       peopleSchemaRDD.saveAsOrcFile("people.orc")
                       ^

1 ACCEPTED SOLUTION

Accepted Solutions

@wei yang Are you actually running Spark 1.3.1, or just following the content of the tutorial? ORC support was added in Spark 1.4 (http://hortonworks.com/blog/bringing-orc-support-into-apache-spark/).

Try using the following command

myDataFrame.write.format("orc").save("some_name")
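A minimal sketch of the Spark 1.4+ DataFrameWriter API the answer refers to. It assumes a spark-shell session where `sc` is the existing SparkContext; in Spark 1.4/1.5, ORC support requires a HiveContext, and the input file `people.json` is an illustrative assumption, not from the original thread.

```scala
// Sketch for Spark 1.4+ in spark-shell (sc already defined).
// ORC in Spark 1.4/1.5 needs HiveContext, not plain SQLContext.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
val people = hiveContext.read.json("people.json")  // illustrative input

// The DataFrameWriter call that replaces the removed saveAsOrcFile:
people.write.format("orc").save("people.orc")

// Reading the ORC data back uses the matching DataFrameReader:
val loaded = hiveContext.read.format("orc").load("people.orc")
```

The `write`/`read` pair is the general data-source API, so the same pattern works for `parquet`, `json`, and other formats by changing the `format(...)` string.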


4 REPLIES

Are you sure that you are using the new sandbox and that the Spark version is actually 1.3.1 or higher? It sounds like an error you would get in Spark 1.2.


Explorer

I'm using Spark 1.4.1, and the command peopleSchemaRDD.write.format("orc").save("people.orc") works!

Thank you very much!

Mentor
sc.parallelize(records).toDF().write.format("orc").save("people")

That method was refactored; there's a new way of writing ORC files. Convert your RDD to a DataFrame with toDF() and then write it out as above. Try to use a later version of Spark.
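The one-liner above can be expanded into a self-contained sketch. This assumes Spark 1.4+ in spark-shell with `sc` defined; the `Person` case class (which gives `toDF()` its schema via the implicits import) and the sample records are illustrative assumptions.

```scala
// Sketch, Spark 1.4+: a case class supplies the schema for toDF().
// Person and the sample records are illustrative, not from the tutorial.
import org.apache.spark.sql.hive.HiveContext

case class Person(name: String, age: Int)

val hiveContext = new HiveContext(sc)
import hiveContext.implicits._  // brings toDF() into scope for RDDs

val records = Seq(Person("Ann", 32), Person("Bob", 41))
sc.parallelize(records).toDF().write.format("orc").save("people")
```

Without the `import hiveContext.implicits._` line, `toDF()` is not available on the RDD, which is a common stumbling block when porting older SchemaRDD code.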

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_spark-guide/content/ch_orc-spark.html