
saveAsOrcFile is not a member of org.apache.spark.sql.DataFrame

Contributor

When I try the tutorial "A Lap around Apache Spark 1.3.1 with HDP 2.3" on the sandbox, I run into this problem:

scala> peopleSchemaRDD.saveAsOrcFile("people.orc")

<console>:41: error: value saveAsOrcFile is not a member of org.apache.spark.sql.DataFrame
              peopleSchemaRDD.saveAsOrcFile("people.orc")
                              ^

1 ACCEPTED SOLUTION


@wei yang Are you actually running Spark 1.3.1, or just following the content of the tutorial? ORC support was added in Spark 1.4 (http://hortonworks.com/blog/bringing-orc-support-into-apache-spark/).

Try using the following command:

myDataFrame.write.format("orc").save("some_name")
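
For context, here is a minimal sketch of the tutorial's flow on Spark 1.4+, assuming the sandbox's HiveContext (ORC support in these releases comes through the Hive module); the people.txt path, the Person case class, and the output name are illustrative, not taken from the original thread:

// Illustrative input: a people.txt with "name,age" lines on the sandbox
case class Person(name: String, age: Int)

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext.implicits._

val peopleDF = sc.textFile("people.txt")
  .map(_.split(","))
  .map(p => Person(p(0), p(1).trim.toInt))
  .toDF()

// The DataFrameWriter API replaces the old saveAsOrcFile call
peopleDF.write.format("orc").save("people.orc")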


4 REPLIES

Master Guru

Are you sure you are using the new sandbox and that the Spark version is actually 1.3.1 or higher? It sounds like an error you would get in Spark 1.2.
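
One quick way to check: in the spark-shell, sc.version returns the version string of the running SparkContext.

scala> sc.version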


Contributor

I'm using Spark 1.4.1, and the command peopleSchemaRDD.write.format("orc").save("people.orc") works!

Thank you very much!

Master Mentor
sc.parallelize(records).toDF().write.format("orc").save("people")

That method was refactored; there is a new way of writing ORC files. Convert your RDD to a DataFrame with toDF() and then write it out as above. Try to use a later version of Spark.

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_spark-guide/content/ch_orc-spark.html
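
To verify the result, a small sketch that reads the ORC output back through the DataFrameReader API; the hiveContext and the "people" path follow the example above and are assumptions, not part of the original reply:

// Load the ORC directory written above back into a DataFrame and inspect it
val peopleOrc = hiveContext.read.format("orc").load("people")
peopleOrc.show()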