Created 05-31-2016 07:50 PM
I've done so with sqlContext.sql("create table ...") followed by sqlContext.sql("insert into ..."), but dataframe.write.orc produces an ORC file that cannot be read as a Hive table.
What are all the ways to work with ORC from Spark?
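For reference, a minimal sketch of the two approaches described above, assuming a Hive-enabled sqlContext; the table and path names are made up:

// Approach 1: pure SQL through the HiveContext. The table is
// registered in the Hive metastore, so Hive can see it.
sqlContext.sql("CREATE TABLE orc_table (id INT, name STRING) STORED AS ORC")
sqlContext.sql("INSERT INTO TABLE orc_table SELECT id, name FROM source_table")

// Approach 2: DataFrame API. This only writes ORC files to a path;
// nothing is registered in the metastore, so Hive does not see a table.
val df = sqlContext.table("source_table")
df.write.orc("/tmp/orc_output")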
Created 05-31-2016 07:56 PM
Did you try this syntax?
import org.apache.spark.sql.SaveMode

// select from an existing Hive table; sql() already returns a DataFrame,
// so an extra toDF() call is not needed
val dfTable = objHiveContext.sql("select * from sample")
// saveAsTable registers the table in the metastore, so Hive can see it
dfTable.write.format("orc").mode(SaveMode.Overwrite).saveAsTable("db1.test1")
Created 05-31-2016 08:21 PM
@Jitendra Yadav that worked for me in Zeppelin, and the data looks good.
Created 06-08-2016 08:49 AM
Answer to your question "What are all the ways to work with ORC from Spark?":
I am using spark-sql and have created ORC tables as well as tables in other formats without any issue.
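For illustration, a minimal sketch of that usage through a HiveContext (the table and column names are made up; the same DDL works in the spark-sql shell):

// ORC-backed table
sqlContext.sql("CREATE TABLE logs_orc (ts BIGINT, msg STRING) STORED AS ORC")
// the same DDL with a different STORED AS clause gives another format
sqlContext.sql("CREATE TABLE logs_parquet (ts BIGINT, msg STRING) STORED AS PARQUET")
// both tables live in the metastore and are queryable from Hive and Spark
sqlContext.sql("SELECT COUNT(*) FROM logs_orc").show()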
Created on 06-15-2016 07:46 AM - edited 08-19-2019 04:11 AM
The Optimized Row Columnar (ORC) file format provides a highly efficient way to store Hive data.
It stores groups of rows, called stripes, along with auxiliary information in a file footer. ORC itself is just a storage format; it is not tied to Hive or Spark.
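Since ORC is only a file layout, Spark can read ORC files straight from a path with no metastore involved. A minimal sketch, assuming a HiveContext (the ORC data source in Spark 1.x needs one) and a hypothetical path:

// read ORC files directly from a directory; no Hive table required
val df = sqlContext.read.format("orc").load("/tmp/orc_output")
df.printSchema()  // the schema is recovered from the ORC file footers
df.show(5)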