
Appending to a Hive table gives an error but overwriting works. Why? Error: org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed


When I do

dfTrimmed.write.mode("overwrite").saveAsTable("table")

it works, but this:

dfTrimmed.write.mode("append").saveAsTable("table")

gives an error:

org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed

I am not sure why this is. I am using Spark 1.6.

I am inserting into a Hive table, and my DataFrame was created through a HiveContext.

Thank you

3 REPLIES

Super Guru

@elliot gimple

Hive is not like a traditional RDBMS with regard to DML operations, because Hive stores its data as files on HDFS. Keep in mind that each partition has its own file(s), each bucket adds another file, and so on. When you perform a DML action against a row, you effectively rewrite a whole file rather than appending to one. This is how HDFS was architected, for good reasons.
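To illustrate the file-level granularity described above, here is a minimal sketch (the table and column names `sales`, `staging_sales`, `dt`, `id`, `amount` are hypothetical, not from the question):

```scala
// Hypothetical example: a Hive table `sales` partitioned by `dt`.
// INSERT OVERWRITE on a partition rewrites that partition's HDFS
// files wholesale; there is no row-level, in-place update.
sqlContext.sql("""
  INSERT OVERWRITE TABLE sales PARTITION (dt = '2016-01-01')
  SELECT id, amount FROM staging_sales WHERE dt = '2016-01-01'
""")
```

Even a change to a single row in that partition would be carried out by rewriting the partition's files.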


Expert Contributor

It should work like the following.

scala> Seq((1, 2)).toDF("i", "j").write.mode("overwrite").saveAsTable("t1")
scala> Seq((3, 4)).toDF("j", "i").write.mode("append").saveAsTable("t1")
scala> sql("select * from t1").show
+---+---+
|  i|  j|
+---+---+
|  1|  2|
|  4|  3|
+---+---+

According to your error message, the existing table `table` is not a Hive table. You may have registered a temporary table with that name earlier, e.g. via `registerTempTable` (renamed `createOrReplaceTempView` in Spark 2.0). To create a real table initially, use `saveAsTable` or `sql("CREATE TABLE ...")` instead.
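A minimal sketch of how this error can arise and how to avoid it (this reuses the table name from the question but is otherwise a guess at the asker's setup, written against the Spark 1.6 HiveContext API):

```scala
// Hypothetical reproduction: a temporary table registered from a
// DataFrame is RDD-based, so appending to it is not allowed.
val df = sqlContext.createDataFrame(Seq((1, "a"))).toDF("id", "name")
df.registerTempTable("table")  // "table" now resolves to an in-memory table

// This would throw org.apache.spark.sql.AnalysisException:
// "Inserting into an RDD-based table is not allowed."
// df.write.mode("append").saveAsTable("table")

// Fix: create a real Hive table first; appending to it then works.
df.write.mode("overwrite").saveAsTable("table")
df.write.mode("append").saveAsTable("table")
```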