Created 03-06-2017 09:48 PM
When I do
dfTrimmed.write.mode("overwrite").saveAsTable("table")
it works, but
dfTrimmed.write.mode("append").saveAsTable("table")
gives this error:
org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed
I am not sure why this happens. I am using Spark 1.6, I am inserting into a Hive table, and my DataFrame was created through a HiveContext.
Thank you
Created 03-08-2017 07:00 PM
Hive is not like a traditional RDBMS with regard to DML operations, because of how Hive leverages HDFS to store data in files. Keep in mind that each partition has its own files, each bucket adds another file, and so on. When you perform a DML action against a row, you effectively rewrite a file rather than appending to it. This is how HDFS was architected, for good reasons.
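As a rough sketch of what this means for your append (assuming a Spark 1.6 shell where `sc` and a HiveContext named `hiveContext` exist, and a table name `demo` chosen just for illustration), an append to a Hive-backed table lands as new part-files in the table's directory rather than modifying the existing files:

```scala
// Sketch, assuming a Spark 1.6 shell with `sc` and `hiveContext` in scope.
import hiveContext.implicits._

// The first write creates the table directory with an initial set of part-files.
sc.parallelize(Seq((1, "a"))).toDF("id", "v")
  .write.mode("overwrite").saveAsTable("demo")

// An append does not modify those files; it adds new part-files alongside
// them, which is all that HDFS's write-once file model allows.
sc.parallelize(Seq((2, "b"))).toDF("id", "v")
  .write.mode("append").saveAsTable("demo")

// Listing the warehouse directory would now show both sets of part-files,
// e.g.: hadoop fs -ls /user/hive/warehouse/demo
```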
Created 03-09-2017 01:17 PM
This doc might help you:
https://forums.databricks.com/questions/7599/create-a-in-memory-table-in-spark-and-insert-data.html
Created 03-09-2017 06:34 PM
It should work like the following.
scala> Seq((1, 2)).toDF("i", "j").write.mode("overwrite").saveAsTable("t1")
scala> Seq((3, 4)).toDF("j", "i").write.mode("append").saveAsTable("t1")
scala> sql("select * from t1").show
+---+---+
|  i|  j|
+---+---+
|  1|  2|
|  4|  3|
+---+---+
According to your error message, the existing table `table` is not a Hive table. Maybe you created a table with that name earlier by using `registerTempTable` (called `registerTempView` in Spark 2.x). To create the table initially, use `saveAsTable` or `sql("CREATE TABLE ...")` instead.
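A minimal way to check this theory in a 1.6 shell (the DataFrame contents and table name are illustrative): registering a temp table and then appending to it reproduces the AnalysisException, while creating the table with `saveAsTable` first makes the append succeed.

```scala
// Sketch for Spark 1.6, assuming `hiveContext` is available in the shell.
import hiveContext.implicits._

val df = Seq((1, 2)).toDF("i", "j")

// This creates an in-memory (RDD-based) temp table, not a Hive table...
df.registerTempTable("table")

// ...so appending to it fails:
// Seq((3, 4)).toDF("i", "j").write.mode("append").saveAsTable("table")
// org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed

// Creating the table via saveAsTable instead makes it a real Hive table,
// and subsequent appends work.
hiveContext.dropTempTable("table")
df.write.mode("overwrite").saveAsTable("table")
Seq((3, 4)).toDF("i", "j").write.mode("append").saveAsTable("table")
```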