Created 03-06-2017 09:48 PM
When I do
dfTrimmed.write.mode("overwrite").saveAsTable("table")
it works, but
dfTrimmed.write.mode("append").saveAsTable("table")
gives this error:
org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed
I am not sure why this happens. I am using Spark 1.6, I am inserting into a Hive table, and my DataFrame was created through a HiveContext.
Thank you
Created 03-08-2017 07:00 PM
Hive is not like a traditional RDBMS with regard to DML operations, because of how Hive leverages HDFS to store data in files. Keep in mind that each partition has its own files, each bucket adds another file, and so on. When you perform a DML action against a row, you effectively rewrite a file rather than appending to it. This is how HDFS was architected, for good reasons.
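As a rough sketch of what this means for your append (assuming a Spark 1.6 shell where `sc` and a HiveContext named `hiveContext` exist, and a table name `demo` chosen just for illustration), an append to a Hive-backed table lands as new part-files in the table's directory rather than modifying the existing files:

```scala
// Sketch, assuming a Spark 1.6 shell with `sc` and `hiveContext` in scope.
import hiveContext.implicits._

// The first write creates the table directory with an initial set of part-files.
sc.parallelize(Seq((1, "a"))).toDF("id", "v")
  .write.mode("overwrite").saveAsTable("demo")

// An append does not modify those files; it adds new part-files alongside
// them, which is all that HDFS's write-once file model allows.
sc.parallelize(Seq((2, "b"))).toDF("id", "v")
  .write.mode("append").saveAsTable("demo")

// Listing the warehouse directory would now show both sets of part-files,
// e.g.: hadoop fs -ls /user/hive/warehouse/demo
```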
Created 03-09-2017 01:17 PM
This doc might help you:
https://forums.databricks.com/questions/7599/create-a-in-memory-table-in-spark-and-insert-data.html
Created 03-09-2017 06:34 PM
It should work like the following.
scala> Seq((1, 2)).toDF("i", "j").write.mode("overwrite").saveAsTable("t1")
scala> Seq((3, 4)).toDF("j", "i").write.mode("append").saveAsTable("t1")
scala> sql("select * from t1").show
+---+---+
|  i|  j|
+---+---+
|  1|  2|
|  4|  3|
+---+---+
According to your error message, the existing table `table` is not a Hive table. Maybe you created a table with that name earlier by using `registerTempTable` (called `registerTempView` in Spark 2.x). To create the table initially, use `saveAsTable` or `sql("CREATE TABLE ...")` instead.
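A minimal way to check this theory in a 1.6 shell (the DataFrame contents and table name are illustrative): registering a temp table and then appending to it reproduces the AnalysisException, while creating the table with `saveAsTable` first makes the append succeed.

```scala
// Sketch for Spark 1.6, assuming `hiveContext` is available in the shell.
import hiveContext.implicits._

val df = Seq((1, 2)).toDF("i", "j")

// This creates an in-memory (RDD-based) temp table, not a Hive table...
df.registerTempTable("table")

// ...so appending to it fails:
// Seq((3, 4)).toDF("i", "j").write.mode("append").saveAsTable("table")
// org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed

// Creating the table via saveAsTable instead makes it a real Hive table,
// and subsequent appends work.
hiveContext.dropTempTable("table")
df.write.mode("overwrite").saveAsTable("table")
Seq((3, 4)).toDF("i", "j").write.mode("append").saveAsTable("table")
```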