New Contributor
Posts: 4
Registered: ‎11-05-2017

Inserting data from a dataframe to an existing Hive Table- append mode

Hi Everyone,

 

I have a basic question about inserting data from a DataFrame into an existing Hive table.

I am using the following in PySpark, which always adds the new data to the table (this works fine for my requirement):

df.write.insertInto(table)

But according to the Spark docs, I should use the command:

df.write.mode("append").insertInto("table")

Is it necessary to use mode("append")?

Posts: 390
Topics: 11
Kudos: 60
Solutions: 35
Registered: ‎09-02-2016

Re: Inserting data from a dataframe to an existing Hive Table- append mode


@gaurav796

 

The difference is:

insertInto: inserts the contents of the DataFrame into the specified table, optionally overwriting any existing data

 

mode comes with additional options, like:

mode("append"): appends the contents of this DataFrame to the existing data
mode("overwrite"): overwrites the existing data

Note: I didn't get a chance to explore this before replying.

New Contributor
Posts: 4
Registered: ‎11-05-2017

Re: Inserting data from a dataframe to an existing Hive Table- append mode

Thank you for your response. But when I use only insertInto(table), it always inserts the new data into the table without deleting or overwriting anything, which seemed strange to me. That's why I asked.

Maybe using only insertInto appends by default?
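
For anyone landing on this thread, here is a minimal sketch of the two behaviors discussed above. As far as I know, PySpark's insertInto appends unless an overwrite is explicitly requested, which matches the observation above, so mode("append") mainly makes the intent explicit. The helper names append_rows and replace_rows are hypothetical, as is the table name in the usage comment. The code assumes df is a PySpark DataFrame and that the target table already exists.

```python
def append_rows(df, table_name):
    """Append df's rows to an existing table, stating the mode explicitly."""
    # insertInto resolves columns by position, not by name, so the column
    # order of df has to match the column order of the table.
    df.write.mode("append").insertInto(table_name)


def replace_rows(df, table_name):
    """Insert df's rows and ask Spark to overwrite the existing data."""
    # The overwrite flag is a parameter of PySpark's insertInto itself.
    df.write.insertInto(table_name, overwrite=True)


# Usage (assumes a SparkSession with Hive support and an existing table;
# "mydb.events" is a made-up name):
#   append_rows(df, "mydb.events")
#   replace_rows(df, "mydb.events")
```

Passing overwrite=True to insertInto is used here instead of mode("overwrite") because, to my knowledge, older PySpark 2.x releases reset the mode inside insertInto, so a preceding mode("overwrite") call may be ignored there. It is worth checking against the version you run.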
