Inserting data from a DataFrame into an existing Hive table - append mode
Labels: Apache Hive, Apache Spark
Created on ‎11-05-2017 07:57 AM - edited ‎09-16-2022 05:29 AM
Hi Everyone,
I have a basic question about inserting data from a DataFrame into an existing Hive table.
In PySpark I am using the following, which always adds the new data to the table (this works fine as per my requirement):
df.write.insertInto(table)
However, the Spark docs say I should use this command instead:
df.write.mode("append").insertInto("table")
Is it necessary to use mode("append")?
Created on ‎11-06-2017 07:38 AM - edited ‎11-06-2017 07:40 AM
The difference is:
insertInto: overwrites any existing data.
mode() comes with additional options, such as:
mode("append"): appends the contents of this DataFrame to the existing data.
mode("overwrite"): overwrites the existing data.
Note: I did not get a chance to verify this before replying.
Created ‎11-06-2017 04:05 PM
Thank you for your response. However, when I use only insertInto(table), it always inserts the new data into the table
without deleting or overwriting anything, which seemed strange to me. That's why I asked.
Maybe insertInto on its own appends by default?
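For anyone reading along, here is a minimal sketch of how the two behaviours can be spelled out so the intent does not depend on a default. The helper names `append_to_table` and `overwrite_table` are hypothetical and not part of Spark. As far as the PySpark documentation describes it, the `overwrite` argument of `insertInto` is disabled by default, which would match the appending behaviour reported above, but this is worth checking against the Spark version in use.

```python
def append_to_table(df, table_name):
    """Append the rows of `df` (a pyspark.sql.DataFrame) to an existing table.

    The mode is stated explicitly so the call reads the same way
    regardless of what the default happens to be.
    """
    # insertInto resolves columns by position, not by name, so the
    # DataFrame's column order should match the table's column order.
    df.write.mode("append").insertInto(table_name)


def overwrite_table(df, table_name):
    """Replace the existing rows of the table with the rows of `df`."""
    # PySpark's insertInto accepts its own `overwrite` flag.
    df.write.insertInto(table_name, overwrite=True)
```

With a DataFrame `df` and an existing table, `append_to_table(df, "table")` would be the explicit form of the call in the question.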
Created ‎08-08-2018 11:35 PM
I don't think you can alter the existing table, as the database is immutable.
Created ‎09-20-2021 04:30 AM
To append data frames in R, use the rbind() function. rbind() is a built-in R function that combines several vectors, matrices, and/or data frames by rows.
When it comes to combining data frames, rbind() and cbind() both come to mind: rbind() concatenates data frames vertically (by rows), while cbind() concatenates them horizontally (by columns). For appending rows, rbind() is the relevant one.
