Spark read and overwrite Hive table
- Labels: Apache Hive, Apache Spark
Created 11-09-2017 06:53 PM
Hi Team,
I have a requirement to read an existing Hive table, massage a few columns, and overwrite the same Hive table with the result.
Below is the code:
lp=hc.sql('select * from logistics_prd')
adt=hc.sql('select * from senty_audit.maintable')
cmb_data=adt.unionAll(lp)
cdc_data=cmb_data.distinct()
cdc_data.write.mode('overwrite').saveAsTable('senty_audit.temptable')
In the second step I am reading senty_audit.maintable from Hive. Then I combine it (unionAll) with the other dataframe, and in the last step I try to load the result back (overwrite) into the same Hive table.
In this case Spark throws the error 'org.apache.spark.sql.AnalysisException: Cannot insert overwrite into table that is also being read from'.
Can you please help me understand how I should proceed in this scenario?
Created 11-20-2017 06:25 AM
You need to save the new data to a temp table first, and then read from that temp table and overwrite the Hive table. Spark refuses to overwrite a table while it is still part of the read lineage, so the staging table breaks that cycle.
cdc_data.write.mode("overwrite").saveAsTable("temp_table")
Then you can overwrite the rows in your target table (Scala shown here):
val dy = sqlContext.table("temp_table")
dy.write.mode("overwrite").insertInto("senty_audit.temptable")
Created 12-02-2017 02:32 AM
@Viswa - Did this help?
