
Spark read and overwrite Hive table

Super Collaborator

Hi Team,

I have a requirement to read an existing Hive table, massage a few columns, and overwrite the same Hive table with the result.

Below is the code:

# read the source table and the audit main table (hc is a HiveContext)
lp = hc.sql('select * from logistics_prd')
adt = hc.sql('select * from senty_audit.maintable')
# combine the two dataframes and drop duplicates
cmb_data = adt.unionAll(lp)
cdc_data = cmb_data.distinct()
# overwrite the target Hive table
cdc_data.write.mode('overwrite').saveAsTable('senty_audit.temptable')

In step 2 I am reading senty_audit.maintable from Hive. Then I am combining it with the other dataframe (unionAll + distinct), and in the last step I am trying to load (overwrite) the result back into the same Hive table.

In this case Spark throws the error 'org.apache.spark.sql.AnalysisException: Cannot insert overwrite into table that is also being read from'.

Can you please help me understand how I should proceed in this scenario?

2 REPLIES


You need to save the new data to a temporary table first, then read from that table and overwrite the target Hive table.

cdc_data.write.mode("overwrite").saveAsTable("temp_table")

Then you can overwrite the rows in your target table:

// read the staged data back from the temp table
val dy = sqlContext.table("temp_table")
// overwrite the rows in the target Hive table
dy.write.mode("overwrite").insertInto("senty_audit.temptable")
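
Since the original code is in PySpark, here is a minimal sketch of the same workaround using the HiveContext (hc) and the cdc_data dataframe from the question; the staging table name temp_table and the variable dy simply mirror the Scala snippet above:

# write the combined data to a staging table first
cdc_data.write.mode('overwrite').saveAsTable('temp_table')
# re-read the staged data so the target table is no longer part of the read path
dy = hc.table('temp_table')
# now the overwrite of the target Hive table is allowed
dy.write.insertInto('senty_audit.temptable', overwrite=True)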


@Viswa - Did this help?