Support Questions

Find answers, ask questions, and share your expertise

Spark: read and overwrite a Hive table

Super Collaborator

Hi Team,

I have a requirement to read an existing Hive table, transform a few columns, and overwrite the same Hive table.

Below is the code:

# hc is a HiveContext
lp = hc.sql('select * from logistics_prd')
adt = hc.sql('select * from senty_audit.maintable')
cmb_data = adt.unionAll(lp)
cdc_data = cmb_data.distinct()
cdc_data.write.mode('overwrite').saveAsTable('senty_audit.temptable')

In step 2 I am reading senty_audit.maintable from Hive. Then I union it with the other DataFrame and deduplicate, and in the last step I try to load (OVERWRITE) back into the same Hive table.

In this case Spark throws the error 'org.apache.spark.sql.AnalysisException: Cannot insert overwrite into table that is also being read from'.

Can you please help me understand how I should proceed in this scenario?

2 REPLIES


You need to save the new data to a temp table first, then read from that temp table and overwrite the original Hive table.

cdc_data.write.mode("overwrite").saveAsTable("temp_table")

Then you can overwrite the rows in your target table:

dy = sqlContext.table("temp_table")
dy.write.mode("overwrite").insertInto("senty_audit.temptable")


@Viswa - Did this help?