Spark read and overwrite hive table

Expert Contributor

Hi Team,

I have a requirement to read an existing Hive table, massage a few columns, and overwrite the same Hive table with the result.

Below is the code

# Read the source table and the existing audit table
lp = hc.sql('select * from logistics_prd')
adt = hc.sql('select * from senty_audit.maintable')

# Combine the two dataframes and remove duplicate rows
cmb_data = adt.unionAll(lp)
cdc_data = cmb_data.distinct()

# Write the result out, overwriting the target table
cdc_data.write.mode('overwrite').saveAsTable('senty_audit.temptable')

In the second step I am reading senty_audit.maintable from Hive. I then union it with the other dataframe, and in the last step I try to load the result back (overwrite) into the same Hive table.

In this case Spark throws the error: org.apache.spark.sql.AnalysisException: Cannot insert overwrite into table that is also being read from.

Can you please help me understand how I should proceed in this scenario?
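
If I understand correctly, the dataframe is evaluated lazily, so when the write actually runs the query plan still contains a scan of the table being overwritten, and Spark's analyzer rejects the plan. A minimal sketch of the failing pattern, using a hypothetical table name and the same HiveContext hc:

# Minimal sketch of the failing pattern; 'some_db.some_table' is a
# hypothetical name. Because the dataframe is lazy, at write time the
# plan still reads the table it is about to overwrite, and Spark raises
# org.apache.spark.sql.AnalysisException.
df = hc.sql('select * from some_db.some_table')
df.write.insertInto('some_db.some_table', overwrite=True)  # fails here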

2 Replies

Re: Spark read and overwrite hive table

You need to save the new data to a temporary table first, then read it back from that table and overwrite the target Hive table.

cdc_data.write.mode("overwrite").saveAsTable("temp_table")

Then you can overwrite the rows in your target table:

// Read the staged data back and overwrite the target Hive table
val dy = sqlContext.table("temp_table")
dy.write.mode("overwrite").insertInto("senty_audit.temptable")
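
For completeness, here is the same two-step approach in PySpark, to match the HiveContext (hc) used in the original post. This is only a sketch: 'senty_audit.stage_table' is a placeholder staging table name, and it assumes the real target to overwrite is senty_audit.maintable, as described in the question.

# Sketch of the two-step workaround in PySpark.
# 'senty_audit.stage_table' is a placeholder name, not from the original post.
lp = hc.sql('select * from logistics_prd')
adt = hc.sql('select * from senty_audit.maintable')
cdc_data = adt.unionAll(lp).distinct()

# Step 1: materialize the combined result into a separate staging table.
cdc_data.write.mode('overwrite').saveAsTable('senty_audit.stage_table')

# Step 2: re-read the staging table; this plan no longer scans the target,
# so overwriting the target table is allowed.
staged = hc.table('senty_audit.stage_table')
staged.write.insertInto('senty_audit.maintable', overwrite=True)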

Re: Spark read and overwrite hive table

@Viswa - Did this help?
