Insert overwrite within the same table in Spark

Hi All/ @Shu,


In my project, duplicate records are occasionally created when records are saved. To remove them, we wrote the queries below.

Step 1:

create view db1.temp_no_duplicates as select distinct * from db2.main_table_with_duplicates;

This creates a temp table (here, a view) on the main table that holds the records left after applying a distinct condition on the primary keys. We execute this query using a Hive context.

Step 2:

insert overwrite table db2.main_table_with_duplicates select * from db1.temp_no_duplicates;

This overwrites the main table with the records from the temp table.


When we execute this, we get the following error:

org.apache.spark.sql.AnalysisException: Cannot overwrite a path that is also being read from.;


Is it possible to overwrite like this?


Thank You.


Re: Insert overwrite within the same table in Spark

@Veera Pavan

This job works fine in Hive, but in Spark you need to follow these steps:

  1. Write the data to a temporary table first.
  2. Then select from the temporary table.
  3. Insert overwrite the final table with the selected records.
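
The steps above might look like the following in Spark SQL. This is a minimal sketch, not a tested job: the staging table name db1.main_table_dedup_staging is hypothetical, and it assumes a Hive-enabled session that is allowed to create tables in db1. The difference from the original attempt is that a view is only a stored query, so selecting from it still reads the main table's files, whereas a real table materializes the rows in a separate location before the overwrite starts.

```sql
-- Step 1: materialize the de-duplicated rows into a real staging table.
-- The table name is hypothetical; use whatever fits your naming scheme.
DROP TABLE IF EXISTS db1.main_table_dedup_staging;

CREATE TABLE db1.main_table_dedup_staging AS
SELECT DISTINCT *
FROM db2.main_table_with_duplicates;

-- Steps 2 and 3: select from the staging table and overwrite the main table.
-- The source path now differs from the target path.
INSERT OVERWRITE TABLE db2.main_table_with_duplicates
SELECT *
FROM db1.main_table_dedup_staging;

-- Optional cleanup once the overwrite has been verified.
DROP TABLE IF EXISTS db1.main_table_dedup_staging;
```

It is probably worth checking the row count of the staging table before running the overwrite, since the staging table is the only clean copy of the data once the overwrite begins.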

See this similar thread for a comparable case.

