Hi All/ @Shu,
In my project, duplicate records are sometimes created while saving, so we wrote the queries below to remove the duplicates.
create view db1.temp_no_duplicates as select distinct * from db2.main_table_with_duplicates;
This creates a temporary view on the main table that keeps only the distinct records (DISTINCT applied over the primary keys); we executed the query using HiveContext.
insert overwrite table db2.main_table_with_duplicates select * from db1.temp_no_duplicates;
This overwrites the main table with the records from the temp view.
While executing this, we get the following error:
org.apache.spark.sql.AnalysisException: Cannot overwrite a path that is also being read from.;
Is it possible to overwrite like this?
This job will work fine in Hive, but Spark does not allow overwriting a table that is also being read from in the same query, so you need a workaround.
Check this similar thread, which deals with the same case.
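For example, one common pattern (a sketch only; the intermediate table name here is assumed) is to materialize the distinct rows into a real table rather than a view. A view is just the SELECT over the same underlying path, so Spark ends up reading and writing the same location; a materialized table gives the read a different path than the write:

create table db1.temp_no_duplicates_tbl as
select distinct * from db2.main_table_with_duplicates;

insert overwrite table db2.main_table_with_duplicates
select * from db1.temp_no_duplicates_tbl;

drop table db1.temp_no_duplicates_tbl;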
Why doesn't Spark work like Hive here? A simple workaround is to write the final files to a temporary directory first, and then rename it over the original location.
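The "write to a temp location, then rename" idea above can be sketched in plain Python (no Spark; the CSV file, the `id` key field, and the helper name are all illustrative assumptions, not part of the original thread):

```python
import csv
import os
import tempfile


def dedupe_to_temp_then_rename(src_path: str, key_field: str) -> None:
    """Keep the first row per key, write the result to a temp file,
    then atomically replace the original file with it."""
    seen = set()
    rows = []
    with open(src_path, newline="") as f:
        reader = csv.DictReader(f)
        fields = reader.fieldnames
        for row in reader:
            if row[key_field] not in seen:  # drop duplicate keys
                seen.add(row[key_field])
                rows.append(row)

    # Write to a temp file in the same directory so the final
    # os.replace is a rename on the same filesystem (atomic).
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(src_path) or ".")
    with os.fdopen(fd, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
    os.replace(tmp_path, src_path)  # rename over the original
```

The key point is the same as in the Spark case: the deduplicated output is fully written to a separate location before the original is replaced, so the source is never overwritten while it is still being read.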