Hi @yogesh turkane, as far as I'm aware, we can achieve this in two ways:

1. After the data load (or on a scheduled interval), run `ALTER TABLE <table_name> CONCATENATE` on the table through the SQL API. This merges all the small ORC files associated with that table. Please note that this option is specific to ORC.
2. Load the data into a DataFrame, repartition it, and write it back with overwrite mode in Spark. The code snippet would be:

```scala
val tDf = hiveContext.table("table_name")
tDf.repartition(<num_Files>).write.mode("overwrite").saveAsTable("targetDB.targetTable")
```

The second option will work with any type of files. Hope this helps!!
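As a fuller sketch of the second approach, assuming Spark 2+ where `SparkSession` (with Hive support) replaces `HiveContext`; the table names `sourceDB.events` / `targetDB.events_compacted` and the partition count of 8 are illustrative, not from your setup:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical session setup; on Spark 2+, SparkSession with
// enableHiveSupport() takes the place of HiveContext.
val spark = SparkSession.builder()
  .appName("compact-small-files")
  .enableHiveSupport()
  .getOrCreate()

// Read the source table into a DataFrame.
val tDf = spark.table("sourceDB.events")

// Pick a partition count that yields reasonably sized output files
// (a common rule of thumb is roughly 128 MB per file); 8 is just
// an example value here.
tDf.repartition(8)
  .write
  .mode("overwrite")
  .saveAsTable("targetDB.events_compacted")
```

One design note: if you are only reducing the number of partitions, `coalesce(n)` can be used instead of `repartition(n)` to avoid a full shuffle, at the cost of possibly uneven file sizes.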