Created 08-11-2020 07:22 PM
I'm trying to overwrite a Parquet file in HDFS for a Hive table, but I noticed the table directory holds a bunch of individual Parquet files with unique names like this:
table_directory/part-00199-edf111ef-0f7a-43ab-9393-8fc620195d4c-c000.snappy.parquet
How do I add a Parquet file to a Hive table directory that's composed of several Parquet files?
Created on 08-12-2020 06:49 AM - edited 08-12-2020 06:50 AM
@Wilber My suggestion would be to not overwrite the existing files. Instead, write the new Parquet file to a staging location and expose it as an external table. Then merge the staging data into the final internal Hive table with INSERT INTO final_table SELECT * FROM staging_table;
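A rough sketch of how that staging approach could look in HiveQL. The column list and the HDFS path below are placeholders I made up for illustration, so swap in the real schema of final_table and wherever the new Parquet file actually lands:

```sql
-- Assumption: the new Parquet file(s) were written to this HDFS directory.
-- The path and the columns (id, name) are hypothetical; they must match
-- the schema of final_table.
CREATE EXTERNAL TABLE IF NOT EXISTS staging_table (
  id   BIGINT,
  name STRING
)
STORED AS PARQUET
LOCATION '/tmp/staging/my_table';

-- Append the staged rows to the existing table. Hive writes its own
-- part files into the table directory, so none are touched by hand.
INSERT INTO TABLE final_table
SELECT * FROM staging_table;

-- Alternative: if the goal is to replace the table contents entirely,
-- use INSERT OVERWRITE in place of the INSERT INTO above.
-- INSERT OVERWRITE TABLE final_table
-- SELECT * FROM staging_table;

-- Optional cleanup: dropping an external table removes only the
-- metadata; the files at the staging location stay in HDFS.
DROP TABLE IF EXISTS staging_table;
```

Going through Hive this way means you never have to match the part-file naming yourself; Hive decides how many files to write and what to call them.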
Created 08-13-2020 06:31 PM
@stevenmatison - perfect; works! Thank you.
Created 08-14-2020 05:55 AM
Please accept the solution as the answer. Doing this helps mark the thread as resolved.