
Writing parquet file to hdfs for internal Hive table

Contributor

I'm trying to overwrite a parquet file in HDFS for a Hive table, but I noticed the table directory contains a number of individual parquet files with unique names like this:

table_directory/part-00199-edf111ef-0f7a-43ab-9393-8fc620195d4c-c000.snappy.parquet

How do I add a parquet file to a Hive table directory that is composed of several parquet files?

 

 

2 ACCEPTED SOLUTIONS

Super Guru

@Wilber My suggestion would be to not overwrite the existing files, but to write to a staging location as an external table. Then merge the staging data into the final internal Hive table with INSERT INTO final_table SELECT * FROM staging_table;
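A minimal HiveQL sketch of this staging pattern (the table names, columns, and HDFS path are hypothetical placeholders, not from the original thread):

```sql
-- Hypothetical staging/merge flow; adjust names, schema, and location.

-- 1. External staging table pointing at the directory where the new
--    parquet files are written (e.g. by a Spark job).
CREATE EXTERNAL TABLE staging_table (
  id   INT,
  name STRING
)
STORED AS PARQUET
LOCATION '/data/staging/my_table';

-- 2. Append the staged rows into the managed (internal) table.
--    Hive writes new part files into the table directory, so the
--    existing files are never overwritten in place.
INSERT INTO final_table
SELECT * FROM staging_table;
```

After the INSERT completes, the staging directory can be cleared so the next batch of parquet files starts from an empty staging location.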


Contributor

@stevenmatison - perfect, that works! Thank you.


3 REPLIES


Super Guru

Please accept the solution as the answer. Doing this helps mark the question as solved.