When we insert data from a staging table into a production table using dynamic partition inserts, the files created in the partition directory are named like: 0000_0.
However, for a process where data is loaded on a daily basis, after the first insert into a partition, the file names are like 0000_0_copy_1 for the second day, 0000_0_copy_2 for the third day, and so on.
I want to create a filename like so: partitionName_datestamp [ex. IND_20173107] so that it helps to maintain a logical and relevant file structure for any manual intervention needs.
I am aware that we can achieve this by executing a shell script after Hive jobs.
But, can we control this from within Hive?
PS: I am using Cloudera 5.8. The Hive table is stored as Parquet.
@anirbandd You can write your own custom reducer class, using things like LazyOutputFormat, etc.
I believe there is no property that you can tweak in your mapred or hive XML to get custom output file names while running the insert in Hive.
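If it does come down to a post-job step, here is a minimal sketch of the renaming logic in Python, offered as an illustration and not a tested solution. It only plans the renames and prints `hdfs dfs -mv` commands; it does not touch HDFS itself. The function names `plan_renames` and `to_hdfs_mv_commands`, the regex, the `_1`, `_2` suffix scheme for extra files, and the example partition path are all assumptions made for this example, not anything Hive provides.

```python
import re
from typing import Dict, List

# Assumed pattern for Hive-generated names such as 0000_0 or 0000_0_copy_2.
HIVE_FILE_RE = re.compile(r"^(\d+)_(\d+)(?:_copy_(\d+))?$")


def plan_renames(partition_name: str, datestamp: str,
                 file_names: List[str]) -> Dict[str, str]:
    """Map each Hive-generated file name to partitionName_datestamp[_n].

    Files that do not match the Hive pattern (e.g. _SUCCESS or files
    renamed on an earlier day) are left alone, and their names are
    never reused as rename targets.
    """
    matched = []
    for name in file_names:
        m = HIVE_FILE_RE.match(name)
        if m:
            copy_no = int(m.group(3) or 0)
            matched.append((copy_no, int(m.group(1)), int(m.group(2)), name))
    matched.sort()  # original file first, then _copy_1, _copy_2, ...

    hive_names = {entry[3] for entry in matched}
    taken = set(file_names) - hive_names  # names that must not be overwritten
    base = f"{partition_name}_{datestamp}"
    plan: Dict[str, str] = {}
    seq = 0
    for _, _, _, name in matched:
        while True:
            candidate = base if seq == 0 else f"{base}_{seq}"
            seq += 1
            if candidate not in taken:
                break
        taken.add(candidate)
        plan[name] = candidate
    return plan


def to_hdfs_mv_commands(partition_dir: str, plan: Dict[str, str]) -> List[str]:
    """Render the plan as hdfs dfs -mv command strings (not executed here)."""
    return [f"hdfs dfs -mv {partition_dir}/{old} {partition_dir}/{new}"
            for old, new in plan.items()]


if __name__ == "__main__":
    # Hypothetical directory listing and partition path.
    listing = ["0000_0", "0000_0_copy_1", "_SUCCESS"]
    plan = plan_renames("IND", "20173107", listing)
    for cmd in to_hdfs_mv_commands("/warehouse/prod_table/country=IND", plan):
        print(cmd)
```

The idea would be to run something like this after each daily load with that day's datestamp, so only the freshly written Hive-named files get renamed. Whether Hive on your version keeps reading the renamed files without any issue is something to verify on a test partition first.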
Were you able to set a custom prefix? I want to do multiple inserts into the same partition. I am hoping that if a custom prefix works, I can do multiple inserts into the Hive table. Any suggestions appreciated.