Created 04-19-2017 01:33 PM
I wonder whether I can store the output of Pig jobs (the part files) under a specific file name instead of the default, such as part-v000-o000-r-00000.deflate. For example, when I execute "store final_result INTO '/data/output' USING PigStorage(',');", the output is stored on HDFS as /data/output/part-v000-o000-r-00000.deflate. I want the output to be /data/output.csv or /data/output/output.csv instead.
That is, to rename the "part-*" filename on the fly.
How can I achieve this in Pig?
Created 04-19-2017 06:47 PM
It's not possible to name your output file on the fly, at least as of now. Instead, once the files have been written into the output directory, use "hadoop fs -getmerge <source_directory> <local_file>" to merge all the part files into a single file with the name you want, rather than renaming each file.
Alternatively, you can read the files with "hadoop fs -cat <output_directory>/* > file.txt" and then copy the merged file back to HDFS.
Note: with either approach you will be merging all the part files into one single file.
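The two approaches above can be sketched as shell commands. The HDFS paths (/data/output, /data/output.csv) follow the question and are illustrative; the mock part-* files created here stand in for real reducer output so the merge step itself is runnable without a cluster (the cluster equivalents are shown in comments).

```shell
# Simulate a Pig output directory with two part files
# (stand-ins for files like part-v000-o000-r-00000).
mkdir -p data/output
printf 'a,1\nb,2\n' > data/output/part-r-00000
printf 'c,3\n'      > data/output/part-r-00001

# Approach 1: merge the part files of a directory into one local file.
# On a real cluster:  hadoop fs -getmerge /data/output output.csv
cat data/output/part-* > output.csv

# Approach 2: stream and concatenate, then copy the result back to HDFS.
# On a real cluster:  hadoop fs -cat /data/output/* > output.csv
#                     hadoop fs -put output.csv /data/output.csv
wc -l output.csv
```

One caveat: this concatenation only yields a usable CSV when the part files are plain text. If the output is compressed (as the .deflate extension in the question suggests), decompress or store it uncompressed before merging.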
Created 04-20-2017 01:29 PM
Thank you @Bala Vignesh N V for your answer.
Created 04-20-2017 01:43 PM
@Kibrom Gebrehiwot If it helped you, please accept the answer. Thanks! Happy Hadooping!