AvroStorage - output file name definition

Expert Contributor

I use AvroStorage to store a result set from Pig. Is there a way to store the data in a single, named Avro file, e.g. OutputFileGen1? Pig stores the data in a directory named OutputFileGen1 with the structure listed below:

 ls -al  OutputFileGen1/
total 20
drwxr-xr-x 2 root root 4096 2016-01-18 14:35 .
drwxr-xr-x 6 root root 4096 2016-01-19 10:27 ..
-rw-r--r-- 1 root root 4083 2016-01-18 14:35 part-m-00000.avro
-rw-r--r-- 1 root root   40 2016-01-18 14:35 .part-m-00000.avro.crc
-rw-r--r-- 1 root root    0 2016-01-18 14:35 _SUCCESS
-rw-r--r-- 1 root root    8 2016-01-18 14:35 ._SUCCESS.crc

Thank you

http://stackoverflow.com/questions/34880880/avrostorage-output-file-name-definition

1 ACCEPTED SOLUTION

Expert Contributor

OK, this works on the local FS as well.

grunt> fs -getmerge dir file
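
For readers without the Hadoop client at hand, the effect of `fs -getmerge dir file` on a local directory can be approximated with plain shell. This is only a sketch, not the Hadoop implementation: the function name `merge_parts` is hypothetical, and it assumes the part files are named `part-*` as in the listing in the question. Byte-level concatenation only yields a valid Avro file when the directory holds a single part file, as it does here, because every Avro file carries its own header.

```shell
#!/bin/sh
# merge_parts DIR FILE: concatenate DIR/part-* into FILE, in name order.
# Hypothetical helper that approximates `fs -getmerge DIR FILE` on the local FS.
merge_parts() {
    dir=$1
    out=$2
    # Expand the glob into the positional parameters (sorted by name).
    # Hidden files such as .part-m-00000.avro.crc are not matched.
    set -- "$dir"/part-*
    # If nothing matched, the pattern stays unexpanded and is not a file.
    if [ ! -f "$1" ]; then
        echo "no part files found in $dir" >&2
        return 1
    fi
    cat "$@" > "$out"
}
```

With the directory from the question, `merge_parts OutputFileGen1 OutputFileGen1.avro` would leave a single file next to the directory.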

4 REPLIES

Master Mentor

That option is available in Java MapReduce, but not in Pig. Following the Stack Overflow example, the suggestion is to run a follow-up HDFS command that renames the file to the desired name. Pig fully supports HDFS commands as part of scripts. @John Smith
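
A minimal sketch of that follow-up rename, written for the local file system with plain `mv`. The function name `rename_single_part` is hypothetical, and it assumes the job wrote exactly one `part-*.avro` file, as in the listing in the question. On HDFS the corresponding step inside a Pig script would presumably be an `fs -mv` call with the same source and target.

```shell
#!/bin/sh
# rename_single_part DIR TARGET: move the only DIR/part-*.avro file to TARGET.
# Hypothetical helper; it refuses to act unless exactly one part file exists.
rename_single_part() {
    dir=$1
    target=$2
    # Expand the glob into the positional parameters.
    set -- "$dir"/part-*.avro
    # Zero matches leave the pattern unexpanded, so also check it is a file.
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one part file in $dir" >&2
        return 1
    fi
    mv -- "$1" "$target"
}
```

For example, `rename_single_part OutputFileGen1 OutputFileGen1.avro` moves `part-m-00000.avro` out of the directory under the new name.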

Expert Contributor

Hi, what do you mean by Java MapReduce? Writing the output directly in Java MapReduce code?

I'm storing the results on the normal (local) FS. Can I use HDFS commands on files and directories stored on the normal FS?

Master Mentor

@John Smith

Correct, in Java you can override the output with MultipleOutputs and define the path as you wish. In Pig this is not possible; perhaps you'd like to open an enhancement Jira? To store results on the normal FS, either launch the script in local mode or specify the full path as file:///path. The reverse also holds: Tez/MR mode writes to HDFS by default, and from local mode you can reach HDFS by specifying an hdfs:// path.
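
To make the file:///path form concrete, here is a small sketch that turns a local path into such a URI. The helper name `to_file_uri` is hypothetical, and it does no escaping of special characters; it only shows why an absolute path ends up with three slashes (the `file://` scheme prefix followed by the leading `/` of the path).

```shell
#!/bin/sh
# to_file_uri PATH: print PATH as a file:// URI.
# Hypothetical helper; relative paths are resolved against the current
# directory, and no percent-escaping is performed.
to_file_uri() {
    case $1 in
        /*) printf 'file://%s\n' "$1" ;;
        *)  printf 'file://%s/%s\n' "$PWD" "$1" ;;
    esac
}
```

The printed string, for instance `file:///tmp/OutputFileGen1`, is the kind of location the reply suggests using as the output path when the script is not running in local mode.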
