AvroStorage - output file name definition
- Labels: Apache Pig
Created ‎01-20-2016 12:22 AM
I use AvroStorage to store the result set from Pig. Is there a way to store the data in a single, specified Avro file, e.g. OutputFileGen1? Pig stores the data in a directory named OutputFileGen1 with the structure listed below:
ls -al OutputFileGen1/
total 20
drwxr-xr-x 2 root root 4096 2016-01-18 14:35 .
drwxr-xr-x 6 root root 4096 2016-01-19 10:27 ..
-rw-r--r-- 1 root root 4083 2016-01-18 14:35 part-m-00000.avro
-rw-r--r-- 1 root root 40 2016-01-18 14:35 .part-m-00000.avro.crc
-rw-r--r-- 1 root root 0 2016-01-18 14:35 _SUCCESS
-rw-r--r-- 1 root root 8 2016-01-18 14:35 ._SUCCESS.crc
Thank you
http://stackoverflow.com/questions/34880880/avrostorage-output-file-name-definition
Created ‎01-20-2016 12:29 AM
That option is available in Java MapReduce, but not in Pig. Following the Stack Overflow example, the suggestion is to run a follow-up HDFS command that renames the file to the desired name. Pig fully supports HDFS commands as part of scripts. @John Smith
Created ‎01-20-2016 08:16 AM
Hi, what do you mean by Java MapReduce? Writing Java MapReduce code directly?
I'm storing results on the normal FS. Can I use HDFS commands on files/directories stored on the normal FS?
Created ‎01-20-2016 01:57 PM
Correct, in Java you can override the output with MultipleOutputs and define the path as you wish. In Pig this is not possible; perhaps you'd like to open an enhancement Jira? To store results on the normal FS, launch the script in local mode or specify the full path as file:///path. The reverse also applies: to reach HDFS from local mode, specify an hdfs:// path.
Created ‎01-20-2016 08:48 AM
OK, it works on the local FS as well.
grunt> fs -getmerge dir file
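When the output already sits on the local FS and the job produced a single part file, as in the listing above, a plain shell rename gives the same result. This is only a minimal sketch under that assumption: the function name collapse_pig_output and the target file name are hypothetical, and it refuses to act when there is not exactly one part-*.avro file.

```shell
# Hypothetical helper: turn a Pig output directory that holds exactly one
# part-*.avro file into a single, named Avro file on the local filesystem.
collapse_pig_output() {
  out_dir=$1   # directory written by STORE, e.g. OutputFileGen1
  target=$2    # desired file name, e.g. OutputFileGen1.avro

  # Expand the part files into the positional parameters.
  set -- "$out_dir"/part-*.avro

  # Bail out if the glob matched nothing or matched more than one file.
  if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
    echo "expected exactly one part file in $out_dir" >&2
    return 1
  fi

  # Move the single part file to the requested name.
  mv -- "$1" "$target"
}
```

Usage, with the directory from the question: collapse_pig_output OutputFileGen1 OutputFileGen1.avro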
