Created 03-01-2016 08:53 AM
Hi,
I was trying to load a file in Pig that contains data like this:
{(3),(mary),(19)}
{(1),(john),(18)}
{(2),(joe),(18)}
The following command is failing:
A = LOAD 'data3' AS (B: bag {T: tuple(t1:int), F:tuple(f1:chararray), G:tuple(g1:int)});
What is the correct way to do this?
Thanks,
Soumya
Created 03-01-2016 09:42 AM
I don't think there is a Pig storage handler that does that, which is a bit odd. How did you generate that file? Is it just test data you created manually?
PigStorage essentially reads and writes delimited files. Fields within a record can be maps or bags, but I don't think the top-level record itself can be.
JsonStorage uses JSON, which is a different syntax. Then there is BinStorage, which I suppose is some kind of sequence file.
I may be missing something, but I don't think Pig can natively read data in the format it prints for debugging, at least not without some transformation. Someone please correct me if I am wrong.
http://pig.apache.org/docs/r0.14.0/func.html#load-store-functions
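As a rough sketch of the kind of transformation meant here, each line could be read as plain text and split with a regular expression inside Pig. This assumes every line has exactly the shape shown in the question, with three parenthesized values inside braces. The relation names and the field names id, name and age are made up for illustration, and the script has not been run against the original file.

```pig
-- Read each line as a single string instead of parsing it as a bag.
raw = LOAD 'data3' USING TextLoader() AS (line:chararray);

-- Pull out the three values. Bracket classes such as [(] are used instead of
-- backslash escapes to avoid quoting issues inside the Pig string literal.
parsed = FOREACH raw GENERATE
    FLATTEN(REGEX_EXTRACT_ALL(line, '[{][(]([0-9]+)[)],[(]([^)]*)[)],[(]([0-9]+)[)][}]'))
    AS (id:chararray, name:chararray, age:chararray);

-- Cast the numeric fields (field names are hypothetical).
typed = FOREACH parsed GENERATE (int)id AS id, name, (int)age AS age;

DUMP typed;
```

Lines that do not match the pattern would come through as null fields, so a FILTER on id IS NOT NULL may be worth adding for real data.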
Created 03-01-2016 11:25 AM
Load the data using PigStorage and then apply the TOBAG function: http://pig.apache.org/docs/r0.15.0/func.html#tobag
Is it a comma-separated file?
a = LOAD 'student' AS (f1:chararray, f2:int, f3:float);
DUMP a;
(John,18,4.0)
(Mary,19,3.8)
(Bill,20,3.9)
(Joe,18,3.8)
b = FOREACH a GENERATE TOBAG(f1,f3);
DUMP b;
({(John),(4.0)})
({(Mary),(3.8)})
({(Bill),(3.9)})
({(Joe),(3.8)})
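Applied to the data from the question, the same idea might look like the sketch below. It assumes the file can be regenerated as plain comma-separated lines such as 3,mary,19. The file name data3.csv and the field names are hypothetical and not from the original post.

```pig
-- Assumes a comma-delimited copy of the data, e.g. a line like: 3,mary,19
-- ('data3.csv' and the field names are hypothetical)
A = LOAD 'data3.csv' USING PigStorage(',') AS (id:int, name:chararray, age:int);

-- Wrap each field in its own single-field tuple and collect them into one bag per row.
B = FOREACH A GENERATE TOBAG(id, name, age) AS B;

DUMP B;
```

Because the three fields have different types, Pig may not be able to infer a schema for the tuples inside the bag, so later steps may need explicit casts.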