Created 04-12-2018 06:27 AM
I am using Pig to read a file and want to split the data with `SPLIT`. Below is my input file:
1,aaa,123456,annotation
2,bbb,234567,barber
4,ddd,456789,federal
3,ccc,345678,code
4,ddd,456789,definition
5,asd,545645,AcsToGlRestServices
6,date,58314,filterlevel
7,kssa,22334,timefield
8,Bhi,2236,context
I executed the following Pig script:
grunt> rawlvl = load '~/file' using PigStorage(',') as (no:int,name:chararray,phno:int,add:chararray);
grunt> splitlvl = SPLIT rawlvl into one if (no>2 and no<5),two if (no>5);
But I am getting an exception. Could you help me understand why? Below is the exception:
2018-04-12 06:02:19,718 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 3, column 0> Syntax error, unexpected symbol at or near 'splitlvl' Details at logfile: /root/pig_1523510819542.log
Below are the contents of the `pig_1523510819542.log` file:
================================================================================
Pig Stack Trace
---------------
ERROR 1200: <line 4, column 0> Syntax error, unexpected symbol at or near 'splitlvl'
Failed to parse: <line 4, column 0> Syntax error, unexpected symbol at or near 'splitlvl'
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:244)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:182)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1791)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1764)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:707)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1075)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:231)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
    at org.apache.pig.Main.run(Main.java:566)
    at org.apache.pig.Main.main(Main.java:178)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
================================================================================
Kindly help me out here, as I am new to Hadoop and Pig.
Created 04-12-2018 11:27 AM
@JAy PaTel
The issue is that you are assigning the result of SPLIT to the splitlvl relation. SPLIT does not return a relation: it partitions rawlvl directly into the relations named after INTO (here, one and two), so there is nothing to assign to splitlvl.
grunt> splitlvl = SPLIT rawlvl into one if (no>2 and no<5),two if (no>5);
Assigning the output of SPLIT to another relation (splitlvl) is not valid syntax for the SPLIT operator in Pig.
Change your script to:
grunt> rawlvl = load '~/file' using PigStorage(',') as (no:int,name:chararray,phno:int,add:chararray);
grunt> SPLIT rawlvl into one if (no>2 and no<5),two if (no>5);
grunt> dump one;
grunt> dump two;
For more details about the SPLIT operator, please refer to the link below.
http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#SPLIT
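If you prefer assignment-style statements like the `splitlvl = ...` line you tried, one possible alternative is FILTER, which does return a relation. The following is only a sketch: it assumes the same schema as above and the '/t.txt' path used in the example below, and it reuses the relation names one and two for illustration.

```pig
-- Load the comma-separated input with the same schema as in the question.
rawlvl = LOAD '/t.txt' USING PigStorage(',')
         AS (no:int, name:chararray, phno:int, add:chararray);

-- Each FILTER returns a relation, so assignment with '=' is valid here.
one = FILTER rawlvl BY (no > 2 AND no < 5);
two = FILTER rawlvl BY (no > 5);

DUMP one;
DUMP two;
```

Both forms select the same records; SPLIT simply expresses all conditions in a single statement.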
Example:-
Step1:-
Load the input file into Pig:
grunt> rawlvl = load '/t.txt' using PigStorage(',') as (no:int,name:chararray,phno:int,add:chararray);
grunt> dump rawlvl;
(1,aaa,123456,annotation)
(2,bbb,234567,barber)
(4,ddd,456789,federal)
(3,ccc,345678,code)
(4,ddd,456789,definition)
(5,asd,545645,AcsToGlRestServices)
(6,date,58314,filterlevel)
(7,kssa,22334,timefield)
(8,Bhi,2236,context)
The data is now loaded into the rawlvl relation.
Step2:-
Now split the rawlvl relation into two relations, one and two:
grunt> SPLIT rawlvl into one if (no>2 and no<5),two if (no>5);
Dump the one relation:
grunt> dump one;
(4,ddd,456789,federal)
(3,ccc,345678,code)
(4,ddd,456789,definition)
Dump the two relation:
grunt> dump two;
(6,date,58314,filterlevel)
(7,kssa,22334,timefield)
(8,Bhi,2236,context)
As you can see, the contents of the one and two relations match the conditions you specified.
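Note that the records with no = 1, 2 and 5 satisfy neither condition, so they do not land in either relation. If you also need those records, newer Pig releases (0.10 and later, as far as I know) accept an OTHERWISE branch in SPLIT. A minimal sketch, assuming such a version is available; the relation name rest is made up for illustration:

```pig
-- Same load as in Step 1.
rawlvl = LOAD '/t.txt' USING PigStorage(',')
         AS (no:int, name:chararray, phno:int, add:chararray);

-- 'rest' (hypothetical name) receives every record that matches
-- none of the preceding conditions; requires a Pig version that
-- supports OTHERWISE.
SPLIT rawlvl INTO one IF (no > 2 AND no < 5),
                  two IF (no > 5),
                  rest OTHERWISE;

DUMP rest;
```

On the sample data above, rest would hold the records whose no is 1, 2 or 5.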
If this answer helped resolve your issue, click the Accept button below to accept it. That helps other community users find solutions to this kind of issue quickly.
Created 04-12-2018 12:03 PM
@Shu Oh! I see. Thank you so much. I didn't notice those minor mistakes.