Pig error while running a script and loading data into an already created table in Hive

The script is emp.pig, run with the -useHCatalog argument:

A = LOAD '/user/maria_dev/empdata' using PigStorage(',') AS (ename:chararray, esal:float, eid:float);
B = FILTER A BY ($1 matches 'N/A') and ($2 matches 'Null');
STORE B INTO 'emp' USING org.apache.hive.hcatalog.pig.HCatStorer();

@sanjeevan mahajan - Can you please share the complete error stack trace?

 WARNING: Use "yarn jar" to launch YARN applications. 
 16/05/07 07:21:39 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL 
 16/05/07 07:21:39 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE 
 16/05/07 07:21:39 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType 
 2016-05-07 07:21:39,421 [main] INFO  org.apache.pig.Main - Apache Pig version 0.15.0.2.4.0.0-169 (rexported) compiled Feb 10 2016, 07:50:04 
 2016-05-07 07:21:39,422 [main] INFO  org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1462448182529_0012/container_e13_1462448182529_0012_01_000002/pig_1462605699419.log 
 2016-05-07 07:21:41,535 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found 
 2016-05-07 07:21:41,930 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox.hortonworks.com:8020 
 2016-05-07 07:21:44,496 [main] INFO  org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-709e50a5-f7c9-45fe-8730-dd065b39da0e 
 2016-05-07 07:21:45,331 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/ 
 2016-05-07 07:21:45,898 [main] INFO  org.apache.pig.backend.hadoop.ATSService - Created ATS Hook 
 2016-05-07 07:21:46,931 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Pig script failed to parse:  
 <file script.pig, line 2, column 22> Invalid scalar projection: A : A column needs to be projected from a relation for it to be used as a scalar 
 Failed to parse: Pig script failed to parse:  
 <file script.pig, line 2, column 22> Invalid scalar projection: A : A column needs to be projected from a relation for it to be used as a scalar 
 	at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:199) 
 	at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1776) 
 	at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1484) 
 	at org.apache.pig.PigServer.parseAndBuild(PigServer.java:428) 
 	at org.apache.pig.PigServer.executeBatch(PigServer.java:453) 
 	at org.apache.pig.PigServer.executeBatch(PigServer.java:439) 
 	at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171) 
 	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234) 
 	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205) 
 	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81) 
 	at org.apache.pig.Main.run(Main.java:502) 
 	at org.apache.pig.Main.main(Main.java:177) 
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
 	at java.lang.reflect.Method.invoke(Method.java:606) 
 	at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
 	at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
 Caused by:  
 <file script.pig, line 2, column 22> Invalid scalar projection: A : A column needs to be projected from a relation for it to be used as a scalar 
 	at org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:10973) 
 	at org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:10190) 
 	at org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:10078) 
 	at org.apache.pig.parser.LogicalPlanGenerator.cond(LogicalPlanGenerator.java:8583) 
 	at org.apache.pig.parser.LogicalPlanGenerator.cond(LogicalPlanGenerator.java:8390) 
 	at org.apache.pig.parser.LogicalPlanGenerator.cond(LogicalPlanGenerator.java:8390) 
 	at org.apache.pig.parser.LogicalPlanGenerator.filter_clause(LogicalPlanGenerator.java:8110) 
 	at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1691) 
 	at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102) 
 	at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560) 
 	at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421) 
 	at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191) 
 	... 17 more 
 2016-05-07 07:21:46,941 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Pig script failed to parse:  
 <file script.pig, line 2, column 22> Invalid scalar projection: A : A column needs to be projected from a relation for it to be used as a scalar 
 Details at logfile: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1462448182529_0012/container_e13_1462448182529_0012_01_000002/pig_1462605699419.log 
 2016-05-07 07:21:47,005 [main] INFO  org.apache.pig.Main - Pig script completed in 8 seconds and 160 milliseconds (8160 ms) 

@Kuldeep Kulkarni I've shared the stack trace above.

Hi @sanjeevan mahajan,

The problem is in your second line. The matches operator performs a regular-expression match and can be applied only to chararray values. In your case $1 and $2 are floats, which is why your script is not working.

B = FILTER A BY ($1 matches 'N/A') and ($2 matches 'Null');

Use the "is null" operator instead: https://pig.apache.org/docs/r0.13.0/basic.html#null_operators
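
As a rough sketch, the script could look like the following. It assumes the goal is to drop rows where the salary or id is missing, and it relies on Pig loading a value that cannot be cast to float (such as the text N/A) as null. Please verify both points against your data; the path, schema, and table name are copied from your script.

```pig
-- Same LOAD as in the original script. Fields that cannot be cast to the
-- declared float type are expected to come through as null.
A = LOAD '/user/maria_dev/empdata' USING PigStorage(',') AS (ename:chararray, esal:float, eid:float);

-- Assumption: keep only rows where both numeric fields are present.
-- Null tests work on any type, unlike matches, which needs chararray.
B = FILTER A BY (esal IS NOT NULL) AND (eid IS NOT NULL);

STORE B INTO 'emp' USING org.apache.hive.hcatalog.pig.HCatStorer();
```

If you actually want to keep only the rows with missing values, as your original filter suggests, swap the condition to (esal IS NULL) AND (eid IS NULL).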

Hope this helps