Pig script error

New Contributor

Hi All

When I run this Pig script, it gives an error.

Script :

a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();
b = FILTER a BY driverid, g by driverid;event !=’normal’;
c = FOREACH b GENERATE driverid, event, (int) '1' as occurance;
d = GROUP c BY driverid;
e = foreach d generate group as driverid, SUM(c.occurance) as t_occ;
g = LOAD drivermileage USING org.apache.hive.hcatalog.pig.HCatLoader();
h = JOIN e BY driverid, g by driverid;
final_data = FOREACH h GENERATE $0 as driverid, $1 as events, $3 as totmiles, (float) $3/$1 as riskfactor;
store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();

Error :

ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1508940571320_0017/container_1508940571320_0017_01_000002/hive.tar.gz/hive/lib/slf4j-api-*.jar: No such file or directory

ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1508940571320_0017/container_1508940571320_0017_01_000002/hive.tar.gz/hive/hcatalog/lib/*hbase-storage-handler-*.jar: No such file or directory

17/10/26 03:40:21 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL

17/10/26 03:40:21 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE

17/10/26 03:40:21 INFO pig.ExecTypeProvider: Trying ExecType : TEZ_LOCAL

17/10/26 03:40:21 INFO pig.ExecTypeProvider: Trying ExecType : TEZ

17/10/26 03:40:21 INFO pig.ExecTypeProvider: Picked TEZ as the ExecType

2017-10-26 03:40:21,154 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0.2.6.1.0-129 (rexported) compiled May 31 2017, 03:39:20

2017-10-26 03:40:21,154 [main] INFO org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/admin/appcache/application_1508940571320_0017/container_1508940571320_0017_01_000002/pig_1508989221150.log

2017-10-26 03:40:22,237 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found

2017-10-26 03:40:22,456 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox.hortonworks.com:8020

2017-10-26 03:40:23,938 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-ea1809b3-d311-4906-a770-bb298516bc24

2017-10-26 03:40:24,630 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/

2017-10-26 03:40:24,837 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook

2017-10-26 03:40:24,915 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " <IDENTIFIER> "event "" at line 2, column 41.

Was expecting one of:

<EOF>

"cat" ...

"clear" ...

"fs" ...

"sh" ...

"cd" ...

"cp" ...

"copyFromLocal" ...

"copyToLocal" ...

"dump" ...

"\\d" ...

"describe" ...

"\\de" ...

"aliases" ...

"explain" ...

"\\e" ...

"help" ...

"history" ...

"kill" ...

"ls" ...

"mv" ...

"mkdir" ...

"pwd" ...

"quit" ...

"\\q" ...

"register" ...

"rm" ...

"rmf" ...

"set" ...

"illustrate" ...

"\\i" ...

"run" ...

"exec" ...

"%default" ...

"%declare" ...

"scriptDone" ...

"" ...

"" ...

<EOL> ...

";" ...

Details at logfile: /hadoop/yarn/local/usercache/admin/appcache/application_1508940571320_0017/container_1508940571320_0017_01_000002/pig_1508989221150.log

2017-10-26 03:40:24,949 [main] INFO org.apache.pig.Main - Pig script completed in 4 seconds and 215 milliseconds (4215 ms)

2 Replies

Re: Pig script error

Expert Contributor

Hi @Vijay Gorania

There is an error in the Hive table creation command: totmiles must be double instead of bigint. Please use the following steps and try running the Pig script again.

drop table riskfactor;
CREATE TABLE riskfactor (driverid string,events bigint,totmiles double,riskfactor float) STORED AS ORC;

Please check this post for further detail:

https://community.hortonworks.com/questions/58614/i-need-help-with-the-riskfactor-pig-script-from-th...

Re: Pig script error

a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();
b = FILTER a BY driverid, g by driverid;event !=’normal’;

1. Is geolocation a path or a variable? If it is a variable, use $geolocation.

2. A FILTER statement can have only one BY clause. To check multiple conditions, combine them with the AND / OR operators inside that one clause.
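
To make point 2 concrete, here is a sketch of the script with the two statements that look broken rewritten. It assumes the intent of line 2 was simply to drop rows whose event is 'normal', and that the text ", g by driverid;" was pasted in by accident from the JOIN statement further down. That would match the parser error, which stops at the identifier "event" at line 2, column 41, right after the stray semicolon. It also assumes drivermileage is a Hive table name and therefore needs quotes, like 'geolocation'. All other lines are unchanged from the question.

```pig
a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- Changed: a single BY clause, and plain ASCII single quotes around the string.
-- Assumption: the only condition wanted here is event != 'normal'.
b = FILTER a BY event != 'normal';

c = FOREACH b GENERATE driverid, event, (int) '1' as occurance;
d = GROUP c BY driverid;
e = foreach d generate group as driverid, SUM(c.occurance) as t_occ;

-- Changed: the table name is quoted.
-- Assumption: drivermileage is a Hive table read through HCatalog.
g = LOAD 'drivermileage' USING org.apache.hive.hcatalog.pig.HCatLoader();

h = JOIN e BY driverid, g by driverid;
final_data = FOREACH h GENERATE $0 as driverid, $1 as events, $3 as totmiles, (float) $3/$1 as riskfactor;
store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();
```

If more than one condition were needed, it would go in the same BY clause, for example `FILTER a BY event != 'normal' AND driverid IS NOT NULL;` (the second condition is only an illustration, not part of the original script).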
