Pig Error in Lab 3: Pig - Risk Factor

Expert Contributor


I am using Ambari and HDP 2.5.

I'm trying to execute the first lab instruction: a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();

I added the following argument so that Pig can find the HCatLoader() class: -useHCatalog

I get the log below. Can anyone help me fix this? Thanks.

 ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/hive.tar.gz/hive/lib/slf4j-api-*.jar: No such file or directory 
 ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/hive.tar.gz/hive/hcatalog/lib/*hbase-storage-handler-*.jar: No such file or directory 
 WARNING: Use "yarn jar" to launch YARN applications. 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType 
 2016-12-29 10:28:37,537 [main] INFO  org.apache.pig.Main - Apache Pig version (rexported) compiled Nov 30 2016, 02:28:11 
 2016-12-29 10:28:37,537 [main] INFO  org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/pig_1483003717522.log 
 2016-12-29 10:28:38,970 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found 
 2016-12-29 10:28:39,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs:// 
 2016-12-29 10:28:41,059 [main] INFO  org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-9b551f9a-3393-4ab2-93ea-de21982a11cc 
 2016-12-29 10:28:42,237 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: 
 2016-12-29 10:28:42,704 [main] INFO  org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook 
 2016-12-29 10:28:44,448 [main] WARN  org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist 
 2016-12-29 10:28:44,521 [main] INFO  hive.metastore - Trying to connect to metastore with URI thrift:// 
 2016-12-29 10:28:44,588 [main] INFO  hive.metastore - Connected to metastore. 
 2016-12-29 10:28:45,238 [main] INFO  org.apache.pig.Main - Pig script completed in 8 seconds and 278 milliseconds (8278 ms) 

Expert Contributor

The lab contains some errors:

  • The Pig script must be as follows:

a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();

b = FILTER a BY event != 'normal';

c = FOREACH b GENERATE driverid, (int) '1' as occurance;

d = GROUP c BY driverid;

e = FOREACH d GENERATE group as driverid, SUM(c.occurance) as totevents;

g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();

h = join e by driverid, g by driverid;

final_data = foreach h generate $0 as driverid, $1 as totevents, $3 as totmiles, (float) $3/$1 as riskfactor;

store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();

  • The riskfactor table in Hive must be created as follows:

CREATE TABLE riskfactor (driverid string,totevents bigint,totmiles double,riskfactor float) STORED AS ORC;
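To make it easier to check what the script above should write into riskfactor, here is a minimal Python sketch of the same logic on a few in-memory rows. It is only an illustration, not part of the lab: the sample rows, the helper name risk_factors, and the assumption that drivermileage has the columns (driverid, totmiles) are all hypothetical.

```python
from collections import Counter

# Hypothetical sample rows standing in for the Hive tables (values are invented).
geolocation = [
    {"driverid": "A1", "event": "normal"},
    {"driverid": "A1", "event": "overspeed"},
    {"driverid": "A1", "event": "lane departure"},
    {"driverid": "A2", "event": "unsafe following distance"},
    {"driverid": "A3", "event": "normal"},
]
# Assumed layout of drivermileage: driverid -> totmiles.
drivermileage = {"A1": 1000.0, "A2": 500.0, "A3": 750.0}


def risk_factors(geolocation, drivermileage):
    """Mirror the Pig relations b through final_data on plain Python data."""
    # b, c, d, e: keep non-normal events and count them per driver.
    totevents = Counter(
        row["driverid"] for row in geolocation if row["event"] != "normal"
    )
    # h, final_data: inner join on driverid, then riskfactor = totmiles / totevents.
    return [
        (driverid, count, drivermileage[driverid], drivermileage[driverid] / count)
        for driverid, count in sorted(totevents.items())
        if driverid in drivermileage
    ]


final_data = risk_factors(geolocation, drivermileage)
for row in final_data:
    print(row)
```

Driver A3 has only normal events, so it drops out at the filter step and never reaches the join. A driver in that situation is expected to be missing from riskfactor.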




Can you please share the error? If you scroll further down the log, you will find the exact error. Also check whether you are getting the same exception mentioned here


I am new to Hortonworks. I followed the instructions and worked through Lab 3 (the Pig script). The copied script completed with an error:

2017-03-09 12:14:30,576 [main] ERROR - ERROR 0:

As a result, the riskfactor table is empty.

Any suggestions?


I followed Wael Horchani's answer, recreated the table, and ran his Pig script, and that fixed it. Thanks.

Expert Contributor
@milind pandit

My problem is not related to data types. Please find the entire log file attached: job-1482423183850-0021-logs.txt


If your script executes correctly, you can safely ignore these warnings.
