
Pig Error in Lab 3: Pig - Risk Factor

Expert Contributor

Hi,

I'm using Ambari 2.4.1.0 and HDP 2.5.

I'm trying to execute the first instruction of the lab:

a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();

I added the -useHCatalog argument so that Pig can find the HCatLoader() class.

I get the following log. Can anyone help me fix this? Thanks.

 ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/hive.tar.gz/hive/lib/slf4j-api-*.jar: No such file or directory 
 ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/hive.tar.gz/hive/hcatalog/lib/*hbase-storage-handler-*.jar: No such file or directory 
 WARNING: Use "yarn jar" to launch YARN applications. 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType 
 2016-12-29 10:28:37,537 [main] INFO  org.apache.pig.Main - Apache Pig version 0.16.0.2.5.3.0-37 (rexported) compiled Nov 30 2016, 02:28:11 
 2016-12-29 10:28:37,537 [main] INFO  org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/pig_1483003717522.log 
 2016-12-29 10:28:38,970 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found 
 2016-12-29 10:28:39,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://vds002.databridge.tn:8020 
 2016-12-29 10:28:41,059 [main] INFO  org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-9b551f9a-3393-4ab2-93ea-de21982a11cc 
 2016-12-29 10:28:42,237 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://vds002.databridge.tn:8188/ws/v1/timeline/ 
 2016-12-29 10:28:42,704 [main] INFO  org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook 
 2016-12-29 10:28:44,448 [main] WARN  org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist 
 2016-12-29 10:28:44,521 [main] INFO  hive.metastore - Trying to connect to metastore with URI thrift://vds002.databridge.tn:9083 
 2016-12-29 10:28:44,588 [main] INFO  hive.metastore - Connected to metastore. 
 2016-12-29 10:28:45,238 [main] INFO  org.apache.pig.Main - Pig script completed in 8 seconds and 278 milliseconds (8278 ms) 
1 ACCEPTED SOLUTION

Expert Contributor

The lab contains some errors:

  • The Pig script must be as follows:

a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();

b = FILTER a BY event != 'normal';

c = FOREACH b GENERATE driverid, (int) '1' as occurance;

d = GROUP c BY driverid;

e = FOREACH d GENERATE group as driverid, SUM(c.occurance) as totevents;

g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();

h = join e by driverid, g by driverid;

final_data = foreach h generate $0 as driverid, $1 as totevents, $3 as totmiles, (float) $3/$1 as riskfactor;

store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();

  • The riskfactor table in Hive must be created as follows:

CREATE TABLE riskfactor (driverid string,totevents bigint,totmiles double,riskfactor float) STORED AS ORC;
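
To sanity-check what the script above computes without a cluster, here is a minimal plain-Python sketch of the same dataflow: filter out 'normal' events, count the remaining events per driver, inner-join with the mileage table, and divide miles by events. The toy rows and the risk_factors helper are hypothetical and only mirror the logic; they do not reproduce Pig's or Hive's types.

```python
from collections import Counter

# Hypothetical toy rows standing in for the Hive tables (not real lab data).
geolocation = [
    {"driverid": "A1", "event": "normal"},
    {"driverid": "A1", "event": "overspeed"},
    {"driverid": "A1", "event": "lane departure"},
    {"driverid": "A2", "event": "overspeed"},
    {"driverid": "A3", "event": "normal"},
]
drivermileage = {"A1": 1000.0, "A2": 500.0, "A3": 750.0}  # driverid -> totmiles


def risk_factors(geolocation, drivermileage):
    # Relations b..e: keep non-normal events and count them per driver.
    totevents = Counter(
        row["driverid"] for row in geolocation if row["event"] != "normal"
    )
    # Relations h and final_data: inner join on driverid,
    # then riskfactor = totmiles / totevents.
    return [
        (driverid, events, drivermileage[driverid], drivermileage[driverid] / events)
        for driverid, events in sorted(totevents.items())
        if driverid in drivermileage
    ]


rows = risk_factors(geolocation, drivermileage)
for row in rows:
    print(row)  # (driverid, totevents, totmiles, riskfactor)
```

Driver A3 has only 'normal' events, so it drops out at the join, just as a driver with no non-normal events would be absent from the riskfactor table.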


6 REPLIES

avatar

Can you please share the error? If you scroll further down the log, you will find the exact error. Also check whether you are getting the same exception as the one mentioned here: https://community.hortonworks.com/questions/59172/errors-on-lab-3.html

Explorer

I'm new to Hortonworks. Following the instructions, I worked through Lab 3 (the Pig script). The copied script completed with an error:

2017-03-09 12:14:30,576 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0:

As a result, the riskfactor table is empty.

Any suggestions?

Explorer

I followed Wael Horchani's script, recreating the table and then running the Pig script, and that fixed it. Thanks.

Expert Contributor
@milind pandit

My problem is not related to data types. Please find the entire log file attached: job-1482423183850-0021-logs.txt

avatar

If your script is executing correctly, you can safely ignore these warnings.
