Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Pig Error in Lab 3: Pig - Risk Factor

Expert Contributor

Hi,

I'm using Ambari 2.4.1.0 and HDP 2.5.

I'm trying to execute the first instruction of the lab: a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();

I added the -useHCatalog argument so that Pig can find the HCatLoader class.

I get the following log. Can anyone help me fix this? Thanks.

 ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/hive.tar.gz/hive/lib/slf4j-api-*.jar: No such file or directory 
 ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/hive.tar.gz/hive/hcatalog/lib/*hbase-storage-handler-*.jar: No such file or directory 
 WARNING: Use "yarn jar" to launch YARN applications. 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE 
 16/12/29 10:28:37 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType 
 2016-12-29 10:28:37,537 [main] INFO  org.apache.pig.Main - Apache Pig version 0.16.0.2.5.3.0-37 (rexported) compiled Nov 30 2016, 02:28:11 
 2016-12-29 10:28:37,537 [main] INFO  org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/admin/appcache/application_1482423183850_0022/container_1482423183850_0022_01_000002/pig_1483003717522.log 
 2016-12-29 10:28:38,970 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found 
 2016-12-29 10:28:39,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://vds002.databridge.tn:8020 
 2016-12-29 10:28:41,059 [main] INFO  org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-9b551f9a-3393-4ab2-93ea-de21982a11cc 
 2016-12-29 10:28:42,237 [main] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://vds002.databridge.tn:8188/ws/v1/timeline/ 
 2016-12-29 10:28:42,704 [main] INFO  org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook 
 2016-12-29 10:28:44,448 [main] WARN  org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist 
 2016-12-29 10:28:44,521 [main] INFO  hive.metastore - Trying to connect to metastore with URI thrift://vds002.databridge.tn:9083 
 2016-12-29 10:28:44,588 [main] INFO  hive.metastore - Connected to metastore. 
 2016-12-29 10:28:45,238 [main] INFO  org.apache.pig.Main - Pig script completed in 8 seconds and 278 milliseconds (8278 ms) 
1 ACCEPTED SOLUTION

Expert Contributor

There are some errors in the lab:

  • The Pig script must be as follows:

a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();
b = FILTER a BY event != 'normal';
c = FOREACH b GENERATE driverid, (int) '1' as occurance;
d = GROUP c BY driverid;
e = FOREACH d GENERATE group as driverid, SUM(c.occurance) as totevents;
g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();
h = join e by driverid, g by driverid;
final_data = foreach h generate $0 as driverid, $1 as totevents, $3 as totmiles, (float) $3/$1 as riskfactor;
store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();

  • The riskfactor table in Hive must be created as follows:

CREATE TABLE riskfactor (driverid string,totevents bigint,totmiles double,riskfactor float) STORED AS ORC;
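
As a rough illustration of what the Pig script above computes, here is a minimal Python sketch of the same dataflow: drop 'normal' events, count the remaining events per driver, join with the mileage table, and divide miles by events. The sample rows and the function name risk_factors are made up for illustration and are not lab data; the sketch also assumes drivermileage holds one row per driver.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the Hive tables (not real lab data).
geolocation = [
    {"driverid": "A1", "event": "normal"},
    {"driverid": "A1", "event": "overspeed"},
    {"driverid": "A1", "event": "lane departure"},
    {"driverid": "B2", "event": "unsafe following distance"},
    {"driverid": "C3", "event": "normal"},
]
drivermileage = [
    {"driverid": "A1", "totmiles": 1000.0},
    {"driverid": "B2", "totmiles": 500.0},
    {"driverid": "C3", "totmiles": 750.0},
]


def risk_factors(geolocation, drivermileage):
    """Mirror relations b..final_data of the Pig script on in-memory rows."""
    # b, c, d, e: keep non-normal events and count them per driver.
    totevents = defaultdict(int)
    for row in geolocation:
        if row["event"] != "normal":
            totevents[row["driverid"]] += 1

    # g: mileage lookup (assumes one row per driver).
    miles = {row["driverid"]: row["totmiles"] for row in drivermileage}

    # h, final_data: inner join on driverid, then totmiles / totevents.
    result = []
    for driverid in sorted(totevents):
        if driverid in miles:
            events = totevents[driverid]
            result.append((driverid, events, miles[driverid], miles[driverid] / events))
    return result


for row in risk_factors(geolocation, drivermileage):
    print(row)
```

Driver C3 has only 'normal' events, so it is dropped by the filter and never reaches the join. The four fields of each result row line up with the four columns of the riskfactor table (driverid, totevents, totmiles, riskfactor).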


6 REPLIES


Can you please share the error? If you scroll down further in the log, you will find the exact error. Also check whether you are getting the same exception mentioned here: https://community.hortonworks.com/questions/59172/errors-on-lab-3.html

New Member

I'm new to Hortonworks. Following the instructions, I practiced Lab 3 (the Pig script). The copied script finished with an error:

2017-03-09 12:14:30,576 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0:

As a result, the riskfactor table is empty.

Any suggestions?

New Member

I recreated the table and ran the Pig script as given in Wael Horchani's post, and that fixed it. Thanks.

Expert Contributor
@milind pandit

My problem is not related to data types. Please find the entire log file attached: job-1482423183850-0021-logs.txt


If your script is executing correctly, you can safely ignore these warnings.
