
I need help with the riskfactor pig script from the HDP 2.5 tutorial.


Hello,

I am stepping through this part of the HDP 2.5 tutorial:

https://github.com/hortonworks/tutorials/blob/hdp-2.5/tutorials/hortonworks/hello-hdp-an-introductio...

I have executed this statement in the Hive view in Ambari under maria_dev:

CREATE TABLE riskfactor (driverid string,events bigint,totmiles bigint,riskfactor float) STORED AS ORC;

I have checked that the table exists in the default database.

After executing the following pig script:

a = LOAD 'geolocation' using org.apache.hive.hcatalog.pig.HCatLoader();

b = filter a by event != 'normal';

c = foreach b generate driverid, event, (int) '1' as occurance;

d = group c by driverid;

e = foreach d generate group as driverid, SUM(c.occurance) as t_occ;

g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();

h = join e by driverid, g by driverid;

final_data = foreach h generate $0 as driverid, $1 as events, $3 as totmiles, (float) $3/$1 as riskfactor;

store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();

I get the following errors:

ls: cannot access /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/hive.tar.gz/hive/lib/slf4j-api-*.jar: No such file or directory
ls: cannot access /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/hive.tar.gz/hive/hcatalog/lib/*hbase-storage-handler-*.jar: No such file or directory
WARNING: Use "yarn jar" to launch YARN applications.
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : TEZ_LOCAL
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : TEZ
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Picked TEZ as the ExecType
2016-09-27 11:51:21,605 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0.2.5.0.0-1245 (rexported) compiled Aug 26 2016, 02:07:35
2016-09-27 11:51:21,605 [main] INFO org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log
2016-09-27 11:51:23,260 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found
2016-09-27 11:51:23,453 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox.hortonworks.com:8020
2016-09-27 11:51:24,818 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-8ca435c7-920a-4f44-953e-454a42973ab8
2016-09-27 11:51:25,478 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2016-09-27 11:51:25,671 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook
2016-09-27 11:51:27,037 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-09-27 11:51:27,107 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-09-27 11:51:27,170 [main] INFO hive.metastore - Connected to metastore.
2016-09-27 11:51:27,904 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-09-27 11:51:27,906 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-09-27 11:51:27,909 [main] INFO hive.metastore - Connected to metastore.
2016-09-27 11:51:28,140 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_FLOAT 1 time(s).
2016-09-27 11:51:28,237 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-09-27 11:51:28,317 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-09-27 11:51:28,325 [main] INFO hive.metastore - Connected to metastore.
2016-09-27 11:51:28,723 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: <file script.pig, line 9, column 0> Output Location Validation Failed for: 'riskfactor More info to follow:
Pig 'double' type in column 2(0-based) cannot map to HCat 'BIGINT'type. Target filed must be of HCat type {DOUBLE}
Details at logfile: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log
2016-09-27 11:51:28,746 [main] INFO org.apache.pig.Main - Pig script completed in 7 seconds and 330 milliseconds (7330 ms)

When I executed the script for the very first time I did not see any errors, but the riskfactor table was still empty and should have been populated.

Is there somebody that can help?

1 ACCEPTED SOLUTION


@Robbert Naastepad

It looks like there is a data type mismatch according to the error:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: <file script.pig, line 9, column 0> Output Location Validation Failed for: 'riskfactor More info to follow:
Pig 'double' type in column 2(0-based) cannot map to HCat 'BIGINT'type. Target filed must be of HCat type {DOUBLE}
Details at logfile: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log
2016-09-27 11:51:28,746 [main] INFO org.apache.pig.Main - Pig script completed in 7 seconds and 330 milliseconds (7330 ms)

The log indicates that the script is attempting to store a Pig double into a target column that is declared as BIGINT. It says "in column 2(0-based)", so the problem is with totmiles, the third column of the riskfactor table.
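One way to confirm the mismatch is to compare the declared column types of the two tables in the Hive view. This is only a sketch, and it assumes both tables are in the default database, as described in the question:

```sql
-- Target table: column 2 (0-based) is totmiles, declared BIGINT in the question.
DESCRIBE riskfactor;

-- Source table read by the Pig script: if totmiles is listed as double here,
-- that is the type HCatStorer refuses to write into a BIGINT column.
DESCRIBE drivermileage;
```

If the types differ, either the target column or the value produced by the script has to change so that the two agree.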


10 REPLIES


Hi,

I am not sure exactly what I did, but it works. I dropped the riskfactor table and recreated it with these columns:

driverid STRING

events BIGINT

totmiles DOUBLE

riskfactor FLOAT

Then I ran the script from the shell instead of in Ambari, and it worked (even with some warnings):

pig -useHCatalog -f hdfs://sandbox.hortonworks.com/tmp/.pigscripts/riskfactoradmin-2016-11-28_06-42.pig
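The table recreation described in this reply can be sketched in HiveQL as follows. It assumes the same ORC storage as the original CREATE TABLE statement in the question. The reply lists only the column names and types, so the exact statement is an assumption:

```sql
-- Drop the table whose totmiles column was declared BIGINT.
DROP TABLE IF EXISTS riskfactor;

-- Recreate it with totmiles as DOUBLE so it matches the Pig 'double'
-- value that the script stores in column 2 (0-based).
CREATE TABLE riskfactor (
  driverid   STRING,
  events     BIGINT,
  totmiles   DOUBLE,
  riskfactor FLOAT
) STORED AS ORC;
```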

Hope it will help you.