Created 09-27-2016 12:54 PM
Hello,
I am stepping through this part of the HDP 2.5 tutorial:
I have executed this statement in the Hive view in Ambari under maria_dev:
CREATE TABLE riskfactor (driverid string,events bigint,totmiles bigint,riskfactor float) STORED AS ORC;
I have checked the table to be present in the default db and it is there.
After executing the following pig script:
a = LOAD 'geolocation' using org.apache.hive.hcatalog.pig.HCatLoader();
b = filter a by event != 'normal';
c = foreach b generate driverid, event, (int) '1' as occurance;
d = group c by driverid;
e = foreach d generate group as driverid, SUM(c.occurance) as t_occ;
g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();
h = join e by driverid, g by driverid;
final_data = foreach h generate $0 as driverid, $1 as events, $3 as totmiles, (float) $3/$1 as riskfactor;
store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();
I get the following errors:
ls: cannot access /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/hive.tar.gz/hive/lib/slf4j-api-*.jar: No such file or directory ls: cannot access /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/hive.tar.gz/hive/hcatalog/lib/*hbase-storage-handler-*.jar: No such file or directory WARNING: Use "yarn jar" to launch YARN applications. 16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL 16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE 16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : TEZ_LOCAL 16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : TEZ 16/09/27 11:51:21 INFO pig.ExecTypeProvider: Picked TEZ as the ExecType 2016-09-27 11:51:21,605 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0.2.5.0.0-1245 (rexported) compiled Aug 26 2016, 02:07:35 2016-09-27 11:51:21,605 [main] INFO org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log 2016-09-27 11:51:23,260 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found 2016-09-27 11:51:23,453 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox.hortonworks.com:8020 2016-09-27 11:51:24,818 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-8ca435c7-920a-4f44-953e-454a42973ab8 2016-09-27 11:51:25,478 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/ 2016-09-27 11:51:25,671 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook 2016-09-27 11:51:27,037 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist 2016-09-27 11:51:27,107 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083 2016-09-27 11:51:27,170 [main] INFO hive.metastore - Connected to metastore. 2016-09-27 11:51:27,904 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist 2016-09-27 11:51:27,906 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083 2016-09-27 11:51:27,909 [main] INFO hive.metastore - Connected to metastore. 2016-09-27 11:51:28,140 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_FLOAT 1 time(s). 2016-09-27 11:51:28,237 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist 2016-09-27 11:51:28,317 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083 2016-09-27 11:51:28,325 [main] INFO hive.metastore - Connected to metastore. 2016-09-27 11:51:28,723 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: <file script.pig, line 9, column 0> Output Location Validation Failed for: 'riskfactor More info to follow: Pig 'double' type in column 2(0-based) cannot map to HCat 'BIGINT'type. Target filed must be of HCat type {DOUBLE} Details at logfile: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log 2016-09-27 11:51:28,746 [main] INFO org.apache.pig.Main - Pig script completed in 7 seconds and 330 milliseconds (7330 ms)
When I executed the script for the very first time I did not see any errors, but the riskfactor table was still empty and should have been populated.
Is there somebody that can help?
Created 09-27-2016 02:34 PM
It looks like there is a data type mismatch according to the error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: Output Location Validation Failed for: 'riskfactor More info to follow: Pig 'double' type in column 2(0-based) cannot map to HCat 'BIGINT'type. Target filed must be of HCat type {DOUBLE} Details at logfile: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log 2016-09-27 11:51:28,746 [main] INFO org.apache.pig.Main - Pig script completed in 7 seconds and 330 milliseconds (7330 ms
The log indicates that it's attempting to store a DOUBLE into a target column that should be a BIGINT. It saying "in column 2(0-based)", so the problem is with totmiles.
Created 11-29-2016 05:53 PM
Hi,
I am not sure of what I did but it works: I dropped table RISKFACTOR in order to create it like this
driverid STRING
events BIGINT
totmiles DOUBLE
riskfactor FLOAT
then I run the script in the shell not in AMBARI and it worked (even with some warnings..)
pig -useHCatalog -f hdfs://sandbox.hortonworks.com/tmp/.pigscripts/riskfactoradmin-2016-11-28_06-42.pig
Hope it will help you.