Created 03-07-2017 01:20 PM
This is my script, and I also have a table called riskfactor in Hive.

a = LOAD 'geolocation' using org.apache.hive.hcatalog.pig.HCatLoader();
b = filter a by event != 'normal';
c = foreach b generate driverid, event, (int) '1' as occurance;
d = group c by driverid;
e = foreach d generate group as driverid, SUM(c.occurance) as t_occ;
g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();
h = join e by driverid, g by driverid;
final_data = foreach h generate $0 as driverid, $1 as events, $3 as totmiles, (float) $3/$1 as riskfactor;
store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer();

When I execute this Pig script, it shows me that error and does not save any data. I followed everything step by step as it says here :
Created 03-07-2017 02:30 PM
Hello @voca voca,
I ran into the same problem, but realized that the totmiles column in the Hive table should be a DOUBLE column and not an INT as described in the tutorial.
So if you take the block of code below and rerun it in Hive View, it should work for you.
drop table riskfactor;
CREATE TABLE riskfactor (driverid string, events bigint, totmiles double, riskfactor float) STORED AS ORC;
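If you want to check the types yourself before recreating the table, here is a rough sketch that prints the schema Pig would hand to HCatStorer. It reuses the statements from the question and swaps the final store for Pig's describe operator. It assumes the geolocation and drivermileage tables exist as in the tutorial and that Pig is started with HCatalog support; the types you see will depend on how your own Hive tables are defined.

```pig
-- Same pipeline as in the question, up to final_data.
a = LOAD 'geolocation' using org.apache.hive.hcatalog.pig.HCatLoader();
b = filter a by event != 'normal';
c = foreach b generate driverid, event, (int) '1' as occurance;
d = group c by driverid;
e = foreach d generate group as driverid, SUM(c.occurance) as t_occ;
g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();
h = join e by driverid, g by driverid;
final_data = foreach h generate $0 as driverid, $1 as events, $3 as totmiles, (float) $3/$1 as riskfactor;

-- Print the field names and Pig types of final_data instead of storing it.
-- Compare them with the riskfactor table columns in Hive
-- (Hive string ~ Pig chararray, bigint ~ long, double ~ double, float ~ float).
describe final_data;
```

If the type shown for totmiles does not correspond to the type of the totmiles column in the Hive table, that mismatch is the likely reason the store fails.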
Created 03-07-2017 02:32 PM
Once you recreate the Hive table, try to rerun your Pig script. When you run the Pig script, don't forget to add the argument -useHCatalog.
Created 03-07-2017 02:57 PM
Such a compiler!