I'm trying to load data from Pig into Hive using HCatalog. It's just for training...
I have the following data:
year name salary
2015 Marc 100
2016 Marc 200
2017 Marc 300
2015 Lucy 100
2016 Lucy 200
2017 Lucy 300
2015 John 100
2016 John 200
2017 John 300
I created a table in Hive:
create table salary ( year int, name string, salary int );
and the following script in pig:
a = LOAD '/user/horton/salary';
b = FOREACH a GENERATE $0 AS year:int, $1 AS name:chararray, $2 AS salary:int;
STORE b INTO 'salary' USING org.apache.hive.hcatalog.pig.HCatStorer();
but when I run it with pig -useHCatalog, I get an error:
org.apache.pig.data.DataByteArray cannot be cast to java.lang.Integer
Any suggestions would be appreciated.
Hey Mauro,
You seem to have a data type compatibility issue.
Which versions of Hive and Pig are you using?
Also, make sure the FOREACH statement in your Pig script matches the Hive DDL schema. If you don't provide a schema in the LOAD statement, every field is treated as a bytearray.
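One way to handle that is to cast each field explicitly in the FOREACH. This is just a sketch using the same path and field names as your script:

```pig
-- Loaded without a schema, so every field starts out as a bytearray
a = LOAD '/user/horton/salary';
-- (int)/(chararray) perform the actual cast; AS only assigns a name
b = FOREACH a GENERATE (int)$0 AS year, (chararray)$1 AS name, (int)$2 AS salary;
```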
OK, when you load your file in Pig, could you use PigStorage with the proper field separator? Something similar to the following:
A = LOAD 'myfile.txt' USING PigStorage('\t') AS (f1,f2,f3);
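Putting the pieces together, the whole script might look like this. It's a sketch that assumes your input file is tab-separated and the Hive table salary already exists:

```pig
-- Declare the schema at load time so fields are typed instead of bytearray
a = LOAD '/user/horton/salary' USING PigStorage('\t')
    AS (year:int, name:chararray, salary:int);
-- Write into the existing Hive table through HCatalog
STORE a INTO 'salary' USING org.apache.hive.hcatalog.pig.HCatStorer();
```

Then run it with pig -useHCatalog as before.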