Load Hive Table from Pig Output File

I am using the .NET SDK to submit Hive and Pig jobs in Azure HDInsight. After the Pig job completes successfully, I load its output file into a Hive table. The Hive job also succeeds, but the Pig output is in bag format, so the data lands in the Hive table with "(" and ")" characters: the first and last columns of the table contain them. I want to remove these characters from the Hive table. Should I change the Pig output format, or is there a way to remove these characters in the Hive table?

Please suggest a solution.

Thank You.

1 ACCEPTED SOLUTION

You can use the FLATTEN operator to un-nest the bag, which removes the extra characters: http://pig.apache.org/docs/r0.16.0/basic.html#flatten. Before you write the output file with Pig, apply FLATTEN to the bag, then load the resulting file into a Hive table.

grunt> cat empty.bag
{}      1
grunt> A = LOAD 'empty.bag' AS (b : bag{}, i : int);
grunt> B = FOREACH A GENERATE flatten(b), i;
grunt> DUMP B;
grunt>
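
For a case closer to the question, here is a minimal sketch of flattening a grouped relation before storing it. The input path, field names, schema, and output directory are all hypothetical, and the comma delimiter is an assumption; the Hive table would need to declare the same field delimiter.

```pig
-- Hypothetical input: comma-separated lines of id,name,amount.
raw = LOAD 'input_data' USING PigStorage(',')
      AS (id:int, name:chararray, amount:double);

-- GROUP yields one record per id: (group, {bag of matching raw tuples}).
grouped = GROUP raw BY id;

-- Storing 'grouped' as-is would write the bag literally, with its
-- braces and parentheses, which is what ends up in the Hive columns.

-- FLATTEN un-nests the bag: each inner tuple becomes its own record
-- with plain top-level fields (id, name, amount).
flat = FOREACH grouped GENERATE FLATTEN(raw);

-- Write plain delimited text with no bag or tuple markers.
STORE flat INTO 'pig_output' USING PigStorage(',');
```

FLATTEN also works on a tuple field, which may be the actual culprit when only "(" and ")" show up in the first and last columns.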

2 REPLIES

Hi @Ishvari Dhimmar

As Ervits said, first flatten the Pig output to remove any bags, then use the resulting file to load the Hive table.

To load it into a Hive table, use "LOAD DATA INPATH '<pig output file>' INTO TABLE <hive table name>".

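Make sure the format of the Pig output file is compatible with the storage format of the Hive table, for example a delimited text file whose field delimiter matches the one declared for the table. If it is, the load should give you the result you are looking for.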