Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Data is not coming in proper order while storing output results from pig into hive table

New Member

I loaded data from two CSV sheets into Pig and applied a UNION operation to them. The command and its output in Pig are shown in command-in-pig.png and pig-output.png. However, when I store this output from Pig into a Hive table that I have already created, the data does not come out in the correct order. The result in the Hive table is shown in hive-output.png.

1 ACCEPTED SOLUTION


By order, do you mean the sequence of rows in the Pig output versus the Hive output?

If yes, then they are not guaranteed to match.

Pig writes its output to files in HDFS using MR / Tez.

These files are mapped to a Hive table.
When you run a query such as select * from Table in Hive, Hive spawns a map-only job to read the files. Which mapper completes first and displays its output is not deterministic.
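
To make the point concrete, here is a minimal Python sketch of the behaviour described above. It is only a simulation, not Pig or Hive code: the part-file layout, the `row_id` column, and the function names `select_star` and `select_star_ordered` are all assumptions made for illustration. It shows that concatenating splits in whatever order their readers finish gives an unstable row sequence, while sorting on an explicit key gives a stable one.

```python
import random

# Hypothetical rows, each tagged with an explicit sequence number (row_id).
# This column is an assumption for illustration; it is not in the original data.
rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]

# Pretend the stored output landed in three part files (one per split).
splits = [rows[0:2], rows[2:4], rows[4:6]]


def select_star(splits, rng):
    """Mimic a read with no ordering clause: splits are emitted in
    whatever order their readers happen to finish."""
    finish_order = list(range(len(splits)))
    rng.shuffle(finish_order)
    return [row for i in finish_order for row in splits[i]]


def select_star_ordered(splits, rng):
    """Mimic the same read followed by a sort on the explicit row_id key."""
    return sorted(select_star(splits, rng), key=lambda row: row[0])


if __name__ == "__main__":
    rng = random.Random()
    print(select_star(splits, rng))          # row order may differ between runs
    print(select_star_ordered(splits, rng))  # always sorted by row_id
```

In practice this suggests carrying an explicit ordering column in the data and sorting on it at query time (for example with ORDER BY in Hive). Pig's RANK operator is one possible way to generate such a column, though whether it fits this particular job is an assumption.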


2 REPLIES

New Member

Thank you @kgautam for this information. It would be helpful if you could suggest another way to get the data into the Hive table in the correct row sequence through Pig.