Data is not in the proper order when storing Pig output results into a Hive table
Labels: Apache HCatalog, Apache Hive, Apache Pig
Created ‎10-30-2017 02:35 PM
I loaded data from two CSV files into Pig and applied a UNION operation to them. The command and its output in Pig are shown in command-in-pig.png and pig-output.png. However, when I store this output from Pig into a Hive table that I have already created, the rows do not come out in the correct order. The resulting contents of the Hive table are shown in hive-output.png.
Created ‎10-30-2017 04:14 PM
By order, do you mean the sequence of rows in the Pig output versus the Hive output?
If yes, then that will never match.
Pig writes into a file in HDFS using MR/Tez.
This file is mapped to a Hive table.
When you run a query in Hive such as select * from Table, Hive spawns a map-only job to read the file. Which mapper completes first and displays its output is not deterministic.
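To make the point above concrete, here is a small Python sketch (not Pig or Hive code) that simulates several mappers reading splits of one file and finishing in an arbitrary order. The rows, the split count, and the function names are all hypothetical, and the first field is only assumed to be a usable ordering key, since the actual schema is not shown in the post. It suggests why the row sequence can differ between reads, and why an explicit sort on a key is what makes the order repeatable.

```python
import random

# Hypothetical rows standing in for the Pig output; the first field is an
# assumed ordering key (the real schema is not shown in the thread).
rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]


def split_rows(data, n_splits):
    """Divide rows into contiguous splits, one per simulated mapper."""
    size = -(-len(data) // n_splits)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def read_unordered(data, n_splits, rng):
    """Return rows in the order the simulated mappers happen to finish."""
    splits = split_rows(data, n_splits)
    rng.shuffle(splits)  # completion order of the mappers is arbitrary
    return [row for split in splits for row in split]


def read_ordered(data, n_splits, rng):
    """Same read, followed by an explicit sort on the assumed key."""
    return sorted(read_unordered(data, n_splits, rng), key=lambda r: r[0])


rng = random.Random()
print(read_unordered(rows, 3, rng))  # row order may differ between runs
print(read_ordered(rows, 3, rng))    # always sorted by the key
```

In Hive, the step that plays the role of the explicit sort would be an ORDER BY clause on a suitable column in the query that reads the table. This assumes the data contains such a column.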
Created ‎10-31-2017 04:59 AM
Thank you @kgautam for this information. It would be helpful if you could suggest another way to get the data into the Hive table in the correct row sequence through Pig.
