Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

I am trying to generate a sequence number for each record in a file. I used RANK, but it fails for files larger than 10 GB. Here is the code:

temp = LOAD 'abc.txt' using PigStorage(';','-tagFile');
test = RANK temp;
DUMP test;

1 ACCEPTED SOLUTION


@Koti P

I don't see a problem with your code. I was able to run it on the HDP 2.4 Sandbox:

temp = LOAD 'abc.txt' using PigStorage(';','-tagFile');
test = RANK temp;
DUMP test;

My abc.txt looks like this:

David,1,N
Tete,2,N
Ranjit,3,M
Ranjit,3,P
David,4,Q
David,4,Q
Jillian,8,Q
JaePak,7,Q
Michael,8,T
Jillian,8,Q
Jose,10,V

And my output looks like this:

(1,abc.txt,David,1,N)
(2,abc.txt,Tete,2,N)
(3,abc.txt,Ranjit,3,M)
(4,abc.txt,Ranjit,3,P)
(5,abc.txt,David,4,Q)
(6,abc.txt,David,4,Q)
(7,abc.txt,Jillian,8,Q)
(8,abc.txt,JaePak,7,Q)
(9,abc.txt,Michael,8,T)
(10,abc.txt,Jillian,8,Q)
(11,abc.txt,Jose,10,V)

I used Tez as the execution engine:

pig -x tez
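For a file as large as the one in the question, printing every record to the console with DUMP may not be practical. A minimal sketch that writes the ranked records to a directory instead is shown below. The output directory name 'abc_ranked' is hypothetical and must not already exist. This is only an illustration of the same RANK approach, not a tested fix for the failure on 10 GB files.

```pig
-- Load the file; -tagFile prepends the source file name to each record.
temp = LOAD 'abc.txt' USING PigStorage(';', '-tagFile');

-- RANK without a BY clause prepends a sequential number to each tuple.
test = RANK temp;

-- Write the result to a directory instead of printing it.
-- 'abc_ranked' is a hypothetical output directory.
STORE test INTO 'abc_ranked' USING PigStorage(';');
```

Note that the sample abc.txt above is comma-separated, so with ';' as the delimiter each line is loaded as a single field. Passing ',' to PigStorage would split the line into separate fields.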


2 REPLIES

Thanks for the answer. I was out of town and could not reply sooner. I have tested it and it works when run with the Tez engine. Is there any way to run this from an Oozie workflow? I have not been able to test it with Oozie because the job runs in MapReduce mode there.