I want to generate a sequence number for each block of records (for example, A, B, C form one block). The input size is around 50 GB.
I'm not able to produce the output shown above with parallel processing: because the input is split into multiple part files,
each part is processed independently and we do not get the correct result.
Is there a way to generate the KEY
so that it increments whenever an A record arrives?
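
To make the requirement concrete, here is a minimal sketch in plain Python of one possible two-pass approach. It is not tied to any particular framework. It assumes that the part files have a known order, that records keep their order inside each part, and that a record's type can be read directly (here each record is just the string "A", "B" or "C"). The function names and the sample data are hypothetical. Pass 1 counts the A records in each part, a small sequential step turns those counts into a starting offset per part, and pass 2 assigns keys within each part independently, so both passes could run in parallel.

```python
def count_block_starts(records, start_marker="A"):
    """Pass 1: count how many blocks start in one part (one count per part)."""
    return sum(1 for record in records if record == start_marker)


def split_offsets(counts):
    """Exclusive prefix sum: number of block starts before each part."""
    offsets = []
    total = 0
    for count in counts:
        offsets.append(total)
        total += count
    return offsets


def assign_keys(records, offset, start_marker="A"):
    """Pass 2: key each record, incrementing the key on every start marker.

    Records that appear before the first start marker in a part keep the
    offset itself as their key, i.e. they continue the last block of the
    previous part.
    """
    key = offset
    keyed = []
    for record in records:
        if record == start_marker:
            key += 1
        keyed.append((key, record))
    return keyed


# Hypothetical sample: three ordered parts; a block spans parts 0 and 1.
parts = [
    ["A", "B", "C", "A", "B"],
    ["C", "A", "B", "C"],
    ["A", "B", "C"],
]

counts = [count_block_starts(part) for part in parts]
offsets = split_offsets(counts)
keyed_parts = [assign_keys(part, off) for part, off in zip(parts, offsets)]

for keyed in keyed_parts:
    print(keyed)
```

Only the per-part counts have to be collected in one place, which is tiny compared with the 50 GB input. If the parts have no reliable order, this sketch does not apply as written.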