Created 07-11-2016 06:54 AM
I tried using Pig Latin to parse an LDIF file, but I am not able to parse it. Is there an LDIF parser in the piggybank library? I wrote a regex, but with it all my data ends up in the $0 column, so I cannot filter out the information I need.
Please help. Is Pig the only way to parse it, or are there other possibilities? Thank you.
Could you please share more details, such as your code and how you are using it?
The commands that I am using are:
A = LOAD '/local/home/kpi_dev/pig/Pig files to be parsed/LDAP/root.ldif_05240000' as (line:chararray);
B = foreach A generate REGEX_EXTRACT_ALL('$0','(.*):(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*)');
Another one that I tried is:
C = foreach A generate REGEX_EXTRACT_ALL('$0,'="ou=[^,]');
Try the statement below and modify it to fit your data. With my sample file I can see the data populate properly.
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL($0, '(.*):(.*)=(.*),(.*)')) AS (id:chararray, name:chararray, nameid:chararray);
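For anyone adapting this, here is a rough sketch of one way to split LDIF lines into attribute/value pairs and pull the ou out of each dn. REGEX_EXTRACT_ALL returns a single tuple, so without FLATTEN the whole result sits in $0, which may be why everything landed in one column. The path, the relation names, and the field names attr_name, attr_value, dn and ou are placeholders, not taken from the original file. The sketch assumes one "attribute: value" pair per line and does not handle folded continuation lines or base64 values (attr:: value).

```pig
-- Load each LDIF line as one chararray (placeholder path; adjust to your file).
A = LOAD '/path/to/sample.ldif' USING TextLoader() AS (line:chararray);

-- Keep only lines that look like "attribute: value"; this drops blank
-- record separators and comment lines that start with '#'.
B = FILTER A BY line MATCHES '^[A-Za-z][A-Za-z0-9;-]*:.*';

-- Split at the first colon. FLATTEN turns the tuple returned by
-- REGEX_EXTRACT_ALL into separate columns instead of leaving it in $0.
C = FOREACH B GENERATE
        FLATTEN(REGEX_EXTRACT_ALL(line, '^([^:]+):[ ]*(.*)$'))
        AS (attr_name:chararray, attr_value:chararray);

-- From the dn lines, extract the first ou component (null if there is none).
D = FILTER C BY attr_name == 'dn';
E = FOREACH D GENERATE attr_value AS dn,
        REGEX_EXTRACT(attr_value, 'ou=([^,]+)', 1) AS ou;

DUMP E;
```

Using [^:]+ and [^,]+ instead of .* keeps each group from greedily swallowing the rest of the line, which makes the split less sensitive to how many commas a dn contains.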