Created 03-01-2016 05:46 PM
Hi,
Is it possible to split www.amazon.es in Pig? I am doing this, but it doesn't work:
orders4 = FOREACH orders3 GENERATE $0 as freq, STRSPLIT($1,'.') as word;
It just prints $0. Any suggestions?
Created 03-01-2016 05:49 PM
@Roberto Sancho can you clarify: are you trying to split the URL "www.amazon.es" by "."?
Created 03-01-2016 05:52 PM
Hi,
Yes. With '.' it doesn't work, but it works with '-'.
Created 03-01-2016 06:12 PM
take a look at this example, it's not exactly what you want but shows a working example.
Created 03-01-2016 06:19 PM
@Roberto Sancho since the delimiter argument is a regular expression, '.' is interpreted as the regex wildcard (any character) rather than as a literal separator, so it won't work as written. I just tried '#' with the example in my link and it worked, but with a dot it doesn't.
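To illustrate the point above, here is a minimal Java sketch. It assumes that Pig's STRSPLIT passes its pattern to Java's regex engine, which is my understanding but not something confirmed in this thread. The class name DotSplitDemo and the helper splitOn are made up for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Pig.
class DotSplitDemo {
    // Splits the input on a regular expression using plain String.split.
    static String[] splitOn(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String url = "www.amazon.es";

        // Unescaped '.' matches every character, so every piece is empty
        // and String.split discards the trailing empty strings.
        System.out.println(Arrays.toString(splitOn(url, ".")));                 // []

        // Escaping the dot makes it a literal separator.
        System.out.println(Arrays.toString(splitOn(url, "\\.")));               // [www, amazon, es]

        // Pattern.quote escapes any delimiter without hand-written backslashes.
        System.out.println(Arrays.toString(splitOn(url, Pattern.quote("."))));  // [www, amazon, es]

        // '-' has no special meaning outside a character class, so it works unescaped.
        System.out.println(Arrays.toString(splitOn("www-amazon-es", "-")));     // [www, amazon, es]
    }
}
```

This would explain why '-' and '#' behave as plain separators while '.' appears to return nothing.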
Created 03-01-2016 06:39 PM
@Roberto Sancho I got it, I followed advice from http://stackoverflow.com/questions/24981431/strsplit-in-pig-functions
grunt> a = load 'test2' using PigStorage() as (str:chararray);
grunt> b = foreach a generate STRSPLIT($0,'\\u002E') as word;
grunt> dump b;
((www,amazon,es))
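For anyone wondering why the Unicode form works: in Java regex syntax, a backslash followed by u002E denotes the character U+002E, which is the literal dot, so it never acts as the wildcard. The sketch below shows this in plain Java, on the assumption that Pig hands the pattern to the same regex engine. UnicodeDotSplitDemo and splitOnDot are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of Pig.
class UnicodeDotSplitDemo {
    // The Java literal below is the six-character regex: backslash, 'u', '0', '0', '2', 'E'.
    // The regex engine reads that as the single literal character U+002E ('.').
    static String[] splitOnDot(String input) {
        return input.split("\\u002E");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDot("www.amazon.es"))); // [www, amazon, es]
    }
}
```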
Created 03-01-2016 08:05 PM
Hi,
The last solution worked fine, but this also works:
orders4 = FOREACH orders3 GENERATE $0 as freq, (chararray) ((word matches '.*[.].*') ? SUBSTRING(word,INDEXOF(word,'.',0)+1,LAST_INDEX_OF(word,'.')) : $1) as word;
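Note that this expression does something different from STRSPLIT: it keeps only the text between the first and the last dot. A rough Java equivalent of that logic is sketched below. MiddleLabelDemo and betweenDots are hypothetical names, and returning the input unchanged when there are fewer than two dots is my own choice for the sketch, not something taken from the Pig expression.

```java
// Hypothetical demo class; not part of Pig.
class MiddleLabelDemo {
    // Returns the text between the first and the last dot.
    // With fewer than two dots the input is returned unchanged (assumption).
    static String betweenDots(String word) {
        int first = word.indexOf('.');
        int last = word.lastIndexOf('.');
        if (first < 0 || first == last) {
            return word;
        }
        return word.substring(first + 1, last);
    }

    public static void main(String[] args) {
        System.out.println(betweenDots("www.amazon.es"));    // amazon
        System.out.println(betweenDots("www.amazon.co.uk")); // amazon.co
        System.out.println(betweenDots("localhost"));        // localhost
    }
}
```

So for hosts with more than two dots, the result still contains a dot, which may or may not be what you want.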