Created 07-27-2016 04:35 PM
Hi experts, I have a dataset with 4 columns and want to check whether column B contains only numbers. If the job detects a non-numeric value, I want to replace that value with null. Can I do this in Pig, or do I need Python embedded in Pig? Many thanks!
Created 07-27-2016 07:13 PM
You can use org.apache.pig.piggybank.evaluation.IsNumeric
Something like:
X = foreach Y generate ((org.apache.pig.piggybank.evaluation.IsNumeric($1)==true)?(int)$1:null);
I have applied generate to only one column; you can add the rest of the columns.
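For context, here is a fuller sketch of how that one-liner might sit in a complete script. The jar path, the input file name, the delimiter and the field names a, b, c, d are all assumptions made for illustration; only the IsNumeric UDF and the bincond expression come from the answer above. Column B is declared as chararray because, as far as I know, IsNumeric expects a string argument.

```pig
-- Assumption: piggybank.jar lives at this hypothetical path; adjust for your cluster.
REGISTER /path/to/piggybank.jar;

-- Short alias so the fully qualified class name is not repeated.
DEFINE IsNumeric org.apache.pig.piggybank.evaluation.IsNumeric();

-- Assumption: hypothetical comma-separated input with four columns.
-- Column b is loaded as chararray so it can be passed to IsNumeric.
Y = LOAD 'input.csv' USING PigStorage(',')
    AS (a:chararray, b:chararray, c:chararray, d:chararray);

-- Keep a, c and d unchanged; replace non-numeric values of b with null.
X = FOREACH Y GENERATE
        a,
        (IsNumeric(b) ? (int) b : null) AS b,
        c,
        d;

DUMP X;
```

This stays in plain Pig Latin, so embedding Python should not be needed for this check.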
Created 07-27-2016 10:13 PM
Brilliant 🙂 Only one more question: how can I add a case statement (or an if) to my X var?
Created 07-28-2016 01:26 PM
You can apply a filter or bincond operator on any column(s) of your relation X. You can get more details on available operators here
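As a rough illustration of those options, the sketch below shows a FILTER, a nested bincond, and a CASE expression (which, to my knowledge, is available in Pig 0.12 and later). The input file, the field names and the labels 'invalid', 'negative' and 'ok' are hypothetical and only serve the example.

```pig
-- Assumption: hypothetical input where column b has already been cleaned to int or null.
X = LOAD 'cleaned.csv' USING PigStorage(',')
    AS (a:chararray, b:int, c:chararray, d:chararray);

-- Option 1: FILTER keeps only the rows that satisfy a condition.
valid_rows = FILTER X BY b IS NOT NULL;

-- Option 2: nested bincond acts as an if / else-if / else on a column.
labelled_bincond = FOREACH X GENERATE
        a,
        b,
        (b IS NULL ? 'invalid' : (b < 0 ? 'negative' : 'ok')) AS b_status;

-- Option 3: CASE expression (assumes Pig 0.12 or later).
labelled_case = FOREACH X GENERATE
        a,
        b,
        (CASE
            WHEN b IS NULL THEN 'invalid'
            WHEN b < 0     THEN 'negative'
            ELSE 'ok'
         END) AS b_status;

DUMP labelled_case;
```

Options 2 and 3 should produce the same labels; CASE is usually easier to read once there are more than two branches.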