Apache PIG - Guarantee that all the value in a column are Numeric Values

Contributor

Hi experts, I have a dataset with 4 columns and want to check whether column B contains only numbers. If the job detects a non-numeric value, I want to replace that value with null. Can I do this in Pig, or do I have to embed Pig in Python? Many thanks!

1 ACCEPTED SOLUTION

Super Collaborator

You can use the Piggybank UDF org.apache.pig.piggybank.evaluation.IsNumeric.

Something like:

X = foreach Y generate ((org.apache.pig.piggybank.evaluation.IsNumeric($1) == true) ? (int)$1 : null);

I have applied generate to one column only; you can add the rest of your columns to the same statement.
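
For reference, a fuller sketch of the same idea with all four columns might look like the script below. The jar path, the input file name, the comma delimiter, and the column names a, b, c, d are all assumptions made for illustration; only the IsNumeric class name comes from the answer above. IsNumeric expects a string argument, so the sketch loads every column as chararray.

```pig
-- Assumed location of the piggybank jar; adjust for your installation.
REGISTER /usr/lib/pig/piggybank.jar;

-- Short alias so the full class name is not repeated.
DEFINE IsNumeric org.apache.pig.piggybank.evaluation.IsNumeric();

-- Hypothetical input: four comma-separated columns, loaded as strings.
Y = LOAD 'input.csv' USING PigStorage(',')
    AS (a:chararray, b:chararray, c:chararray, d:chararray);

-- Pass a, c and d through unchanged; cast b to int when it is numeric,
-- otherwise emit null.
X = FOREACH Y GENERATE
        a,
        (IsNumeric(b) ? (int)b : null) AS b,
        c,
        d;

DUMP X;
```

Using the field name b instead of the positional $1 is only a readability choice; both refer to the second column here.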


3 REPLIES

Contributor

Brilliant 🙂 Just one more question: how can I add a CASE statement (or an IF) to my X relation?

Super Collaborator

You can apply a FILTER or the bincond operator (condition ? value_if_true : value_if_false) to any column(s) of your relation X. You can get more details on the available operators here
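
As a rough illustration of those options, the sketch below assumes X has an int field named b that is null wherever the original value was not numeric. The field names a and b, the label strings, and the relation names Labeled, Labeled2 and Valid are made up for the example. The CASE expression should only be available in Pig 0.12 and later; on older versions the nested bincond is the usual substitute.

```pig
-- Bincond: nested ternaries behave like an if / else-if / else chain.
Labeled = FOREACH X GENERATE a, b,
    (b IS NULL ? 'invalid' : (b < 0 ? 'negative' : 'non-negative')) AS label;

-- CASE expression (Pig 0.12 or later), equivalent to the bincond above.
Labeled2 = FOREACH X GENERATE a, b,
    (CASE
        WHEN b IS NULL THEN 'invalid'
        WHEN b < 0     THEN 'negative'
        ELSE 'non-negative'
     END) AS label;

-- FILTER: keep only the rows where b passed the numeric check.
Valid = FILTER X BY b IS NOT NULL;
```

Bincond and CASE compute a new value for every row, while FILTER drops rows, so which one fits depends on whether the non-numeric rows should stay in the output.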