Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Apache Pig - Guarantee that all the values in a column are numeric

New Member

Hi experts, I have a dataset with 4 columns and want to check whether column B contains only numbers. If the job detects a non-numeric value, I want to replace that value with null. Can I do this in Pig alone, or do I need Python embedded in Pig? Many thanks!

1 ACCEPTED SOLUTION

Super Collaborator

You can use org.apache.pig.piggybank.evaluation.IsNumeric from Piggybank.

Something like:

X = foreach Y generate ((org.apache.pig.piggybank.evaluation.IsNumeric($1)==true)?(int)$1:null);
I have applied generate to only one column ($1, i.e. column B); you can add the rest of the columns.
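
For anyone who wants the full picture, here is a minimal sketch of how the one-liner above might sit in a complete script. The jar path, the input file name, the comma delimiter, and the column names a, b, c, d are all assumptions made for illustration; adjust them to your environment. It assumes piggybank.jar is available to register, since IsNumeric lives in Piggybank rather than in core Pig.

```pig
-- Assumed path: point this at the piggybank.jar that ships with your Pig install.
REGISTER piggybank.jar;

-- Short alias so the fully qualified class name is not repeated.
DEFINE IsNumeric org.apache.pig.piggybank.evaluation.IsNumeric();

-- Hypothetical input: four comma-separated columns, all loaded as text.
Y = LOAD 'input.csv' USING PigStorage(',')
    AS (a:chararray, b:chararray, c:chararray, d:chararray);

-- Keep a, c and d unchanged; cast b to int, or emit null when it is not numeric.
X = FOREACH Y GENERATE
    a,
    ((IsNumeric(b) == true) ? (int)b : null) AS b,
    c,
    d;

DUMP X;
```

Loading every column as chararray keeps the raw text intact, so the numeric check runs on exactly what is in the file.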

3 REPLIES

New Member

Brilliant 🙂 Just one more question: how can I add a CASE statement (or an IF) to my relation X?

Super Collaborator

You can apply a FILTER or the bincond operator (condition ? value_if_true : value_if_false) to any column(s) of your relation X. You can get more details on the available operators here.
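
As a rough illustration of both options, the sketch below uses a hypothetical relation X with an int column b that may be null; the file name, column names, and labels are invented for the example. The last statement uses the CASE expression, which assumes Pig 0.12 or later; on older versions the nested bincond does the same job.

```pig
-- Hypothetical relation X: b is an int column that may contain nulls.
X = LOAD 'cleaned.csv' USING PigStorage(',')
    AS (a:chararray, b:int, c:chararray, d:chararray);

-- FILTER: keep only the rows where b holds a number.
numeric_only = FILTER X BY b IS NOT NULL;

-- Nested bincond: behaves like an if / else-if / else chain.
labelled = FOREACH X GENERATE a, b,
    (b IS NULL ? 'missing' : (b < 0 ? 'negative' : 'non-negative')) AS label;

-- CASE expression (assumes Pig 0.12 or later); same logic as the bincond above.
labelled_case = FOREACH X GENERATE a, b,
    (CASE
        WHEN b IS NULL THEN 'missing'
        WHEN b < 0     THEN 'negative'
        ELSE 'non-negative'
     END) AS label;
```

FILTER drops whole rows, while bincond and CASE keep every row and only change the value of one generated column, so pick whichever matches what you want to happen to the non-numeric records.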