Apache PIG - Guarantee that all the value in a column are Numeric Values

Explorer

Hi experts, I have a dataset with 4 columns and want to check that column B contains only numbers. If the job detects a non-numeric value, I want to replace that value with null. Can I do this in Pig, or do I need Python embedded in Pig? Many thanks!

1 ACCEPTED SOLUTION

Re: Apache PIG - Guarantee that all the value in a column are Numeric Values

Super Collaborator

You can use the piggybank UDF org.apache.pig.piggybank.evaluation.IsNumeric.

Something like:

X = foreach Y generate ((org.apache.pig.piggybank.evaluation.IsNumeric($1)==true)?(int)$1:null);

I have applied generate to only one column ($1, your column B); you can add the rest of the columns to the same generate.
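
For reference, here is a rough sketch of how the one-liner above might look as a complete script covering all four columns. The jar path, the input file name, the comma delimiter, and the column names a, b, c, d are all placeholders rather than details from the question, and the sketch assumes column B is loaded as a chararray so that IsNumeric receives a string.

```pig
-- Placeholder path: point this at the piggybank.jar available on your cluster.
REGISTER /path/to/piggybank.jar;

-- Short alias so the fully qualified class name is not repeated.
DEFINE IsNumeric org.apache.pig.piggybank.evaluation.IsNumeric();

-- Hypothetical input: a comma-separated file with four columns, all read as text.
Y = LOAD 'input.csv' USING PigStorage(',')
    AS (a:chararray, b:chararray, c:chararray, d:chararray);

-- Pass a, c and d through unchanged.
-- For b: cast to int when IsNumeric accepts the value, otherwise emit null.
X = FOREACH Y GENERATE
        a,
        ((IsNumeric(b) == true) ? (int)b : null) AS b,
        c,
        d;

DUMP X;
```

Naming the fields in the LOAD schema is optional; with no schema, the positional form from the answer ($0, $1, $2, $3) can be used in the same way.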

3 REPLIES
Re: Apache PIG - Guarantee that all the value in a column are Numeric Values

Explorer

Brilliant :) Only one more question: how can I add a CASE statement (or an IF) to my relation X?

Re: Apache PIG - Guarantee that all the value in a column are Numeric Values

Super Collaborator

You can apply a FILTER or the bincond operator (condition ? value_if_true : value_if_false) to any column(s) of your relation X. You can get more details on the available operators here
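
As a rough illustration of both options, the sketch below assumes a relation X with an int column b that is null wherever the original value was not numeric. The file name, the column names, the labels, and the thresholds 100 and 1000 are invented for the example.

```pig
-- Hypothetical cleaned relation: b is an int, null where the input was not numeric.
X = LOAD 'cleaned.csv' USING PigStorage(',')
    AS (a:chararray, b:int, c:chararray, d:chararray);

-- FILTER works on whole rows: keep only the rows where b is present.
valid_rows = FILTER X BY b IS NOT NULL;

-- Bincond works like an inline if/else on a single column.
labelled = FOREACH X GENERATE a, ((b IS NULL) ? 'missing' : 'ok') AS b_status;

-- Nested binconds give an if / else-if / else chain.
sized = FOREACH valid_rows GENERATE
            a,
            b,
            ((b < 100) ? 'small' : ((b < 1000) ? 'medium' : 'large')) AS b_size;

DUMP sized;
```

If your Pig version is 0.12 or later, a CASE WHEN ... THEN ... ELSE ... END expression should be able to express the same chain more readably than nested binconds.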
