Support Questions

balavignesh_nag · ‎03-04-2017

I'm new to Pig scripting, so pardon me if my question is lame. I know that we can define a datatype for each atom in Pig while loading it from a file. But is there a way to define the datatype after taking a subset of the data?

Example:

data = load 'mydata.csv' using PigStorage(',') AS (col1:int, col2:int);

subsetdata = foreach data generate col1; -- Here I need to define col1 as int. Is there a way to specify it?

aervits · ‎03-04-2017

col1 is already an int, based on the schema in your load statement. You can check with:

describe data;

If you want to change the type in a generate, you can do so like this:

X = FOREACH A GENERATE c1 AS x1:int;
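
Putting the two points together for the data in the question, a minimal sketch might look like the following. The file name and column names come from the question; the alias as_long and the field name col1_long are hypothetical, and the explicit (long) cast is shown as one common way to change a type, not necessarily the only one.

```pig
-- Load with an explicit schema (file and columns as in the question).
data = LOAD 'mydata.csv' USING PigStorage(',') AS (col1:int, col2:int);

-- A plain projection keeps the type declared at load time.
subsetdata = FOREACH data GENERATE col1;
DESCRIBE subsetdata;  -- should report col1 as int

-- To change the type, cast explicitly and name the result.
-- as_long and col1_long are hypothetical names used for illustration.
as_long = FOREACH data GENERATE (long)col1 AS col1_long:long;
DESCRIBE as_long;     -- should report col1_long as long
```

Running DESCRIBE on each alias is a quick way to confirm what schema Pig has inferred before adding any casts.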

defining datatype in Pig