defining datatype in Pig
Labels: Apache Hadoop, Apache Pig
Created ‎03-04-2017 01:38 PM
I'm new to Pig scripting, so pardon me if my question is lame. I know that we can define a datatype for each field in Pig while loading it from a file. But is there a way to define the datatype after taking a subset of the data?
Example:
data = load 'mydata.csv' using PigStorage(',') AS (col1:int, col2:int);
subsetdata = foreach data generate col1; --> Here I need to define col1 as int. Is there a way to do that?
Created ‎03-04-2017 02:09 PM
col1 is already an int, based on the schema in your load statement. You can check with
describe data;
If you want to set the type in a generate, you can do so like this (A stands for any relation and c1 for one of its fields):
X = FOREACH A GENERATE c1 AS x1:int;
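Applied to the relation names from the question, the answer could be sketched roughly as follows. This is only an illustration, not a tested script: it assumes mydata.csv holds two comma-separated integer columns, and subset_long and col1_long are hypothetical names that do not appear in the thread. The second FOREACH uses an explicit cast, which is one common way to end up with a different type than the one declared at load time.

```pig
-- Load with an explicit schema, as in the question.
data = LOAD 'mydata.csv' USING PigStorage(',') AS (col1:int, col2:int);

-- Projecting a field keeps the type it was loaded with,
-- so DESCRIBE should still report col1 as int here.
subsetdata = FOREACH data GENERATE col1;
DESCRIBE subsetdata;

-- To get a different type, cast explicitly and name the result.
-- subset_long and col1_long are hypothetical names used only for illustration.
subset_long = FOREACH data GENERATE (long)col1 AS col1_long;
DESCRIBE subset_long;
```

Running DESCRIBE on each relation is a quick way to confirm which type Pig has assigned before building further steps on top of it.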
