pig nested for each clarification


Contributor

I am new to Pig and would really appreciate any input.

source file:

Exchange,Symbol,date,open,high,low,close,volume,adj_close
NASDAQ,JDAS,2010-01-29,26.91,27.53,26.02,26.21,883100,26.21
NASDAQ,JDAS,2010-01-28,29.86,27.97,26.84,26.88,1272600,26.88
NASDAQ,JDAS,2010-01-27,27.48,27.93,27.20,27.68,560100,27.68
ICICI,JDAS,2010-02-08,25.41,26.59,25.15,26.46,488900,26.46
ICICI,JDAS,2010-01-29,26.91,27.53,26.02,26.21,883100,26.21
ICICI,JDAS,2010-01-28,27.86,27.97,26.84,26.88,1272600,26.88
NASDAQ,JDAS,2010-01-29,26.91,27.53,26.02,26.21,883100,26.21
NASDAQ,JDAS,2010-01-28,27.86,27.97,26.84,26.88,1272600,26.88
NASDAQ,JDAS,2010-01-27,27.48,27.93,27.20,27.68,560100,27.68
NASDAQ,JDAS,2010-02-08,25.41,26.59,25.15,26.46,488900,26.46
NASDAQ,JDAS,2010-02-05,25.42,25.84,24.94,25.49,1121700,25.49
NASDAQ,JDAS,2010-02-04,26.53,26.61,25.46,25.46,574900,25.46
NASDAQ,JDAS,2009-12-31,25.97,26.13,25.47,25.47,283600,25.47
NASDAQ,JDAS,2009-12-30,25.74,26.25,25.61,26.05,236300,26.05
NASDAQ,JDAS,2009-12-29,25.98,25.98,25.52,25.76,238600,25.76
NASDAQ,JDAS,2009-11-30,23.39,23.65,22.78,23.48,522000,23.48
NASDAQ,JDAS,2009-11-27,23.12,23.71,23.10,23.54,144900,23.54
NASDAQ,JDAS,2009-11-25,23.96,24.00,23.59,23.82,220400,23.82
NASDAQ,JOEZ,2010-01-29,1.68,1.69,1.60,1.60,158900,1.60
NASDAQ,JOEZ,2010-01-28,1.64,1.70,1.61,1.62,250700,1.62
NASDAQ,JOEZ,2010-01-27,1.73,1.76,1.63,1.64,329200,1.64
NASDAQ,JOEZ,2010-01-26,1.70,1.76,1.66,1.70,509100,1.70
NASDAQ,JOEZ,2010-01-25,1.64,1.68,1.60,1.68,169600,1.68
NASDAQ,JOEZ,2010-02-08,1.80,2.04,1.76,1.93,1712200,1.93
NASDAQ,JOEZ,2010-02-05,1.84,1.88,1.70,1.80,1044700,1.80
NASDAQ,JOEZ,2010-02-04,1.96,1.97,1.74,1.88,3758600,1.88
NASDAQ,JOEZ,2010-02-03,1.73,1.79,1.68,1.72,1211700,1.72
NASDAQ,JOEZ,2010-02-02,1.59,1.72,1.51,1.70,909400,1.70
NASDAQ,JOEZ,2009-07-15,1.00,1.05,0.75,0.81,1215200,0.81
NASDAQ,JOEZ,2009-07-14,0.80,0.95,0.80,0.93,580000,0.93
NASDAQ,JOEZ,2009-07-13,0.80,0.83,0.75,0.79,148100,0.79
NASDAQ,JOEZ,2009-05-06,0.56,0.67,0.55,0.58,83800,0.58
NASDAQ,JOEZ,2009-05-05,0.63,0.63,0.58,0.58,68700,0.58
NASDAQ,JOEZ,2009-05-04,0.62,0.68,0.60,0.63,134400,0.63


x = LOAD '/home/prime23/source.txt' using PigStorage(',') As (exchange:chararray, symbol:chararray, date:chararray, open:double, high:double, low:double, close:double, volume:long, adj_close:double);


Query: for each symbol, get all distinct exchanges:

y = GROUP x by symbol;
z1 = foreach y {
     t = distinct x.exchange;
     generate group, t;
}

Clarifications:

1) There are two symbols here (JOEZ, JDAS), so the nested foreach will iterate twice. Please correct me if I am wrong.

2) How do I get the schema of the relation t? describe is not working.

3) The last statement is not clear:

The y relation contains only the fields (group, x). How can we select the field t, which is not present in y?


Re: pig nested for each clarification

Expert Contributor

Why not group by both symbol and exchange?

y = GROUP x BY (symbol, exchange);

z1 = FOREACH y GENERATE group.symbol, group.exchange;

Re: pig nested for each clarification

Contributor

Hi @Karthik Narayanan

Thanks for the input. Please see the clarifications in my original post; those are the points I am looking for input on.

Re: pig nested for each clarification

Expert Contributor

I don't understand what you are trying to do here. Are you trying to get a flat list of (symbol, exchange) pairs? For example:

JDAS,NASDAQ

JDAS,ICICI

JOEZ,NASDAQ

Your clarifications do not make clear what you are trying to achieve.

Re: pig nested for each clarification

Contributor

I want to get the distinct exchanges for each symbol.

I already have a Pig script for that, but I have some clarifications, which are listed in the original post.

Re: pig nested for each clarification

Contributor

Hi friends,

Any input on my clarifications would be appreciated, since I am a beginner with Hadoop and Pig.

Re: pig nested for each clarification

Expert Contributor

1) It will loop once for each distinct symbol in the data. In this case there are 2, so yes, 2 times.

2) Not sure why the describe fails. DISTINCT is a relational operator, so outside a nested foreach you would normally apply it to a whole relation, for example:

exchanges = foreach y generate x.exchange;

unique_exchanges = distinct exchanges;

Inside the curly braces of a nested foreach, Pig allows a small set of relational operators, including DISTINCT, to be applied to a bag such as x.exchange.

3) generate group, t emits one record per group: the group key (the field you grouped by) followed by the bag t. If you want a separate record for each exchange, use generate group, flatten(t).

Hope this helps.
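The two forms from point 3 can be sketched side by side, together with one way to inspect the schema of t from point 2. This is a minimal sketch, not a tested script: it reuses the path and schema from the original post, assumes source.txt has no header row, and the names z2, symbol and exchange in the second foreach are made up for illustration. The `DESCRIBE z1::t` form follows the nested-alias example in the Pig documentation; whether it is available may depend on the Pig version.

```pig
-- Assumes the path and schema from the original post and a file without a header row.
x = LOAD '/home/prime23/source.txt' USING PigStorage(',')
    AS (exchange:chararray, symbol:chararray, date:chararray,
        open:double, high:double, low:double, close:double,
        volume:long, adj_close:double);

y = GROUP x BY symbol;

-- One output record per symbol; t stays a bag of distinct exchanges.
z1 = FOREACH y {
    t = DISTINCT x.exchange;
    GENERATE group, t;
}

-- One output record per (symbol, exchange) pair; FLATTEN un-nests the bag.
-- z2 and the AS names are illustrative, not from the original post.
z2 = FOREACH y {
    t = DISTINCT x.exchange;
    GENERATE group AS symbol, FLATTEN(t) AS exchange;
}

DESCRIBE z1;     -- schema of the outer relation, including the bag t
DESCRIBE z1::t;  -- schema of the nested alias t (version-dependent)
DUMP z2;
```

With the sample data, z1 should pair JDAS with a bag holding NASDAQ and ICICI, and JOEZ with a bag holding only NASDAQ, while z2 should list the same three pairs as flat rows. The order of rows and of bag elements is not guaranteed.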


Re: pig nested for each clarification

Contributor

Hi @Karthik Narayanan

Thanks for your time. The last point is still not clear.

generate group, t; is the same as foreach y generate group, t;

The y relation contains only two columns (group, x). How can t be selected? Of course, t is computed inside the nested foreach.

Can any alias defined inside a nested foreach (like t) be used in its generate statement?
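On the scoping question, a hedged sketch: in a nested foreach, the statements inside the braces run once per record of y, and the closing generate can refer both to the fields of y (group, x) and to any alias defined earlier in the same block. The sketch reuses the path and schema from the original post and assumes the file has no header row. The field name n_exchanges is invented here to show that t can also be passed to a function.

```pig
-- Assumes the path and schema from the original post and a file without a header row.
x = LOAD '/home/prime23/source.txt' USING PigStorage(',')
    AS (exchange:chararray, symbol:chararray, date:chararray,
        open:double, high:double, low:double, close:double,
        volume:long, adj_close:double);

y = GROUP x BY symbol;   -- y has exactly two fields: group and the bag x

z1 = FOREACH y {
    -- t is a nested alias: it exists only inside these braces and is
    -- computed from the bag x of the current record of y.
    t = DISTINCT x.exchange;
    -- GENERATE may use fields of y (group, x) and nested aliases (t).
    -- n_exchanges is a hypothetical name added for illustration.
    GENERATE group, t, COUNT(t) AS n_exchanges;
}

DUMP z1;
```

So t is not a column of y. It is a temporary bag built per group, and generate is the point where it becomes a field of the output relation z1.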
