pig nested for each clarification

Contributor

I am new to Pig, and any input is really appreciated.

source file:

Exchange,Symbol,date,open,high,low,close,volume,adj_close
NASDAQ,JDAS,2010-01-29,26.91,27.53,26.02,26.21,883100,26.21
NASDAQ,JDAS,2010-01-28,29.86,27.97,26.84,26.88,1272600,26.88
NASDAQ,JDAS,2010-01-27,27.48,27.93,27.20,27.68,560100,27.68
ICICI,JDAS,2010-02-08,25.41,26.59,25.15,26.46,488900,26.46
ICICI,JDAS,2010-01-29,26.91,27.53,26.02,26.21,883100,26.21
ICICI,JDAS,2010-01-28,27.86,27.97,26.84,26.88,1272600,26.88
NASDAQ,JDAS,2010-01-29,26.91,27.53,26.02,26.21,883100,26.21
NASDAQ,JDAS,2010-01-28,27.86,27.97,26.84,26.88,1272600,26.88
NASDAQ,JDAS,2010-01-27,27.48,27.93,27.20,27.68,560100,27.68
NASDAQ,JDAS,2010-02-08,25.41,26.59,25.15,26.46,488900,26.46
NASDAQ,JDAS,2010-02-05,25.42,25.84,24.94,25.49,1121700,25.49
NASDAQ,JDAS,2010-02-04,26.53,26.61,25.46,25.46,574900,25.46
NASDAQ,JDAS,2009-12-31,25.97,26.13,25.47,25.47,283600,25.47
NASDAQ,JDAS,2009-12-30,25.74,26.25,25.61,26.05,236300,26.05
NASDAQ,JDAS,2009-12-29,25.98,25.98,25.52,25.76,238600,25.76
NASDAQ,JDAS,2009-11-30,23.39,23.65,22.78,23.48,522000,23.48
NASDAQ,JDAS,2009-11-27,23.12,23.71,23.10,23.54,144900,23.54
NASDAQ,JDAS,2009-11-25,23.96,24.00,23.59,23.82,220400,23.82
NASDAQ,JOEZ,2010-01-29,1.68,1.69,1.60,1.60,158900,1.60
NASDAQ,JOEZ,2010-01-28,1.64,1.70,1.61,1.62,250700,1.62
NASDAQ,JOEZ,2010-01-27,1.73,1.76,1.63,1.64,329200,1.64
NASDAQ,JOEZ,2010-01-26,1.70,1.76,1.66,1.70,509100,1.70
NASDAQ,JOEZ,2010-01-25,1.64,1.68,1.60,1.68,169600,1.68
NASDAQ,JOEZ,2010-02-08,1.80,2.04,1.76,1.93,1712200,1.93
NASDAQ,JOEZ,2010-02-05,1.84,1.88,1.70,1.80,1044700,1.80
NASDAQ,JOEZ,2010-02-04,1.96,1.97,1.74,1.88,3758600,1.88
NASDAQ,JOEZ,2010-02-03,1.73,1.79,1.68,1.72,1211700,1.72
NASDAQ,JOEZ,2010-02-02,1.59,1.72,1.51,1.70,909400,1.70
NASDAQ,JOEZ,2009-07-15,1.00,1.05,0.75,0.81,1215200,0.81
NASDAQ,JOEZ,2009-07-14,0.80,0.95,0.80,0.93,580000,0.93
NASDAQ,JOEZ,2009-07-13,0.80,0.83,0.75,0.79,148100,0.79
NASDAQ,JOEZ,2009-05-06,0.56,0.67,0.55,0.58,83800,0.58
NASDAQ,JOEZ,2009-05-05,0.63,0.63,0.58,0.58,68700,0.58
NASDAQ,JOEZ,2009-05-04,0.62,0.68,0.60,0.63,134400,0.63


x = LOAD '/home/prime23/source.txt' using PigStorage(',') As (exchange:chararray, symbol:chararray, date:chararray, open:double, high:double, low:double, close:double, volume:long, adj_close:double);


Query: for each symbol, get all distinct exchanges:

y = GROUP x by symbol;
z1 = foreach y {
     t = distinct x.exchange;
     generate group, t;
}

clarifications:

1) Here we have two symbols (JOEZ, JDAS), so the nested foreach will iterate two times. Please correct me if I am wrong.

2) How do I get the schema of the t relation? describe is not working.

3) The last statement is not clear:

The y relation contains only the fields (group, x). How can we select the t field, which is not present in the y relation?

7 REPLIES

Re: pig nested for each clarification

Expert Contributor

Why not group by both symbol and exchange?

y = GROUP x by (symbol, exchange);

z1 = foreach y generate group.symbol, group.exchange;

Re: pig nested for each clarification

Contributor

Hi @Karthik Narayanan

Thanks for the input. Please see my clarifications in the original post; I am looking for input on those points.

Re: pig nested for each clarification

Expert Contributor

I don't understand what you are trying to do here. Are you trying to get flat (symbol, exchange) pairs?

e.g. JDAS,NASDAQ

JDAS,ICICI

JOEZ,NASDAQ

Your clarifications do not make clear what you are trying to achieve.

Re: pig nested for each clarification

Contributor

I want to get the distinct exchanges for each symbol.

I already have a Pig script for that, but I have some clarifications, which are mentioned in the original post.

Re: pig nested for each clarification

Contributor

Hi friends,

Any input on my clarifications is appreciated, since I am a beginner with Hadoop and Pig.

Re: pig nested for each clarification

Expert Contributor

1) It will loop once for each distinct symbol in the data; in this case there are 2, so yes, 2 times.

2) Not sure why the describe fails. distinct is a relational operator, i.e. you normally won't be able to apply it to an expression such as distinct (x.exchange). Outside a nested foreach you would ideally do this:

exchanges = foreach x generate symbol, exchange;

unique_exchanges = distinct exchanges;

Maybe Pig treats distinct specially inside the curly braces of a nested foreach.

3) generate group, t emits one row per group: the group value (the key you grouped by) together with the bag t of distinct exchanges. If you want a separate row for each exchange, use generate group, flatten(t).
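
To make points 2 and 3 concrete, here is a minimal sketch in Pig Latin, assuming the x and y relations from the original post. The alias z2 is a name chosen only for this illustration, and the outer::nested form of describe is an assumption based on the Pig documentation that may depend on the Pig version in use.

```pig
-- Assumes x and y are defined as in the original post.
z1 = foreach y {
     t = distinct x.exchange;     -- t: bag of distinct exchanges for the current group
     generate group, t;           -- one row per symbol: (symbol, {bag of exchanges})
}

-- Same nested block, but flatten un-nests the bag into one row per exchange.
-- z2 is a hypothetical alias used only for this sketch.
z2 = foreach y {
     t = distinct x.exchange;
     generate group, flatten(t);  -- one row per (symbol, exchange) pair
}

describe z1;     -- schema of the outer relation, including the bag field t
describe z1::t;  -- assumption: outer::nested form for an alias defined inside the braces
dump z2;
```

With the sample data, z2 should contain the flat pairs listed earlier in the thread (JDAS with NASDAQ and ICICI, JOEZ with NASDAQ), assuming the header line is not part of the loaded file.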

Hope this helps.

Re: pig nested for each clarification

Contributor

Hi @Karthik Narayanan

Thanks for your time. The last point is still not clear to me.

generate group, t; is the same as foreach y generate group, t;

The y relation contains only two columns (group, x). How can you select t? Of course, t is calculated inside the nested foreach.

Can we select any column (like t) in the generate statement of a nested foreach?
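
One way to picture the scoping asked about here: each assignment inside the braces defines an alias that is evaluated once per group, and the final generate can refer to those aliases as well as to the fields of y (group and x). A minimal sketch, assuming the x and y relations from the original post; the names z3, nasdaq, n_rows, exchanges, and n_nasdaq are illustrative and not from the thread.

```pig
z3 = foreach y {
     t      = distinct x.exchange;              -- nested alias: distinct exchanges in this group
     nasdaq = filter x by exchange == 'NASDAQ'; -- nested alias: this group's NASDAQ rows
     generate group,                            -- field of y
              COUNT(x)      as n_rows,          -- field of y (the grouped bag)
              t             as exchanges,       -- alias defined inside the braces
              COUNT(nasdaq) as n_nasdaq;        -- aggregate over another nested alias
}
```

An alias such as t is visible only inside the braces where it is defined; outside them, the result is reachable only as a field of z3.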
