Member since
01-12-2016
123
Posts
12
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
1503 | 12-12-2016 08:59 AM |
12-15-2016
02:14 PM
I am new to Oozie. Can anyone explain the purpose of input-events and output-events? I have gone through the manual, but it is still not clear to me. Can anyone explain the input-events in the case below?
<coordinator-app name="MY_APP" frequency="1440" start="2009-02-01T00:00Z" end="2009-02-07T00:00Z" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
<datasets>
<dataset name="input1" frequency="60" initial-instance="2009-01-01T00:00Z" timezone="UTC">
<uri-template>hdfs://localhost:9000/tmp/revenue_feed/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
</dataset>
</datasets>
<input-events>
<data-in name="coordInput1" dataset="input1">
<start-instance>${coord:current(-23)}</start-instance>
<end-instance>${coord:current(0)}</end-instance>
</data-in>
</input-events>
<action>
<workflow>
<app-path>hdfs://localhost:9000/tmp/workflows</app-path>
</workflow>
</action>
</coordinator-app>
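For anyone reading this later: the input-events block declares the data that must exist before each coordinator action is started. Here the coordinator runs once a day (frequency 1440 minutes) while the dataset is hourly, so each action waits for the 24 hourly instances from `${coord:current(-23)}` to `${coord:current(0)}`. The Python sketch below is only a rough model of that resolution, not Oozie code. It assumes UTC with no daylight-saving handling and ignores instances that would fall before initial-instance; the helper names `current` and `input_uris` are made up for illustration.

```python
from datetime import datetime, timedelta

# Values copied from the coordinator definition above.
DATASET_FREQUENCY = timedelta(minutes=60)
INITIAL_INSTANCE = datetime(2009, 1, 1, 0, 0)
URI_TEMPLATE = "hdfs://localhost:9000/tmp/revenue_feed/{:%Y/%m/%d/%H}"


def current(nominal_time, n):
    """Rough model of ${coord:current(n)}: the n-th dataset instance counted
    from the latest instance at or before the action's nominal time."""
    elapsed = nominal_time - INITIAL_INSTANCE
    latest = INITIAL_INSTANCE + (elapsed // DATASET_FREQUENCY) * DATASET_FREQUENCY
    return latest + n * DATASET_FREQUENCY


def input_uris(nominal_time, start=-23, end=0):
    """URIs the action waits for, from current(start) to current(end)."""
    return [URI_TEMPLATE.format(current(nominal_time, n)) for n in range(start, end + 1)]


# The first action has nominal time 2009-02-01T00:00Z (the coordinator start).
first_action = datetime(2009, 2, 1, 0, 0)
uris = input_uris(first_action)
print(len(uris))   # number of hourly instances the action depends on
print(uris[0])     # oldest instance, current(-23)
print(uris[-1])    # newest instance, current(0)
```

Under these assumptions the first action depends on the hourly directories from 2009/01/31/01 through 2009/02/01/00.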
Labels: Apache Oozie
12-12-2016
08:59 AM
I got the answer as below:
isempty_data_1 = filter t_1 by SIZE(failTime)>0;
(23,{(6,Archana,Mishra,23,9848022335,Chennai)})
(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai)})
12-08-2016
11:33 AM
I have grouped the data as shown below:
group_data = GROUP student_details by age;
dump group_data;
(21,{(4,Preethi,Agarwal,21,9848022330,Pune),(1,Rajiv,Reddy,21,9848022337,Hyderabad)})
(22,{(3,Rajesh,Khanna,22,9848022339,Delhi),(2,siddarth,Battacharya,22,9848022338,Kolkata)})
(23,{(6,Archana,Mishra,23,9848022335,Chennai),(5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar)})
(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai),(7,Komal,Nayak,24,9848022334,trivendram)})
Required output from group_data is as below:
(23,{(6,Archana,Mishra,23,9848022335,Chennai)})
(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai)})
I tried the following.
First try:
t = foreach group_data generate group, FLATTEN(student_details);
filter_data = FILTER t BY student_details::city == 'Chennai';
(23,6,Archana,Mishra,23,9848022335,Chennai)
(24,8,Bharathi,Nambiayar,24,9848022333,Chennai)
Second try:
t_1 = FOREACH group_data {
t_2 = FILTER student_details BY city == 'Chennai';
GENERATE group,t_2 AS failTime;
};
(21,{})
(22,{})
(23,{(6,Archana,Mishra,23,9848022335,Chennai)})
(24,{(8,Bharathi,Nambiayar,24,9848022333,Chennai)})
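To make the difference between the two tries easier to see, here is a small Python model of the second try followed by the `SIZE(failTime) > 0` filter from the accepted answer. It is only a sketch of the logic, not Pig code: bags are modeled as plain lists, and the function names `group_by_age` and `chennai_only` are made up for illustration.

```python
from collections import defaultdict

# Rows copied from the dump above: (id, first, last, age, phone, city).
students = [
    (1, "Rajiv", "Reddy", 21, "9848022337", "Hyderabad"),
    (2, "siddarth", "Battacharya", 22, "9848022338", "Kolkata"),
    (3, "Rajesh", "Khanna", 22, "9848022339", "Delhi"),
    (4, "Preethi", "Agarwal", 21, "9848022330", "Pune"),
    (5, "Trupthi", "Mohanthy", 23, "9848022336", "Bhuwaneshwar"),
    (6, "Archana", "Mishra", 23, "9848022335", "Chennai"),
    (7, "Komal", "Nayak", 24, "9848022334", "trivendram"),
    (8, "Bharathi", "Nambiayar", 24, "9848022333", "Chennai"),
]


def group_by_age(rows):
    """Model of GROUP ... BY age: one bag (list) of rows per age."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[3]].append(row)
    return dict(groups)


def chennai_only(groups):
    """Nested FILTER inside FOREACH, then drop the groups left with empty bags."""
    # Nested FILTER: keep only the Chennai tuples inside each bag.
    filtered = {age: [r for r in bag if r[5] == "Chennai"] for age, bag in groups.items()}
    # Outer FILTER on SIZE(bag) > 0: remove groups such as (21,{}) and (22,{}).
    return {age: bag for age, bag in filtered.items() if len(bag) > 0}


result = chennai_only(group_by_age(students))
for age in sorted(result):
    print(age, result[age])
```

The nested filter keeps the grouped shape (age plus bag), which the first try loses by flattening; the size check then removes the groups whose bag became empty.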
Labels: Apache Hadoop, Apache Pig
12-07-2016
12:19 PM
So if your load.sh produces the output below:
echo "output_1=success_1 output_2=success_2"
how do I access the success_2 value in a switch case?
Clarification 1: for echo "output_1=success_1 output_2=success_2", is the syntax below correct?
<case to="end">${ wf:actionData('load-files')['output_2'] eq 'success_2'}</case>
Clarification 2: for a Sqoop action, to pass success_2 as a parameter, can we use it like this? Is it correct?
<arg>${ wf:actionData('load-files')['output_2']}</arg>
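One thing worth checking here: captured shell-action output is read as a Java properties file, so each key=value pair needs its own line (and the action needs `<capture-output/>`). With both pairs on one line, the whole remainder becomes the value of `output_1` and `output_2` never exists. The Python sketch below is a simplified stand-in for that parsing, not Oozie's actual code: it only splits on the first `=`, whereas real properties files also accept `:` and whitespace as separators. The function name `parse_captured_output` is hypothetical.

```python
def parse_captured_output(stdout):
    """Simplified stand-in for properties parsing: one key=value per line,
    split on the first '=' only. Blank lines and '#' comments are skipped."""
    data = {}
    for line in stdout.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        data[key.strip()] = value.strip()
    return data


# Both pairs on one line, as in the echo above: only one key is produced.
one_line = parse_captured_output("output_1=success_1 output_2=success_2")
# One pair per line, e.g. from two separate echo statements.
two_lines = parse_captured_output("output_1=success_1\noutput_2=success_2")

print(one_line)
print(two_lines)
```

If that matches what Oozie does in your version, printing each pair with its own echo in load.sh should make `wf:actionData('load-files')['output_2']` resolve as expected.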
12-01-2016
02:51 PM
Hi @Greg Keys, thanks for the input. Maybe my question was not clear. What happens when we use z4 = distinct z2;? How z4 is calculated from z2 is not clear to me.
12-01-2016
08:11 AM
x = LOAD '/pigdata/source.txt' using PigStorage(',') As (exchange:chararray, symbol:chararray, date:chararray, open:double, high:double, low:double, close:double, volume:long, adj_close:double);
y = GROUP x by symbol;
z2 = foreach y generate x.exchange as exchange1;
dump z2;
({(NASDAQ),(NASDAQ),(NASDAQ),(ICICI),(ICICI),(ICICI),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ)})
({(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ)})
z4 = distinct z2;
dump z4;
({(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ)})
({(NASDAQ),(NASDAQ),(NASDAQ),(ICICI),(ICICI),(ICICI),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ),(NASDAQ)})
Clarification:
How does DISTINCT work with bags? For tuples it is clear, but what happens if I use DISTINCT on a relation whose field is a bag? The output of dump z4 is not clear to me.
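As far as I understand it, DISTINCT compares whole records and does not look inside a bag to remove repeated elements. z2 has two records, each holding one bag, and the two bags differ, so both records survive and only their order changes. The Python sketch below models that behaviour on a shortened, hypothetical version of z2. It assumes two bags are equal when they contain the same elements the same number of times, regardless of order; the helper names `bag` and `distinct` are made up for illustration.

```python
from collections import Counter


def bag(*values):
    """Model a Pig bag as a hashable, order-insensitive multiset."""
    return frozenset(Counter(values).items())


def distinct(records):
    """Keep one copy of each record; records are compared as whole values."""
    return set(records)


# Hypothetical, shortened version of z2: each record has a single bag field.
z2 = [
    (bag("NASDAQ", "NASDAQ", "ICICI"),),
    (bag("NASDAQ", "NASDAQ"),),
    (bag("NASDAQ", "NASDAQ"),),  # duplicate record added only for illustration
]

z4 = distinct(z2)
print(len(z4))  # records left after removing whole-record duplicates
```

To remove the repeated values inside each bag instead, the DISTINCT would have to be applied to the bag within a nested FOREACH block rather than to the relation.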
Labels: Apache Hadoop, Apache Pig
11-21-2016
10:29 AM
Hi @Greg Keys. After adding USING PigStorage() as (str:chararray); to the load statement, the issue is resolved. Thanks for your valuable time.
11-18-2016
05:20 PM
Hi @Greg Keys, thanks for the input; it is always appreciated. One clarification: in that case I would expect the warning at the filter statement below, so why did I get it at the load statement? In the load statement I am not converting bytearray to chararray.
filter_records = FILTER ya BY $0 MATCHES '.*Hadoop.*';
11-18-2016
09:44 AM
Below is my source data in HDFS, under /abc/:
Hadoop is an open source
MR is to process data in hadoop. Hadoop has a good eco system.
I want to perform the operation below:
filter_records = FILTER ya BY $0 MATCHES '.*Hadoop.*';
but the load command is unsuccessful. Could anybody provide input on the load statement?
grunt> ya = load '/abc/' USING TextLoader();
2016-11-17 21:00:14,470 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s).
grunt> yab = load '/abc/';
2016-11-17 21:00:50,199 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s).
grunt>
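For reference, this is what the filter itself should select once the load works. The Python sketch below assumes that MATCHES tests the regular expression against the whole string and is case-sensitive, which is why the pattern needs `.*` on both sides. The third input line is hypothetical and was added only to show a non-match.

```python
import re

# Same pattern as in the FILTER statement above.
PATTERN = re.compile(r".*Hadoop.*")

lines = [
    "Hadoop is an open source",
    "MR is to process data in hadoop. Hadoop has a good eco system.",
    "mr runs on hadoop",  # hypothetical extra line: lowercase only, so no match
]

# fullmatch requires the pattern to cover the entire line.
matched = [line for line in lines if PATTERN.fullmatch(line)]
for line in matched:
    print(line)
```

Both lines of the source data contain "Hadoop" with a capital H, so under that assumption both would pass the filter.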
Labels: Apache Hadoop, Apache Pig
11-17-2016
08:00 AM
Hi @Josh Elser. I got the answer with the Unix command mentioned below.
Command:
echo "scan 'emp'" | $HBASE_HOME/bin/hbase shell | awk -F'=' '{print $2}' | awk -F ':' '{print $2}'|awk -F ',' '{print $1}'
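In case it helps someone else, here is what the three awk stages do to a single line of scan output, written as a Python sketch. The sample line is hypothetical (the real contents of the emp table are not shown in this thread), and the helper names `field` and `column_qualifier` are made up for illustration.

```python
def field(text, sep, index):
    """Return the 1-based field, like awk's $index, or '' if it is missing."""
    parts = text.split(sep)
    return parts[index - 1] if index <= len(parts) else ""


def column_qualifier(scan_line):
    """Apply the three awk stages from the command above to one line."""
    after_equals = field(scan_line, "=", 2)    # awk -F'=' '{print $2}'
    after_colon = field(after_equals, ":", 2)  # awk -F ':' '{print $2}'
    return field(after_colon, ",", 1)          # awk -F ',' '{print $1}'


# Hypothetical line in the usual "row column=family:qualifier, timestamp=..., value=..." layout.
sample = " row1 column=personal:name, timestamp=1417521848375, value=raj"
print(column_qualifier(sample))
```

Lines without the expected separators, such as the header line of the scan output, come out empty, so the result may still need blank lines removed.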