About vamsi123

vamsi123 · ‎01-01-2017

Hi @Greg Keys, Happy New Year. Could you please clarify the two points below?
Clarification 1: Let us say my input is:
1;(7287026502032012,18);{(706),(707)};{(101200010),(101200011)};{(17286),(17287)};{(oz),(oz1)};2.5
The expression for data_flattened is the same. In that case, is my understanding correct that the output is as shown below?
Output:
1;7287026502032012,18;706,707;101200010,101200011;17286,17287;oz,oz1;2.5
Clarification 2: Let us say my input is:
1;(7287026502032012,18);{(706),(707)};{(101200010),(101200011)};{(17286),(17287)};{(oz),(oz1)};2.5
and the expression for data_flattened_1 is:
data_flattened_1 = FOREACH data GENERATE $0, FLATTEN($1), FLATTEN($2), FLATTEN($3), FLATTEN($4), FLATTEN($5), $6;
In that case, is my understanding correct that the output is as shown below?
Output:
1;7287026502032012,18;706;101200010;17286;oz;2.5
1;7287026502032012,18;707;101200011;17287;oz1;2.5
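For clarification 2, a small Python model may help check the expected row count. It is only a sketch, not Pig itself: it assumes Pig's documented behavior that several bags flattened in one GENERATE produce a cross product rather than a pairwise match, and it represents a Pig tuple as a Python tuple and a Pig bag as a list of tuples. The function name flatten_row is hypothetical.

```python
from itertools import product


def flatten_row(fields):
    """Model FLATTEN applied to every tuple and bag field of one row.

    Assumption: a Python tuple stands for a Pig tuple, a list of tuples
    stands for a Pig bag, and anything else is a scalar field.
    """
    choices = []
    for field in fields:
        if isinstance(field, list):
            # Bag: each inner tuple becomes one alternative (one output row).
            choices.append([tuple(t) for t in field])
        elif isinstance(field, tuple):
            # Tuple: a single alternative whose fields are inlined.
            choices.append([field])
        else:
            # Scalar: a single alternative with one field.
            choices.append([(field,)])
    # Cross product over the alternatives, then join the pieces of each row.
    return [sum(combo, ()) for combo in product(*choices)]


row = [
    1,
    (7287026502032012, 18),
    [(706,), (707,)],
    [(101200010,), (101200011,)],
    [(17286,), (17287,)],
    [("oz",), ("oz1",)],
    2.5,
]
rows = flatten_row(row)
print(len(rows))  # 16
print(rows[0])    # (1, 7287026502032012, 18, 706, 101200010, 17286, 'oz', 2.5)
```

Under that assumption the four two-element bags give 2 x 2 x 2 x 2 = 16 rows, of which the two rows listed in the question are only a subset.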

vamsi123 · ‎12-31-2016

a) Could you please provide the source for this link? It is really useful. b) What about these two steps, and where do they fit in the diagram?
grouping
shuffle and merge: each reducer takes one partition from all map tasks and merges them together

vamsi123 · ‎12-30-2016

What is the order of execution for a MapReduce job? Is the order below correct? Please correct me if I am wrong.
Mapper
Partition each mapper's output
Sort within each partition based on key
Grouping
Shuffle and merge: each reducer takes one partition from all map tasks and merges them together
Combiner
Reducer
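As a point of comparison, here is a toy word-count simulation in Python of the order in which these phases are usually described for Hadoop MapReduce. It is a simplified sketch, not Hadoop code: it assumes the combiner runs on the map side before the shuffle (in practice it is optional and may run zero or more times), and the function name run_job and the byte-sum partitioner are hypothetical.

```python
from collections import defaultdict
from itertools import groupby


def key_of(pair):
    return pair[0]


def run_job(splits, num_reducers=2):
    """Simulate a word count: one map task per split, num_reducers reduce tasks."""
    map_outputs = []
    for split in splits:
        # 1. Map: emit (word, 1) for every word in the split.
        pairs = [(word, 1) for line in split for word in line.split()]
        # 2. Partition: pick a reducer for each key (toy partitioner).
        partitions = defaultdict(list)
        for key, value in pairs:
            partitions[sum(key.encode()) % num_reducers].append((key, value))
        combined = {}
        for part, kvs in partitions.items():
            # 3. Sort within each partition by key.
            kvs.sort(key=key_of)
            # 4. Combine (optional, map side): pre-aggregate equal keys.
            combined[part] = [
                (k, sum(v for _, v in group)) for k, group in groupby(kvs, key=key_of)
            ]
        map_outputs.append(combined)

    result = {}
    for part in range(num_reducers):
        # 5. Shuffle and merge: fetch this partition from every map task.
        merged = sorted(
            (kv for out in map_outputs for kv in out.get(part, [])), key=key_of
        )
        # 6. Group by key, then 7. reduce each group.
        for k, group in groupby(merged, key=key_of):
            result[k] = sum(v for _, v in group)
    return result


counts = run_job([["a b a"], ["b c"]])
print(sorted(counts.items()))  # [('a', 2), ('b', 2), ('c', 1)]
```

In this sketch, grouping happens on the reduce side after the merge, and the combiner sits before the shuffle rather than after it.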

vamsi123 · ‎12-29-2016

Hi @Michael M, regarding the first option: in production, can I place the command below in a shell script and schedule that script using crontab so that Flume runs continuously? In our production environment we are not allowed to run any command manually on the gateway node. Please correct me if I am wrong.
nohup <my_command> &
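One thing to watch with a cron-scheduled launcher is that every run would start another agent unless the script first checks whether one is already running. The Python sketch below illustrates that guard under stated assumptions: a POSIX system, the flume-ng command line quoted elsewhere in this thread, and hypothetical PID-file and log-file paths supplied by the caller. The function names are illustrative, not part of Flume, and a PID file can go stale if the PID is reused.

```python
import os
import subprocess


def flume_command(conf_dir, agent_name):
    """Build the flume-ng agent command line used in this thread."""
    return [
        "flume-ng", "agent",
        "--conf", conf_dir,
        "--conf-file", os.path.join(conf_dir, "flume.conf"),
        "--name", agent_name,
    ]


def is_running(pid_file):
    """Return True if the PID stored in pid_file refers to a live process."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False  # no PID file, or unreadable content
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def ensure_agent(conf_dir, agent_name, pid_file, log_file):
    """Start the agent detached (similar to nohup ... &) unless already running."""
    if is_running(pid_file):
        return None
    with open(log_file, "ab") as log:
        proc = subprocess.Popen(
            flume_command(conf_dir, agent_name),
            stdout=log,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
            start_new_session=True,  # detach from the launching session
        )
    with open(pid_file, "w") as handle:
        handle.write(str(proc.pid))
    return proc.pid
```

A cron entry could then call ensure_agent on each run; it would return None and do nothing whenever the recorded agent process is still alive.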

vamsi123 · ‎12-29-2016

Hi @Michael M, thanks a lot for your time. One small clarification: you mentioned that a good approach is to keep Flume running all the time and to schedule Oozie jobs to process the data whenever needed.
Clarification 1: How do I keep Flume running all the time? Currently I am using the command below on my gateway node.
flume-ng agent --conf $FLUME_CONF_DIR --conf-file $FLUME_CONF_DIR/flume.conf --name Agent7

vamsi123 · ‎12-29-2016

Hi @Mats Johansson, do you have any input on my clarifications?

vamsi123 · ‎12-27-2016

a) I am starting the Flume agent using the command below. In production, how should this command be triggered? Currently I run it manually at the Unix command prompt, and I also want to create a dependency with Hive. b) Can I place the command below in a Unix shell script and call it from a shell action in Oozie?
flume-ng agent --conf $FLUME_CONF_DIR --conf-file $FLUME_CONF_DIR/flume.conf --name Agent7

vamsi123 · ‎12-27-2016

We are using Flume to get the data into HDFS. After that, we run Pig and Hive for data transformation. I am not sure how to trigger Flume from Oozie.

vamsi123 · ‎12-16-2016

Hi @Eyad Garelnabi, thanks, that answers my question. However, the Oozie textbook says current(0) can be calculated using the formula current(0) = dsII + dsF * (0 + (caNT – dsII) / dsF). What is wrong with my calculation? I am not able to get 2014-10-18T06:00Z with the formula.

vamsi123 · ‎12-16-2016

Hi @Eyad Garelnabi, thanks for the input regarding current(0). I have one clarification. The Oozie textbook gives the formula below:
current(n) = dsII + dsF * (n + (caNT – dsII) / dsF)
Applying it for n = 0:
current(0) = dsII + dsF * (0 + (caNT – dsII) / dsF)
= 2014-10-06T06:00Z + 3 days * (0 + (2014-10-19T06:00Z - 2014-10-06T06:00Z) / 3 days)
= 2014-10-06T06:00Z + 3 days * (13 days / 3 days)
= 2014-10-06T06:00Z + 13 days
= 2014-10-19T06:00Z
However, page 127 of the textbook gives 2014-10-18T06:00Z. I am not sure what I am missing.
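One possible explanation for the gap is that the division in the formula is an integer division, which is how the Oozie coordinator specification writes it (using div), so 13 days / 3 days counts as 4 whole periods rather than 13/3. The Python sketch below reproduces the arithmetic under that assumption; it ignores time zones and daylight saving, and the function name current is only illustrative.

```python
from datetime import datetime, timedelta


def current(n, initial_instance, frequency, nominal_time):
    """Dataset instance n, assuming whole periods are counted with floor division."""
    # Number of complete dataset periods between dsII and caNT.
    elapsed_periods = (nominal_time - initial_instance) // frequency
    return initial_instance + frequency * (n + elapsed_periods)


ds_ii = datetime(2014, 10, 6, 6, 0)   # dsII: dataset initial instance
ds_f = timedelta(days=3)              # dsF: dataset frequency
ca_nt = datetime(2014, 10, 19, 6, 0)  # caNT: coordinator action nominal time

print(current(0, ds_ii, ds_f, ca_nt))   # 2014-10-18 06:00:00
print(current(-1, ds_ii, ds_f, ca_nt))  # 2014-10-15 06:00:00
```

With 4 whole periods, 2014-10-06T06:00Z + 12 days gives 2014-10-18T06:00Z, which matches the value on page 127.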

Last Visited	‎01-29-2018 02:47 AM

Member Since	‎01-12-2016 07:23 AM
Posts	123
Kudos received	12

Cloudera Community

Re: Pig converting tuple to bag

Re: Using PIG Latin to replace multiple strings fr...

Re: Map reduce flow clarification

Map reduce flow clarification

Re: Flume with oozie

Re: Flume with oozie

Re: Flume with oozie

Re: Flume with oozie

Flume with oozie

Re: oozie Input events clarification

Re: oozie Input events clarification