Member since
01-28-2016
7
Posts
4
Kudos Received
0
Solutions
02-01-2016
01:25 PM
1 Kudo
Thanks for the help. I am able to run the Falcon process now.
02-01-2016
08:14 AM
@Sowmya Ramesh Hi, thank you for your patience. I am very new to Falcon and am having trouble sorting this out. I have changed my process time as you suggested, but I still do not see any output folder generated. Please find my process XML below:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<process name="demo1processNew" xmlns="uri:falcon:process:0.1">
<tags>process_name=demo1processNew</tags>
<clusters>
<cluster name="demo1Cluster-New">
<validity start="2016-01-01T00:00Z" end="2018-01-04T14:00Z"/>
</cluster>
</clusters>
<parallel>1</parallel>
<order>FIFO</order>
<frequency>minutes(5)</frequency>
<timezone>GMT+05:50</timezone>
<inputs>
<input name="inputgroup" feed="demo1FeedInputNew" start="currentMonth(0,0,0)" end="currentMonth(31,0,0)"/>
</inputs>
<outputs>
<output name="outputgroup" feed="demo1OutputFeedNew" instance="currentMonth(0,0,0)"/>
</outputs>
<workflow name="demo1processNew" engine="pig" path="/falcon/demo1/code/demo1.pig"/>
<retry policy="exp-backoff" delay="minutes(5)" attempts="1"/>
<ACL owner="falcon" group="supoergroup" permission="0755"/>
</process>
Kindly advise. I would like to see one flow where my data is picked up from the /falcon/demo1/data/2016-01 folder and processed.
01-30-2016
07:55 AM
Hi, thanks for the replies. I have changed all my feeds to monthly. Please find the feed and process settings below. Now the problem is back to square one: I can run the feeds and processes, but the process is not picking up the data from the folder and processing it.
InputFeed Data Path = /falcon/demo1/data/${YEAR}-${MONTH}
InputFeed Frequency = 1 month
Process Instance Start = currentMonth(0,0,0)
Process Instance End = currentMonth(31,0,0)
Actual Data Path = /falcon/demo1/data/2016-01
Pig script:
A = LOAD '$inputgroup' using PigStorage(',') AS (trnid:chararray, custid:chararray,age:int,trndt:chararray,trntm:chararray,mcc:chararray,mcccode:int,amt:chararray);
B = FILTER A BY (mcc == 'Airlines');
STORE B INTO '$outputgroup' ;
01-29-2016
12:51 PM
Thanks for the responses. I have created a new feed with a currentMonth interval. However, when I set up my process with the feeds, I get the following error. Any help would be greatly appreciated.
Error: Start instance currentMonth(0,0,0) of feed demo1FeedInputNew is before the start of feed Tue Dec 01 05:19:00 EST 2015 for cluster demo1Cluster-New (FalconWebException:83)
2016-01-29 07:48:56,050 ERROR - [1495236611@qtp-2044215423-59 - 2c35b9f5-049a-4636-bff1-4d7a951c6151:falcon:POST//entities/submit/process] ~ Action failed: Bad Request
Error: default/org.apache.falcon.FalconWebException::org.apache.falcon.FalconException: Start instance currentMonth(0,0,0) of feed demo1FeedInputNew is before the start of feed Tue Dec 01 05:19:00 EST 2015 for cluster demo1Cluster-New
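The error above compares the resolved start instance with the feed's start time. Here is a minimal sketch of that comparison, not Falcon's actual code. It assumes that currentMonth(day,hour,minute) means the first day of the instance's month at 00:00 plus the given offsets, that everything is evaluated in UTC, and that the process instance time (`nominal`) is the hypothetical value shown; only the feed start (05:19 EST, which is 10:19 UTC) comes from the log.

```python
from datetime import datetime, timedelta, timezone


def current_month(nominal: datetime, day: int, hour: int, minute: int) -> datetime:
    """Assumed reading of currentMonth(day,hour,minute): month start plus offsets."""
    month_start = nominal.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return month_start + timedelta(days=day, hours=hour, minutes=minute)


# Feed start taken from the error message: Tue Dec 01 05:19:00 EST 2015 == 10:19 UTC.
feed_start = datetime(2015, 12, 1, 10, 19, tzinfo=timezone.utc)

# Hypothetical process instance time; the real process validity is not shown in the post.
nominal = datetime(2015, 12, 1, 10, 19, tzinfo=timezone.utc)

start_instance = current_month(nominal, 0, 0, 0)
is_before_feed_start = start_instance < feed_start

print(start_instance.isoformat(), is_before_feed_start)
```

Under these assumptions the start instance falls at midnight on the first of the month, a few hours before the feed start, which would be consistent with the message. A feed start at or before the first of the month at 00:00, or a process start in a later month, would make this particular comparison pass.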
01-29-2016
12:29 PM
@Sowmya Ramesh Thank you for the response. I am trying to recreate the feed XML with monthly feeds. Please find my XML below:
<feed xmlns='uri:falcon:feed:0.1' name='demo1InputFeed' description='demo1 input feed'>
<tags>feed_name=demo1InputFeed</tags>
<groups>input</groups>
<frequency>months(1)</frequency>
<timezone>GMT+05:50</timezone>
<late-arrival cut-off='days(3)'/>
<clusters>
<cluster name='demo1cluster' type='source'>
<validity start='2016-01-28T06:59Z' end='2017-02-01T06:59Z'/>
<retention limit='months(2)' action='delete'/>
<locations>
<location type='data'>
</location>
<location type='stats'>
</location>
<location type='meta'>
</location>
</locations>
</cluster>
</clusters>
<locations>
<location type='data' path='/falcon/demo1/data/${YEAR}-${MONTH}'>
</location>
<location type='stats' path='/falcon/demo1/status'>
</location>
<location type='meta' path='/falcon/demo1/meta'>
</location>
</locations>
<ACL owner='falcon' group='falcon' permission='0755'/>
<schema location='none' provider='none'/>
<properties>
<property name='jobPriority' value='HIGH'>
</property>
</properties>
</feed>
However, every time I try to run it I get the exception below:
Error: Feed demo1InputFeed's frequency: months(1), path pattern: FileSystemStorage{storageUrl='${nameNode}', locations=[org.apache.falcon.entity.v0.feed.Location@4b8f5469, org.apache.falcon.entity.v0.feed.Location@189f3fa7, org.apache.falcon.entity.v0.feed.Location@dfc9857]} does not match with group: input's frequency: minutes(1), date pattern: [${MONTH}, ${YEAR}] (FalconWebException:83)
Kindly let me know if you can help me with this.
01-28-2016
06:40 PM
1 Kudo
Thanks for the quick reply. I just tested the Pig script after replacing $input and $output with the actual HDFS paths, and the Pig job runs fine. Also, my feed has its input path as /falcon/demo1/data/${YEAR}-${MONTH}, whereas my actual HDFS path is /falcon/demo1/data/2016-01. Could this be the mismatch?
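One way to sanity-check the path question is to expand the pattern by hand for a given instance time. The snippet below is only an illustrative sketch: `expand_feed_path` is a hypothetical helper, not part of Falcon, and the instance time is an assumed value. It simply substitutes zero-padded date fields for the `${YEAR}`, `${MONTH}`, `${DAY}`, `${HOUR}` and `${MINUTE}` variables.

```python
from datetime import datetime, timezone


def expand_feed_path(pattern: str, instance: datetime) -> str:
    """Hypothetical helper: substitute date variables for one feed instance time."""
    values = {
        "${YEAR}": f"{instance.year:04d}",
        "${MONTH}": f"{instance.month:02d}",
        "${DAY}": f"{instance.day:02d}",
        "${HOUR}": f"{instance.hour:02d}",
        "${MINUTE}": f"{instance.minute:02d}",
    }
    for variable, value in values.items():
        pattern = pattern.replace(variable, value)
    return pattern


pattern = "/falcon/demo1/data/${YEAR}-${MONTH}"
instance = datetime(2016, 1, 1, tzinfo=timezone.utc)  # assumed instance time
resolved = expand_feed_path(pattern, instance)
print(resolved)
```

If the substitution works as assumed here, a January 2016 instance resolves to the existing folder name, so the pattern by itself would not explain the missing output. Which instance time gets resolved depends on the feed and process settings.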
01-28-2016
05:26 PM
2 Kudos
We have set up a Falcon process that reads data from an HDFS location and saves the output, through a Pig process, into another HDFS location. The feeds and processes are running in the cluster, but I cannot see any output generated. My process XML is below:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<process name="demo1Process" xmlns="uri:falcon:process:0.1">
<tags>processName=demo1Process</tags>
<clusters>
<cluster name="Atlas-Demo1">
<validity start="2016-01-28T20:51Z" end="2017-02-02T20:51Z"/>
</cluster>
</clusters>
<parallel>2</parallel>
<order>FIFO</order>
<frequency>minutes(5)</frequency>
<timezone>GMT+05:50</timezone>
<inputs>
<input name="inputfeed" feed="demo1Feed" start="yesterday(0,0)" end="today(-1,0)"/>
</inputs>
<outputs>
<output name="outoutfeed" feed="demo1OutputFeed" instance="yesterday(0,0)"/>
</outputs>
<workflow name="select_airlines_data" version="pig-0.12.0" engine="pig" path="/falcon/demo1/code/demo1.pig"/>
<retry policy="exp-backoff" delay="minutes(3)" attempts="2"/>
<ACL owner="falcon" group="falcon" permission="0755"/>
</process>
My input feed XML is below:
<feed xmlns='uri:falcon:feed:0.1' name='demo1InputFeed' description='demo1 input feed'>
<tags>feed_name=demo1InputFeed</tags>
<groups>input</groups>
<frequency>minutes(1)</frequency>
<timezone>GMT+05:50</timezone>
<late-arrival cut-off='minutes(3)'/>
<clusters>
<cluster name='demo1cluster' type='source'>
<validity start='2016-01-28T07:49Z' end='2017-02-01T07:49Z'/>
<retention limit='days(2)' action='delete'/>
<locations>
<location type='data'>
</location>
<location type='stats'>
</location>
<location type='meta'>
</location>
</locations>
</cluster>
</clusters>
<locations>
<location type='data' path='/falcon/demo1/data/${YEAR}-${MONTH}'>
</location>
<location type='stats' path='/falcon/demo1/status'>
</location>
<location type='meta' path='/falcon/demo1/meta'>
</location>
</locations>
<ACL owner='falcon' group='falcon' permission='0755'/>
<schema location='none' provider='none'/>
<properties>
<property name='jobPriority' value='HIGH'>
</property>
</properties>
</feed>
My input folder in HDFS is /falcon/demo1/data/2016-01
Labels: Apache Falcon