[RE-OPEN] FALCON - Feed entity FeedPAth

Explorer

Hi all,

It seems that, when the feed frequency is "hours(2)", the data path should look like this:

/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}

My question is: do all the paths need to be created beforehand on the primary and backup clusters? For example:

/tmp/falcon/next-vers-current/2016/05/26/13/
/tmp/falcon/next-vers-current/2016/05/26/14/
/tmp/falcon/next-vers-current/2016/05/26/15/

1 ACCEPTED SOLUTION

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Expert Contributor

@mayki wogno Thanks for sharing the feed replication entity XML. I looked through your entity and found that the exception occurs because the feed-level data location is defined without the frequency (date) variables:

<location type="data" path="/tmp/falcon/"/>

Can you define the path as follows?

path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"

I expect this to work once the path includes the frequency variables.

Also, in the shared entity I see that you have specified the same HDFS path for the source and the target. Can you please check this as well?
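
The mismatch reported in the error message can be looked for before submitting the feed. The following is only a rough sketch, assuming the rule is that the feed-level data path must carry the same date pattern as each cluster-level data path. That rule is inferred from the error text in this thread, not taken from Falcon's code. The helper names `date_pattern` and `mismatched_clusters` and the trimmed-down `FEED_TEMPLATE` are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"f": "uri:falcon:feed:0.1"}  # namespace declared on the <feed> element
DATA = "f:locations/f:location[@type='data']"


def date_pattern(path):
    """Return the part of the path from the first ${...} variable on, or ''."""
    idx = path.find("${")
    return path[idx:].strip("/") if idx >= 0 else ""


def mismatched_clusters(feed_xml):
    """List clusters whose data path date pattern differs from the feed default.

    Assumption: this only approximates the check behind the error message.
    """
    root = ET.fromstring(feed_xml)
    default = root.find(DATA, NS)  # feed-level <locations>, direct child of <feed>
    default_pattern = date_pattern(default.get("path")) if default is not None else ""
    bad = []
    for cluster in root.findall("f:clusters/f:cluster", NS):
        loc = cluster.find(DATA, NS)
        if loc is not None and date_pattern(loc.get("path")) != default_pattern:
            bad.append(cluster.get("name"))
    return bad


# Hypothetical, shortened feed used only to exercise the check.
FEED_TEMPLATE = """<feed name="next-vers-current" xmlns="uri:falcon:feed:0.1">
  <clusters>
    <cluster name="next-rec-cluster" type="source">
      <locations><location type="data" path="{cluster_path}"/></locations>
    </cluster>
  </clusters>
  <locations><location type="data" path="{default_path}"/></locations>
</feed>"""

PATTERN = "/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"
broken = FEED_TEMPLATE.format(cluster_path=PATTERN, default_path="/tmp/falcon/")
fixed = FEED_TEMPLATE.format(cluster_path=PATTERN, default_path=PATTERN)

print(mismatched_clusters(broken))
print(mismatched_clusters(fixed))
```

With the feed-level path set to "/tmp/falcon/" the source cluster is flagged; with the dated path in both places nothing is flagged.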

9 REPLIES

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Expert Contributor

@mayki wogno To answer your question: at least the frequency-based data for the feed must be available on the primary cluster, so that the scheduled feed replication can copy it to the backup cluster periodically. If the data is not available on the primary cluster, the scheduled instance stays in the waiting state until the data becomes available.
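
To see which dated directories a feed would refer to, the path template can be expanded by hand. This is a minimal sketch, assuming instances start at the validity start time and repeat every `every_hours` hours, with the end time treated as exclusive. Both are assumptions made for illustration, and `instance_paths` is a made-up helper, not part of Falcon.

```python
from datetime import datetime, timedelta


def instance_paths(template, start, end, every_hours):
    """Expand ${YEAR}/${MONTH}/${DAY}/${HOUR} for each assumed instance time."""
    paths = []
    t = start
    while t < end:  # assumption: the end of the window is exclusive
        paths.append(
            template.replace("${YEAR}", t.strftime("%Y"))
            .replace("${MONTH}", t.strftime("%m"))
            .replace("${DAY}", t.strftime("%d"))
            .replace("${HOUR}", t.strftime("%H"))
        )
        t += timedelta(hours=every_hours)
    return paths


template = "/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"
for p in instance_paths(template, datetime(2016, 5, 27, 14), datetime(2016, 5, 27, 20), 2):
    print(p)
```

Under these assumptions, a frequency of hours(2) starting at 14:00 refers to the 14, 16 and 18 directories rather than to every hour.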

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Explorer

It seems that my question is not clear.

I want to submit this feed:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="next-vers-current" description="next-vers-current" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(6)</frequency>
  <timezone>UTC</timezone>
  <clusters>
    <cluster name="next-rec-cluster" type="source">
      <validity start="2016-05-01T12:00Z" end="2016-05-27T23:00Z"/>
      <retention limit="hours(2)" action="delete"/>
      <locations>
        <location type="data" path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"/>
      </locations>
    </cluster>
    <cluster name="current-rec-cluster" type="target">
      <validity start="2016-05-01T12:00Z" end="2016-05-27T23:00Z"/>
      <retention limit="days(2)" action="delete"/>
      <locations>
        <location type="data" path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"/>
      </locations>
    </cluster>
  </clusters>
  <locations>
    <location type="data" path="/tmp/falcon/"/>
    <location type="stats" path="/none"/>
    <location type="meta" path="/none"/>
  </locations>
  <ACL owner="falcon" group="hadoop" permission="0755"/>
  <schema location="/none" provider="none"/>
  <properties>
    <property name="queueName" value="oozie-launcher"/>
  </properties>
</feed>

falcon entity -type feed -submit -file next-vers-current.xml
ERROR: Bad Request;default/org.apache.falcon.FalconWebException::org.apache.falcon.FalconException: Feeds default path pattern: ${nameNode}/tmp/falcon, does not match with cluster: next-rec-cluster path pattern: hdfs://master001.next.rec.mapreduce.m1.p.fti.net:8020/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}

So my question is: is it normal that I need to create all the paths with this suffix?

${YEAR}/${MONTH}/${DAY}/${HOUR}

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Explorer

@peeyush: What is the difference between the 'location data path' in the cluster section and the one in the feed section?

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Expert Contributor

@mayki wogno The 'location data path' in the feed section is the initial (default) source data path. It can be overridden by a 'location data path' defined in the source cluster section.
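
The override described above can be pictured as a simple lookup: use the cluster-level data location when one is defined, otherwise fall back to the feed-level one. This is only an illustrative sketch of that reading, not Falcon's implementation. `effective_data_path`, `SAMPLE_FEED` and the cluster name "other-cluster" are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"f": "uri:falcon:feed:0.1"}
DATA = "f:locations/f:location[@type='data']"


def effective_data_path(feed_xml, cluster_name):
    """Return the cluster-level data path if defined, else the feed-level default."""
    root = ET.fromstring(feed_xml)
    for cluster in root.findall("f:clusters/f:cluster", NS):
        if cluster.get("name") == cluster_name:
            override = cluster.find(DATA, NS)
            if override is not None:
                return override.get("path")
    default = root.find(DATA, NS)  # feed-level <locations>
    return default.get("path") if default is not None else None


# Hypothetical feed: one cluster overrides the data path, the other does not.
SAMPLE_FEED = """<feed name="next-vers-current" xmlns="uri:falcon:feed:0.1">
  <clusters>
    <cluster name="next-rec-cluster" type="source">
      <locations>
        <location type="data" path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"/>
      </locations>
    </cluster>
    <cluster name="other-cluster" type="target"/>
  </clusters>
  <locations>
    <location type="data" path="/tmp/falcon/default/${YEAR}/${MONTH}/${DAY}/${HOUR}"/>
  </locations>
</feed>"""

print(effective_data_path(SAMPLE_FEED, "next-rec-cluster"))
print(effective_data_path(SAMPLE_FEED, "other-cluster"))
```

Note that, going by the error earlier in this thread, defining an override in the cluster section does not seem to exempt the feed-level path from being checked against it at submit time.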

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Explorer

@peeyush: So why, in my case, does the 'location data path' in the feed section raise an error? As you said, the 'location data path' in the cluster section overrides it.

Never mind: I have now put the same path in all sections, and both submit and schedule work.

Thanks all.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="next-vers-current" description="next-vers-current" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(2)</frequency>
  <timezone>UTC</timezone>
  <clusters>
    <cluster name="next-rec-cluster" type="source">
      <validity start="2016-05-27T14:00Z" end="2016-05-28T23:00Z"/>
      <retention limit="hours(6)" action="delete"/>
      <locations>
        <location type="data" path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"/>
      </locations>
    </cluster>
    <cluster name="current-rec-cluster" type="target">
      <validity start="2016-05-01T14:00Z" end="2016-05-28T23:00Z"/>
      <retention limit="days(2)" action="delete"/>
      <locations>
        <location type="data" path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"/>
      </locations>
    </cluster>
  </clusters>
  <locations>
    <location type="data" path="/tmp/falcon/next-vers-current/${YEAR}/${MONTH}/${DAY}/${HOUR}"/>
    <location type="stats" path="/none"/>
    <location type="meta" path="/none"/>
  </locations>
  <ACL owner="falcon" group="hadoop" permission="0755"/>
  <schema location="/none" provider="none"/>
  <properties>
    <property name="queueName" value="oozie-launcher"/>
  </properties>
</feed>

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Expert Contributor

@mayki wogno Earlier, the 'location data path' in your feed section raised an error because the frequency pattern "${YEAR}/${MONTH}/${DAY}/${HOUR}" was missing from the data path.

Thanks for confirming that it works now for you.

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Explorer

@peeyush As I said in my last comment, it works now with my new feed-replication.xml. Thanks.

Re: [RE-OPEN] FALCON - Feed entity FeedPAth

Explorer

Hi again,

There is something odd in the FALCON_FEED_RETENTION workflow: the feedDataPath looks wrong.

feedDataPath
    DATA=hdfs://clusterA:8020/tmp/falcon/next-vers-current/?{YEAR}/?{MONTH}/?{DAY}/?{HOUR}

For FALCON_FEED_REPLICATION, the paths are correct:

distcpSourcePaths
    hftp://clusterA:50070/tmp/falcon/next-vers-current/2016/05/27/12
distcpTargetPaths
    hdfs://clusterB/tmp/falcon/next-vers-current/2016/05/27/12/

What is wrong in my feed-replication.xml?
