Apache Falcon retention policy

I want to use Apache Falcon to clean up my cluster according to some basic retention rules (for example, "delete all data older than XX days in this HDFS path").

I have been able to create my cluster and feeds: I created one feed for each path covered by a retention policy.

As a result, I can see Oozie running the jobs regularly on the cluster, but the retention policy does not seem to be applied: nothing is deleted in the paths given in the feeds.

However, the jobs end with a "SUCCEEDED" status, which would suggest that everything worked.

Has anyone experienced this kind of issue? I don't know what's wrong; maybe I have misunderstood the purpose of "retention" in Apache Falcon.

Here is one of the feeds as an example:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="TEST-1H" version="0" xmlns="uri:falcon:feed:0.1">
    <late-arrival cut-off="minutes(12)"/>
    <clusters>
        <cluster name="bdtest" type="source" version="0">
            <validity start="2017-11-24T15:17Z" end="2099-12-31T11:59Z"/>
            <retention limit="hours(1)" action="delete"/>
            <locations>
                <location type="data" path="/test/${YEAR}/${MONTH}/${DAY}"/>
                <location type="stats" path="/"/>
            </locations>
        </cluster>
    </clusters>
    <locations>
        <location type="data" path="/test/${YEAR}/${MONTH}/${DAY}"/>
        <location type="stats" path="/"/>
    </locations>
    <ACL owner="admin" group="users" permission="0x755"/>
    <schema location="/none" provider="/none"/>
    <properties>
        <property name="queueName" value="default"/>
        <property name="jobPriority" value="NORMAL"/>
    </properties>
</feed>
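
For illustration, the rule the question expects ("delete everything older than the limit" for a date-partitioned path such as /test/${YEAR}/${MONTH}/${DAY}) can be sketched in a few lines of Python. This is only a sketch of the expected behaviour, not Falcon's actual eviction code: the helper name `expired_paths`, the sample directories, and the one-day limit are all assumptions made up for the example. It can serve as a quick sanity check of which directories one would expect a retention job to remove.

```python
from datetime import datetime, timedelta
import re

# Hypothetical helper, NOT part of Falcon: it only illustrates the rule
# "a dated directory is expired once its date is older than now - limit".
# The pattern mirrors the /test/${YEAR}/${MONTH}/${DAY} layout from the feed.
DATE_DIR = re.compile(r"^/test/(\d{4})/(\d{2})/(\d{2})$")


def expired_paths(paths, now, limit):
    """Return the paths whose embedded date is older than now - limit."""
    cutoff = now - limit
    expired = []
    for path in paths:
        match = DATE_DIR.match(path)
        if match is None:
            continue  # path does not follow the /test/YYYY/MM/DD layout
        year, month, day = (int(part) for part in match.groups())
        # A day-level directory is treated as starting at midnight of that day.
        if datetime(year, month, day) < cutoff:
            expired.append(path)
    return expired


if __name__ == "__main__":
    # Sample directories and limit are assumptions for the example only.
    sample = [
        "/test/2017/11/22",
        "/test/2017/11/23",
        "/test/2017/11/24",
        "/test/other",
    ]
    now = datetime(2017, 11, 24, 15, 17)
    print(expired_paths(sample, now, timedelta(days=1)))
    # prints ['/test/2017/11/22', '/test/2017/11/23']
```

Note that directories which do not match the dated layout (here /test/other) are skipped rather than treated as expired, so under this reading data stored outside the ${YEAR}/${MONTH}/${DAY} structure would never be selected for deletion.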