How to create Falcon entity dependencies?


According to the Falcon documentation:

Dependency

Returns the dependencies of the requested entity. The dependency list includes both forward and backward dependencies (depends on & is dependent on). For example, a feed would show the processes that are dependent on the feed and the clusters that it depends on.

How are dependencies created through the Falcon UI?

1 ACCEPTED SOLUTION

@Sunile Manjee The dependencies are derived from the entity definitions once you create those entities using Falcon (UI or CLI). You do not create dependencies as a separate step.

For example, when you define your cluster in the cluster entity XML, you specify its name:

<cluster colo="location1" description="primaryDemoCluster" name="primaryCluster" xmlns="uri:falcon:cluster:0.1"> 

When you reference this cluster by name in a feed entity, the dependency is created at the moment you create the feed entity:

<feed description="Demo Input Data" name="demoEventData" xmlns="uri:falcon:feed:0.1">
    <tags>externalSystem=eventData,classification=clinicalResearch</tags>
    <groups>events</groups>
    <frequency>minutes(3)</frequency>
    <timezone>GMT+00:00</timezone>
    <late-arrival cut-off="hours(4)"/>
    <clusters>
        <cluster name="primaryCluster" type="source">
            <validity start="2015-08-10T08:00Z" end="2016-02-08T22:00Z"/>
            <retention limit="days(5)" action="delete"/>
        </cluster>
    </clusters>
    <!-- remaining feed elements (locations, ACL, schema) omitted here -->
</feed>

The same concept applies to process-to-feed dependencies: a process that names a feed in its inputs or outputs becomes dependent on that feed.
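As a rough sketch of the process-to-feed case, a process entity could reference the feed and cluster above by name as shown here. The process name `demoEventProcess`, the input name `input`, the workflow path `/apps/demo/workflow`, and the `now(0,0)` instance range are assumptions made for illustration and are not taken from the original post. The validity window, frequency, and timezone are simply copied from the feed example.

```xml
<process name="demoEventProcess" xmlns="uri:falcon:process:0.1">
    <clusters>
        <!-- Referencing the cluster by name makes the process depend on primaryCluster -->
        <cluster name="primaryCluster">
            <validity start="2015-08-10T08:00Z" end="2016-02-08T22:00Z"/>
        </cluster>
    </clusters>
    <parallel>1</parallel>
    <order>FIFO</order>
    <frequency>minutes(3)</frequency>
    <timezone>GMT+00:00</timezone>
    <inputs>
        <!-- The feed attribute makes the process depend on the demoEventData feed -->
        <input name="input" feed="demoEventData" start="now(0,0)" end="now(0,0)"/>
    </inputs>
    <!-- Hypothetical workflow location; replace with your own Oozie workflow path -->
    <workflow engine="oozie" path="/apps/demo/workflow"/>
</process>
```

With entities like these submitted, the dependency listing for `demoEventData` would be expected to show `primaryCluster` as something it depends on and `demoEventProcess` as something that depends on it.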

For a working set of Falcon entities, take a look at this example: https://github.com/sainib/hadoop-data-pipeline/tree/master/falcon
