I am setting up an Oozie coordinator that should trigger a workflow when input data with a dynamic name becomes available. Here is the coordinator.xml:
<coordinator-app name="${jobName} Coordinator" frequency="${coord:days(1)}" start="${startTime}" end="2099-01-01T00:00Z" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
  <datasets>
    <dataset name="gaSchema" frequency="30" initial-instance="${startTime}" timezone="UTC">
      <uri-template>${nameNode}/ga/bySchema/</uri-template>
      <done-flag>ga_${YEAR}${MONTH}${DAY}.avro</done-flag>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="coordInput1" dataset="gaSchema">
      <start-instance>${coord:current(-23)}</start-instance>
      <end-instance>${coord:current(0)}</end-instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${wfApplicationPath}</app-path>
      <configuration>
        <property><name>date</name><value>${coord:formatTime(coord:nominalTime(), "yyyyMMdd")}</value></property>
        <property><name>jobTracker</name><value>${jobTracker}</value></property>
        <property><name>nameNode</name><value>${nameNode}</value></property>
        <property><name>jobName</name><value>${jobName}</value></property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>

I want the workflow to be triggered when the file for the current date arrives in an HDFS folder, using this done flag:
<done-flag>ga_${YEAR}${MONTH}${DAY}.avro</done-flag>

It didn't work with a dynamic file name. I searched the internet, and it seems to work only with a dynamic folder and a fixed file name, for example:

<uri-template>${nameNode}/ga/bySchema/${YEAR}${MONTH}${DAY}</uri-template>
<done-flag>ga.avro</done-flag>

In this case, I would have to create a lot of folders on HDFS, because we import data every day.
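
For reference, here is a minimal sketch of how the dated-folder layout described above might look as a complete dataset definition. The dataset name `gaSchemaByDay` and the daily frequency are assumptions made for illustration, not something I have running:

```xml
<!-- Sketch only: dataset name and frequency are hypothetical -->
<dataset name="gaSchemaByDay" frequency="${coord:days(1)}" initial-instance="${startTime}" timezone="UTC">
  <!-- The date variables are resolved per dataset instance in the directory path -->
  <uri-template>${nameNode}/ga/bySchema/${YEAR}${MONTH}${DAY}</uri-template>
  <!-- Fixed file name that marks the instance as ready -->
  <done-flag>ga.avro</done-flag>
</dataset>
```

Each daily instance then resolves to its own directory, which is exactly the folder-per-day overhead I would like to avoid.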
Do you have any ideas on how to trigger an Oozie workflow when input data with a dynamic name becomes available?
Thanks