NiFi CRON read multiple times

Contributor

I am using a GetHDFS processor with the CRON driven scheduling strategy, scheduled to run every day at 10 am.

I have one input file to read, but when the dataflow starts it gets the source file multiple times instead of once (9 times in my case). Why?

As a result, when I write the output of the dataflow, I get the following warning: file with same name already exists

Should I modify the Polling Interval parameter? (It is set to 0 sec by default.)

Re: NiFi CRON read multiple times

@Raphaël MARY

What does your cron run schedule look like?

Re: NiFi CRON read multiple times

Contributor

Run schedule: * * 10 * * ?

Re: NiFi CRON read multiple times

@Raphaël MARY

Try setting the cron run schedule to 0 0 10 * * ? instead.

The other cron schedule grabbed the same file multiple times because the * * in the seconds and minutes fields means "run every second of every minute" during that hour. With 0 0 in those fields, the schedule matches only 10:00:00.
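To make the difference concrete, here is a rough Python sketch that counts how many seconds in one day each expression matches. It is only an illustration: `field_matches` and `count_firings` are hypothetical helpers written for this post, not part of NiFi or Quartz, and they assume the six-field layout (seconds, minutes, hours, day of month, month, day of week) with only `*`, `?`, or a single number in each field.

```python
from datetime import datetime, timedelta


def field_matches(field: str, value: int) -> bool:
    """Match one cron field. Only '*', '?', or a single integer is supported."""
    return field in ("*", "?") or int(field) == value


def count_firings(expr: str, day: datetime) -> int:
    """Count the seconds within one calendar day that match the expression."""
    sec, minute, hour, dom, month, dow = expr.split()
    if dow not in ("*", "?"):
        # Assumption: this sketch does not evaluate the day-of-week field.
        raise ValueError("day-of-week values are not supported in this sketch")

    t = day.replace(hour=0, minute=0, second=0, microsecond=0)
    count = 0
    for _ in range(24 * 60 * 60):  # walk through every second of the day
        if (field_matches(sec, t.second)
                and field_matches(minute, t.minute)
                and field_matches(hour, t.hour)
                and field_matches(dom, t.day)
                and field_matches(month, t.month)):
            count += 1
        t += timedelta(seconds=1)
    return count


some_day = datetime(2020, 1, 1)
print(count_firings("* * 10 * * ?", some_day))  # 3600
print(count_firings("0 0 10 * * ?", some_day))  # 1
```

The first count is the number of eligible trigger times, not the number of times the processor actually ran. How many of those turn into real runs depends on how long each run takes and on the concurrent tasks setting.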

Re: NiFi CRON read multiple times

Master Guru

That schedule makes the processor eligible to run every second of every minute. In reality this means it runs as often as possible, using the allowable number of concurrent tasks, during the 10 am hour of each day. In your case it sounds like it was able to run at least 10 times in that one hour.

Re: NiFi CRON read multiple times

Hi @Raphaël MARY,

Did you set a different value for the number of concurrent tasks?

Are you in a cluster configuration?

Re: NiFi CRON read multiple times

Contributor

No, only one node and 1 concurrent task.

I changed to 0 0 10 * * ? in order to specify minutes and seconds.

It is working now!

Re: NiFi CRON read multiple times

Master Guru
@Raphaël MARY

If you are running a NiFi cluster, by default every node in your cluster will run this GetHDFS processor at 10 am each day. This means every node will get a copy of the same files and process them in the same way.

If you are running a cluster, consider changing the configuration of your GetHDFS processor so that it runs on the primary node only.
