
NiFi: how to cron the ScrollElasticsearchHttp processor?


Hello,


I have a batch that must run every day at 2:00. This batch needs to process a large volume of data (the previous day's data: about 100,000 Elasticsearch documents).

For these two reasons, I use:
- a "ScrollElasticsearchHttp" processor with a query that filters the data of the day before (see below) and a page size of 1000

- a cron schedule that triggers the processor above: */5 * 2 * * ?
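To make the intent of the query concrete, here is a minimal Python sketch of the previous-day window it is meant to select. The field name `timestamp` and the helper `previous_day_query` are hypothetical (the real query is only in the screenshot), and the sketch assumes the field is mapped as a date so that date-only bounds resolve to midnight.

```python
from datetime import date, timedelta


def previous_day_query(today, field="timestamp"):
    """Build a Lucene-style range query covering the day before `today`.

    `field` is a hypothetical field name, not the one from the real flow.
    """
    yesterday = today - timedelta(days=1)
    # '[' makes the lower bound inclusive, '}' makes the upper bound exclusive,
    # so the range is yesterday 00:00 up to (but not including) today 00:00.
    return f"{field}:[{yesterday.isoformat()} TO {today.isoformat()}}}"


if __name__ == "__main__":
    print(previous_day_query(date.today()))
```

The point of the sketch is that the window depends on the day the query is evaluated, so a scroll opened on one day keeps returning that day's window for as long as its context is reused.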

(Screenshot: 108603-sans-titre.png)

I have no problem on the first day: the scroll iterates according to the cron (one call every 5 s) and retrieves all the pages.

The problem occurs the following day: I get a 404.


I think the 404 is caused by the scroll context being removed after 1 min of inactivity.


I tried increasing the scroll duration (e.g. 1 day). I no longer get a 404, but I can't retrieve the new values (the query seems to be based on the initial state).
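As a rough check of the numbers above, the following Python sketch estimates how long the scroll takes, assuming each cron trigger fetches exactly one page (an assumption on my side, not something I have verified in the processor). The function names are only for illustration.

```python
import math


def scroll_plan(total_docs, page_size, interval_s):
    """Return (pages needed, seconds to fetch them at one page per trigger)."""
    pages = math.ceil(total_docs / page_size)
    return pages, pages * interval_s


def triggers_per_hour(interval_s):
    """Number of cron triggers in a one-hour window at the given interval."""
    return 3600 // interval_s


if __name__ == "__main__":
    pages, seconds = scroll_plan(100_000, 1000, 5)
    print(pages, seconds)        # 100 pages, 500 seconds
    print(triggers_per_hour(5))  # 720 triggers between 02:00:00 and 02:59:55
```

Under that assumption the 100 pages are fetched in a little over 8 minutes, well inside the 02:00 hour, and the processor is still triggered several hundred more times after the last page.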


My questions: is there something wrong in my configuration? Is there a way to do this job, i.e. to cron a batch that uses a scroll to retrieve the previous day's data?


Thanks in advance.
