I have a batch job that must run every day at 2:00. It has to process a large volume of data (the previous day's data: about 100,000 Elasticsearch documents).
For these two reasons, I use:
- a "ScrollElasticsearchHttp" processor with a query that filters the previous day's data (see below) and a page size of 1000
- a cron that launches the processor above: */5 * 2 * * ?
I have no problem on the first day: the scroll iterates according to the cron (one call every 5 s) and retrieves all the pages.
The problem occurs on the following day: I get a 404.
I think the 404 is caused by the scroll context being removed after 1 minute of inactivity.
I have tried increasing the scroll duration (e.g. to 1 day): I no longer get a 404, but I can't retrieve the new values (the query seems to be based on the initial state).
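To make clear what I mean by "scroll context", here is a minimal Python sketch of the scroll lifecycle as I understand it from the Elasticsearch REST API: an initial search opens the context, each follow-up call passes the last scroll id, and the context is deleted at the end. This is not the processor's actual code. The `send` callable, the index name and the `@timestamp` field are assumptions made for illustration only. Since every page is read from the context opened by the first call, a context kept alive for a whole day would keep returning the first day's result set, which would match what I observe.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical transport: (method, path, json_body) -> decoded JSON response.
# Any HTTP client could be wrapped to fit this shape.
Transport = Callable[[str, str, dict], dict]


def previous_day_query(field: str = "@timestamp") -> dict:
    """Range filter for the whole previous day, using Elasticsearch date math.

    The field name is an assumption; replace it with the real date field.
    """
    return {"range": {field: {"gte": "now-1d/d", "lt": "now/d"}}}


def scroll_all(
    send: Transport,
    index: str,
    query: dict,
    page_size: int = 1000,
    keep_alive: str = "1m",
) -> Iterator[List[Dict]]:
    """Yield pages of hits, opening a fresh scroll context for this run."""
    # The first request opens the scroll context (a snapshot of the results).
    page = send(
        "POST",
        f"/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": query},
    )
    scroll_id = page.get("_scroll_id")
    try:
        while page["hits"]["hits"]:
            yield page["hits"]["hits"]
            # Each follow-up call renews the keep-alive and returns the next page.
            page = send(
                "POST",
                "/_search/scroll",
                {"scroll": keep_alive, "scroll_id": scroll_id},
            )
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Release the context so the next run cannot reuse a stale scroll id.
        if scroll_id:
            send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})
```

In this sketch every run starts with a new search, so the date math in the query is evaluated again each day, and the old scroll id is never reused after it has been cleared or has expired.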
My question: is there something wrong in my configuration? Is there a way to do this job, i.e. schedule a batch with cron that retrieves the previous day's data using a scroll?
Thanks in advance.