Member since 05-09-2019 | 3 Posts | 0 Kudos Received | 0 Solutions
06-13-2019 01:01 PM
Thank you, Shu! It seems to solve my problem. I can't test it right now because of an error when retrieving the state ("Failed to obtain value from ZooKeeper for component with ID 8a2d3c3d-5eb5-1331-8be1-6f5fda823456 with exception code CONNECTIONLOSS"), but that issue is on my side. Thanks again.
06-12-2019 11:31 AM
Is there a way to clear the state of a processor using another processor or Groovy scripting (in an ExecuteStateProcessor)? The goal is to reset the state of a ScrollElasticSearchHttp processor so that it can be run more than once.

Important: I don't want to do it manually through the UI ("state management" then "clear state", as explained in https://community.hortonworks.com/questions/64405/nifi-clear-state.html). I want this operation to be done automatically after a scroll execution.

See the NiFi doc: https://help.syncfusion.com/data-integration/processors/scrollelasticsearchhttp

"... Scrolls through an Elasticsearch query using the specified connection properties. This processor is intended to be run on the primary node, and is designed for scrolling through huge result sets, as in the case of a reindex. The state must be cleared before another query can be run. ..."

But how can that be done? Thanks in advance.
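One possible way to do this without the UI is to call the NiFi REST API, which exposes a clear-state request for processors (`POST /nifi-api/processors/{id}/state/clear-requests`). The Python sketch below only illustrates the shape of that call. It assumes an unsecured NiFi instance at `http://localhost:8080`; the function names are hypothetical, and the processor ID is the one from the error message in this thread, used purely as an example. In NiFi, state can normally be cleared only while the processor is stopped, so a real flow would probably have to stop it first.

```python
import urllib.request


def build_clear_state_request(base_url, processor_id):
    """Build (but do not send) the POST request that asks NiFi to clear a processor's state.

    base_url and processor_id are assumptions supplied by the caller.
    """
    url = f"{base_url.rstrip('/')}/nifi-api/processors/{processor_id}/state/clear-requests"
    # An empty body is sent; the endpoint is identified by the URL alone.
    return urllib.request.Request(url, data=b"", method="POST")


def clear_processor_state(base_url, processor_id, timeout=10):
    """Send the clear-state request and return the HTTP status code.

    Assumes an unsecured NiFi; a secured instance would also need authentication.
    """
    request = build_clear_state_request(base_url, processor_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical values: adjust the host and use the ID of your own processor.
    req = build_clear_state_request(
        "http://localhost:8080", "8a2d3c3d-5eb5-1331-8be1-6f5fda823456"
    )
    print(req.get_method(), req.full_url)
```

Inside a flow, the same request could presumably be issued by an InvokeHTTP processor placed after the scroll has finished, instead of an external script.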
Labels: Apache NiFi
05-10-2019 05:09 AM
Hello,

I have a batch that must run every day at 2:00. This batch needs to process a large volume of data (the data of the previous day: about 100 000 Elasticsearch documents). For these two reasons, I use:
- a "ScrollElasticSearchHttp" processor with a query that filters the data of the day before (see below) and a page size of 1000
- a cron that launches the processor above: */5 * 2 * * ?

I have no problem on the first day: the scroll iterates according to the cron (one call every 5 seconds) and retrieves all the pages. The problem occurs on the following day: I get a 404. I think the 404 is caused by the removal of the scroll context after 1 minute of inactivity. I have tried increasing the scroll duration (e.g. 1 day): I no longer get a 404, but I can't retrieve the new values (because the query seems to be based on the initial state).

My questions: is there something wrong in my configuration? Is there a way to do this job, i.e. a cron-driven batch that retrieves the data of the previous day using a scroll process?

Thanks in advance.
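To make the schedule above concrete: NiFi's CRON-driven scheduling uses Quartz syntax, where the first field is seconds. The small Python sketch below (helper names are hypothetical, and the expansion only handles `*`, `*/N` and plain integers) names the fields of `*/5 * 2 * * ?` and compares the number of triggers in the 2:00 hour with the number of pages needed for roughly 100 000 documents at a page size of 1000.

```python
QUARTZ_FIELDS = ("seconds", "minutes", "hours", "day_of_month", "month", "day_of_week")


def describe_quartz_cron(expression):
    """Map each field of a 6-field Quartz cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(QUARTZ_FIELDS):
        raise ValueError("expected 6 fields: " + ", ".join(QUARTZ_FIELDS))
    return dict(zip(QUARTZ_FIELDS, parts))


def expand_field(field, upper):
    """Expand '*', '*/N' or a plain integer into the matching values below `upper`.

    Deliberately minimal: ranges and lists are not supported in this sketch.
    """
    if field == "*":
        return list(range(upper))
    if field.startswith("*/"):
        return list(range(0, upper, int(field[2:])))
    return [int(field)]


fields = describe_quartz_cron("*/5 * 2 * * ?")

# Triggers per day = matching seconds x matching minutes x matching hours.
triggers_per_day = (
    len(expand_field(fields["seconds"], 60))
    * len(expand_field(fields["minutes"], 60))
    * len(expand_field(fields["hours"], 24))
)

# Pages needed for the volume described in the post (ceiling division).
documents, page_size = 100_000, 1000
pages_needed = -(-documents // page_size)

print(fields)
print("triggers per day:", triggers_per_day)
print("pages needed:", pages_needed)
```

Under these assumptions the processor is triggered 720 times between 2:00 and 2:59, while about 100 pages are needed, so the schedule leaves ample room to finish the scroll within the hour.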
Labels: Apache NiFi