Member since 07-12-2017
53 Posts
3 Kudos Received
0 Solutions
05-30-2019
01:11 PM
Hi @Andrew Lim, thanks for the detailed explanation. Following your article, I am trying to convert a CSV to JSON using the ConvertRecord processor and then load the merged JSON (the output of ConvertRecord) into Redshift using COPY from a file. My merged JSON is stored in S3. I am getting an error that the CSV is not in JSON format. Could you please suggest how to load all of these records into Redshift at once?
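One possible cause, offered as an assumption rather than a confirmed diagnosis: if the record writer emitted the merged output as a single top-level JSON array, Redshift's COPY with JSON 'auto' may reject it, since it expects a stream of individual JSON objects rather than one enclosing array. The sketch below shows the reshaping in plain Python; the function name `array_to_json_lines` and the sample records are hypothetical. Inside NiFi, the JSON record writer's output grouping setting (one line per object), if available in your version, should produce the same shape without extra code.

```python
import json


def array_to_json_lines(text: str) -> str:
    """Turn a top-level JSON array into one JSON object per line."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # Compact separators keep each record on a single line.
    lines = (json.dumps(record, separators=(",", ":")) for record in records)
    return "\n".join(lines) + "\n"


# Hypothetical merged output, standing in for the file stored in S3.
merged = '[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]'
print(array_to_json_lines(merged), end="")
```

Each output line is then an independent JSON object, which is the layout a line-oriented bulk loader can split into rows.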
04-10-2019
07:35 AM
This doesn't work for me. I copied a flow.xml.gz from the dev cluster to the prod cluster and cleared all repositories on prod, but I still see state in the processors. Could you please suggest another way to clear state for all processors in one go? I tried deleting the contents of the state folder under /nifi/conf, but that didn't help either; it gave me an error.
04-08-2019
03:08 PM
Hi @mattburgess, I am using the same processor to fetch incremental data from relational tables. I have set the max rows fetch size to 500 and the max value column to a timestamp. Can fetching data in batches lead to data loss? I have seen that a few records with certain timestamps are not fetched during an incremental run but are fetched when I clear the state and run a full load. I want to understand how the max rows feature works. I read your comment regarding the max fragment setting on this thread: https://community.hortonworks.com/questions/178505/querydatabasetable-processor-shutting-down.html. Does the same apply to the max rows fetch size too? Please suggest.
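One way rows could be skipped in an incremental run but picked up by a full load, sketched here as an assumption and not a confirmed diagnosis: if the incremental query only selects rows whose timestamp is strictly greater than the stored maximum, a row that is committed later with a timestamp equal to (or older than) that maximum is never selected. The simulation below is plain Python, not NiFi code; `incremental_fetch` and the sample rows are hypothetical.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, str]  # (id, timestamp as an ISO-style string)


def incremental_fetch(
    table: List[Row], last_max: Optional[str]
) -> Tuple[List[Row], Optional[str]]:
    """Return rows with timestamp > last_max, plus the new maximum."""
    new_rows = [r for r in table if last_max is None or r[1] > last_max]
    new_max = max((r[1] for r in new_rows), default=last_max)
    return new_rows, new_max


table: List[Row] = [
    (1, "2019-04-08 10:00:00"),
    (2, "2019-04-08 10:00:05"),
]

# First run: no stored state, so everything is fetched.
first_rows, state = incremental_fetch(table, None)

# Row 3 is committed late but carries the same timestamp as the stored max.
table.append((3, "2019-04-08 10:00:05"))
table.append((4, "2019-04-08 10:00:09"))

# Second run: the strict comparison skips row 3.
second_rows, state = incremental_fetch(table, state)
print([r[0] for r in second_rows])

# Clearing state (full load) returns every row, including row 3.
full_rows, _ = incremental_fetch(table, None)
print([r[0] for r in full_rows])
```

If this is what is happening, the batch size would not be the cause; the gap would come from late-arriving or same-timestamp rows relative to the stored maximum.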
04-03-2019
01:59 PM
Thanks, Matt, for your view on this. The ask is to generate a batchid that is a sequence number: whenever the QueryDatabaseTable processor fetches records from the source database (SQL Server), a batchid should be added to the flowfile so that all of its records carry the same batchid when loaded into the target table. This will help with auditing the records. In cluster mode, however, this seems difficult to achieve with the UpdateAttribute processor. I liked your idea of appending the node hostname to the sequence, but it would be much better if I could generate atomic values across all nodes.
04-03-2019
01:55 PM
Thanks, David. The idea looks good; I will try this.
03-22-2019
04:50 AM
@shu, @Mattclarke, @markpayne How do I generate the sequence number to be used as a stored value, as you suggested? As far as I know, there is only one processor in NiFi that can generate a sequence number, UpdateAttribute, and in cluster mode it will again produce different values on each node.
03-20-2019
03:16 PM
Hi All, @mattclarke, @mattburgess, @markpayne I want to generate a sequence number in my NiFi cluster (3 nodes). I was using the UpdateAttribute processor with the "store state locally" option, but this does not serve my purpose: each node increments its own value, which creates duplicate values when loading data into the target table. I would be grateful for an alternative solution for generating this batchid in cluster mode. Thanks in advance!
Labels: Apache NiFi
03-20-2019
02:02 PM
@Matt Clarke, @matt burgess Exactly the second point is happening: each node generates its own value, incrementing from the last value stored in its local state. So which processor or method should I use to generate an incremental batchid (batch1, batch2, and so on), since UpdateAttribute produces inconsistent values when running on a cluster? Or is there a property that lets the UpdateAttribute processors on all nodes pick up each other's last state value? Please suggest.
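The behaviour described above can be reproduced outside NiFi with a small simulation. This is a sketch under stated assumptions, not NiFi code: `LocalCounter` stands in for per-node local state, and `SharedCounter` stands in for any single shared source of sequence values (for example a database sequence, which is an assumption and not something confirmed in this thread). All class and function names are hypothetical.

```python
import threading
from collections import Counter
from typing import List


class LocalCounter:
    """Per-node state: every node starts from its own value."""

    def __init__(self) -> None:
        self.value = 0

    def next_id(self) -> int:
        self.value += 1
        return self.value


class SharedCounter:
    """One counter for the whole cluster, incremented atomically."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            self._value += 1
            return self._value


def run_cluster(counters: list, batches_per_node: int) -> List[int]:
    """Run one thread per node and collect every batch id issued."""
    ids: List[int] = []
    ids_lock = threading.Lock()

    def node(counter) -> None:
        for _ in range(batches_per_node):
            batch_id = counter.next_id()
            with ids_lock:
                ids.append(batch_id)

    threads = [threading.Thread(target=node, args=(c,)) for c in counters]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


def count_duplicates(ids: List[int]) -> int:
    """Number of issued ids that repeat an id already issued."""
    return sum(n - 1 for n in Counter(ids).values() if n > 1)


# Three nodes, each with its own local counter: ids collide.
local_ids = run_cluster([LocalCounter() for _ in range(3)], 4)

# Three nodes sharing one atomic counter: ids are unique.
shared = SharedCounter()
shared_ids = run_cluster([shared] * 3, 4)

print(count_duplicates(local_ids), count_duplicates(shared_ids))  # prints "8 0"
```

With local state, each of the three nodes issues 1 through 4, so 8 of the 12 ids are repeats; with one shared counter, all 12 ids are distinct.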
03-20-2019
06:45 AM
@Shu Will this work even if we already have some state stored within the processor? For example, I have a timestamp (2019-03-17 02:00:00:0) stored in the state of my processor, and I now want the processor to start fetching data after 2019-03-20. Will this property help in such a scenario?
03-20-2019
06:39 AM
@mattburgess, @markpayne Hi All, I am using stateful variables to generate an incremental batchid value with the UpdateAttribute processor. It runs in a cluster, and I have set the processor to run on all nodes. However, the batchid values are not generated incrementally: the processor sometimes skips values and sometimes generates a duplicate. Is this happening because the cluster restarted and the stateful variables' data was wiped? Could you please suggest how I can persist the stateful variables' data? I have attached the UpdateAttribute configuration for reference. Please suggest!
Labels: Apache NiFi