Member since: 04-13-2018
Posts: 44
Kudos Received: 0
Solutions: 0
02-13-2022
11:14 PM
1 Kudo
@yamaga, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
01-10-2019
03:20 PM
@Gillu Varghese
I would inspect your content repository to see if the referenced claim (StandardResourceClaim[id=XXX, container=default, section=490], offset=0, length=190) still exists. Within the content_repository directory, look for the sub-folder "490"; within that folder, look for a file named XXX (assuming you replaced the actual claim number here with XXX). It sounds like this file may have been deleted.

A few things to check: Do you have some external process that may be accessing your content repository? Was the content repository maybe moved? Do you have a multi-node NiFi cluster where every node is trying to share the same mounted content repo? Was NiFi restarted as a different user? That could leave some files in the repo owned by different users, which may lead to permission issues when accessing those files.

FlowFiles are what move from processor to processor. The FlowFile metadata (stored in the flowfile repository) includes the size and location of the physical content within one of the content repositories ("default" in this case). Here, the FlowFile reached a processor that actually needed to retrieve that content, but the content could not be found.

Thank you, Matt
01-16-2019
01:55 PM
@Gillu Varghese
Have you considered upgrading to NiFi 1.8 to take advantage of the load-balanced connection capability? I am assuming your script is executing on each node in your cluster, so the script is essentially looking for 50 FlowFiles on each node, which would explain why it just sits there. I am not a Groovy script writer, so I am of little help there.

The only other option that comes to mind is incrementing a value in a DistributedMapCache server per node. Then have a side flow that constantly checks the sum of those cache values until it equals 50. That flow then sends the notification that all 50 files were written and resets the per-node cache values back to zero.

Flow 1: --> PutSFTP --> FetchDistributedMapCache (get the current stored value for this node) --> ReplaceText (replace content with the retrieved value + 1) --> PutDistributedMapCache (write the new value to the cache)

Flow 2: GenerateFlowFile (primary node only) --> FetchDistributedMapCache (x3, to retrieve the stored cache value for each node) --> RouteOnAttribute (add a relationship for when the sum of all cache values equals 50; terminate unmatched) --> PutEmail (notification)

Thanks, Matt
12-03-2018
04:13 PM
@Gillu Varghese
A few questions:
1. Are you sure all 136 files are reaching the MergeContent processor's inbound connection within 5 minutes? The bin age starts when the very first FlowFile is added to a bin; 5 minutes from that point, the bin will be merged even if not all 136 have arrived.
2. Is your NiFi a cluster or a standalone instance? If a cluster, are all 136 FlowFiles on the same node? Each node in a cluster can only merge FlowFiles residing on that node. The new load-balanced connection feature in NiFi 1.8 can help here if this is the case: https://blogs.apache.org/nifi/entry/load-balancing-across-the-cluster

Try setting your max bin age to a much higher value and see what results you get.

Thank you, Matt
10-26-2018
11:57 AM
@Matt Burgess There is only a single task, and it is not importing CPython libraries. This job was working fine for 6 months, and all of a sudden it started failing with the error above in the ExecuteScript processor.
08-14-2018
04:28 AM
@Gillu Varghese
Both cron triggers in the screenshot are the same; you can use either of them for scheduling. We cannot trigger right up to 3 AM; the latest we can trigger with one cron expression is 2:59:59 AM.
08-08-2018
12:20 PM
@Gillu Varghese
If you are using Method 1, the ListFile processor outputs FlowFiles with the filename attribute already associated with them. In Method 2, we list the files in the directory ourselves and then set the filename attribute on the FlowFile; the FetchFile processor then fetches the files from the directory. Both methods work even when the filenames keep changing, because the filename attribute is added to the FlowFile dynamically.
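P.S. A small illustration of why this works, assuming FetchFile's "File to Fetch" property is set to the commonly used expression-language value:

    # Resolved against each FlowFile's attributes, so changing filenames
    # require no changes to the flow itself:
    file_to_fetch = "${absolute.path}/${filename}"

Since ListFile (Method 1) writes absolute.path and filename for every file it lists, each FlowFile carries everything FetchFile needs.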
04-27-2018
06:11 AM
@Matt Clarke Thanks, Matt, for the information and for helping out. It worked!
04-27-2018
12:38 PM
@Gillu Varghese
Keep in mind how JVM heap space works. At a very high level, objects in heap are not cleared out the moment they are no longer used. So a FlowFile's attributes exist in heap while it is queued, and when that FlowFile no longer exists in the flow (it reached the end of the flow, for example), that heap space is likely to still be occupied. It is the job of Java Garbage Collection (GC) to free unused heap space, so once heap utilization is high enough that the JVM needs free space, GC will run to create it.

So even after running a heavy flow, with no FlowFiles left anywhere in your dataflows, you may still observe high reported heap usage. That is normal and expected.

Thanks, Matt

If you found this answer addressed your original question, please take a moment to log in and click "accept".