About johnmteabo

david686 · ‎11-09-2022

Can someone explain what dot-rename means? I'm using a PutSFTP processor in one of my flows and I'm getting an error about dot-rename.

MattWho · ‎04-13-2020

@Aminsh I am not sure where your response fits in to this thread. Are you asking a new question here? I recommend you start a new thread if that is the case. Thanks, Matt

MattWho · ‎06-13-2018

https://community.hortonworks.com/content/kbentry/109629/how-to-achieve-better-load-balancing-using-nifis-s.html

MattWho · ‎10-29-2018

@Bobby Harsono

Some processors may be designed to utilize memory outside of the JVM. Some of the scripting processors, like ExecuteProcess or ExecuteStreamCommand, are good examples. They call a process or script external to NiFi, and those externally executed commands have a memory footprint of their own.

Listen-type processors like ListenTCP or ListenUDP are another example. These have memory footprints both inside and outside the NiFi JVM heap space. They can be configured with a socket buffer, which is created outside of heap space.

Thanks, Matt
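To illustrate the socket buffer point, here is a minimal Python sketch (not NiFi code). It asks the operating system for a receive buffer on a UDP socket; that buffer is allocated by the OS rather than on the application's heap. The function name and the 1 MB size are assumptions chosen for illustration, and the OS may grant a different size than requested.

```python
import socket

def open_udp_listener_socket(requested_buffer_bytes=1024 * 1024):
    """Create a UDP socket and request an OS-level receive buffer.

    Returns the socket and the buffer size the OS actually granted.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The receive buffer lives in the OS, outside the process heap.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_buffer_bytes)
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted

sock, granted = open_udp_listener_socket()
print("OS granted a receive buffer of", granted, "bytes")
sock.close()
```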

sthompson · ‎05-07-2018

@John T I have recently built out an HDF environment for a Fortune 1 retail company to handle 1-2k connections per node and move an average of 1-1.5TB a day. We utilized the HandleHTTP processors as MiNiFi was not an option at project conception. If you are using the HandleHTTPRequest/Response processors, note that there is a bug which causes objects to not be released correctly causing heap utilization to climb in a linear fashion. Our workaround was to utilize the API to stop/start the HandleHTTPRequest processor when the heap reached 70%. This bug was corrected in the 1.6 release of NiFi but has not been rolled up into an HDF release since I last checked. So, handling that kind of volume will cause the same scenario in your situation. If you can use ListenHTTP (or MiNiFi as Matt suggested), you should be fine. We were utilizing external load balancers as we were running three clusters in separate data centers. The plan in the next phase is to start utilizing MiNiFi in the edge environments and point the different systems feeding data into HDF at those MiNiFi HTTP listeners. If you are running a single cluster, as Matt mentioned, that would load balance for you.
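The stop/start workaround described above could be sketched roughly as follows. The post does not say which endpoints or tooling were used, so the three callables are hypothetical stand-ins for whatever calls you make against the NiFi API, and the "72.5%" string format for heap utilization is an assumption.

```python
def parse_heap_utilization(value):
    """Convert a percentage string such as '72.5%' into a float (72.5)."""
    return float(value.strip().rstrip('%'))

def restart_if_heap_high(get_heap_utilization, stop_processor, start_processor,
                         threshold=70.0):
    """Stop and then start the processor once heap use reaches the threshold.

    get_heap_utilization, stop_processor and start_processor are hypothetical
    callables wrapping your own API client. Returns True if a restart ran.
    """
    heap = parse_heap_utilization(get_heap_utilization())
    if heap >= threshold:
        stop_processor()
        start_processor()
        return True
    return False

# Usage with stub callables standing in for real API calls
events = []
restarted = restart_if_heap_high(
    lambda: '72.5%',
    lambda: events.append('stop'),
    lambda: events.append('start'),
)
print(restarted, events)
```

In practice this check would run on a schedule, and the stop call would need to wait for the processor to finish stopping before the start call is issued.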

hodgkinsonjeffr · ‎03-09-2018

@Matt Burgess, @John T: I got this working in Python. It is my first ever such program, so it might be rough around the edges. The NiFi processor expects input from ListFile, not GetFile, because it uses zipfile, which wants a file on disk to read. The code:

    import zipfile
    from org.apache.nifi.processor.io import InputStreamCallback

    class ReadVersion(InputStreamCallback):
        def __init__(self):
            self.ff = None
            self.version = ''
            self.error = ''

        def process(self, inputStream):
            try:
                # Open the file from disk using the attributes set by ListFile
                zipname = self.ff.getAttribute('filename')
                zippath = self.ff.getAttribute('absolute.path')
                zfile = zipfile.ZipFile(zippath + zipname)
                for name in zfile.namelist():
                    if name == 'docProps/app.xml':
                        inFile = zfile.open(name)
                        inContents = inFile.read()
                        loc = inContents.find('<AppVersion>1')
                        if loc != -1:
                            # The digit after "<AppVersion>1" identifies the release
                            keyChar = inContents[loc+13:loc+14]
                            if keyChar == '2':
                                self.version = '2007'
                            elif keyChar == '4':
                                self.version = '2010'
                            elif keyChar == '5':
                                self.version = '2013'
                            elif keyChar == '6':
                                self.version = '2016'
                            else:
                                log.warn('Unexpected AppVersion value: ' + inContents[loc+12:loc+14])
            except:
                log.warn('exception thrown (is this really a zip file?)')
                self.error = 'error'

    ff = session.get()
    if ff is not None:
        callback = ReadVersion()
        callback.ff = ff
        session.read(ff, callback)
        if callback.version != '':
            ff = session.putAttribute(ff, 'MSVersion', callback.version)
        # Transfer exactly once: failure on error, success otherwise
        if callback.error == 'error':
            session.transfer(ff, REL_FAILURE)
        else:
            session.transfer(ff, REL_SUCCESS)

johnmteabo · ‎02-19-2018

Yes thank you matt!

johnmteabo · ‎09-08-2017

That's exactly what we needed! Thank you!!!

MattWho · ‎04-21-2017

@John T It sounded a lot like a back pressure scenario to me when you first described what was going on. Glad you were able to resolve your issue. I also saw your other post and commented on it.

MattWho · ‎04-20-2017

@John T If you are using the ListSFTP processor before your FetchSFTP processor, it will produce a zero-byte FlowFile for every file it finds on the target SFTP server. The ListSFTP processor has a "File Filter Regex" property where you can specify a Java regular expression to limit what is returned to just files containing "file123.txt", for example ".*file123\.txt". The ListSFTP processor also maintains state so that the same files are not listed each time, so only new files containing file123.txt are listed each time it runs. The FetchSFTP processor is designed to return the content of a specific file and insert it as the content of the FlowFile that the FetchSFTP processor is running against. Thanks, Matt
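A quick way to check a filter pattern before putting it into the processor is to try it against some sample filenames. The sketch below uses Python's `re` module for illustration only; the processor itself takes a Java regular expression, though this simple pattern is written the same way in both. It assumes the filter has to match the whole filename, and the function name and sample listing are made up.

```python
import re

def matching_files(filenames, file_filter_regex):
    """Return the filenames that the regex matches in full."""
    pattern = re.compile(file_filter_regex)
    return [name for name in filenames if pattern.fullmatch(name)]

listing = ['file123.txt', 'archive_file123.txt', 'file456.txt', 'file123.txt.bak']
print(matching_files(listing, r'.*file123\.txt'))
# ['file123.txt', 'archive_file123.txt']

# A glob-style pattern with a bare leading "*" is not a valid regex
try:
    re.compile('*file123.txt')
except re.error as exc:
    print('invalid pattern:', exc)
```

In a regex, `*` means "repeat the previous item", so the wildcard has to be written as `.*`, and the literal dot before `txt` is escaped as `\.`.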

Last Visited	‎03-04-2019 02:48 PM

Member Since	‎10-08-2016 04:38 PM
Posts	59
Kudos received	16

Cloudera Community

Re: nifi putfile dot rename?

Re: NiFi Clustering Issue ConnectionLoss Error

Re: GETSFTP with NiFi cluster

Re: Best NiFi Heap usage performance for Large Ser...

Re: 40 Gbps NiFi Cluster

Re: Unzip files in ExecuteScript NiFi processor

Re: PutTCP-ListenTCP NiFi to NiFi issue

Re: Ideas for In Order Processing in NiFi

Re: RouteOnAttribute Processor will not Process da...

Re: Does the FetchSftp Processor support wildcards...