Member since: 07-30-2019
Posts: 105
Kudos Received: 129
Solutions: 43
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 762 | 02-27-2018 01:55 PM |
| | 1240 | 02-27-2018 05:01 AM |
| | 3096 | 02-27-2018 04:43 AM |
| | 665 | 02-27-2018 04:18 AM |
| | 1906 | 02-27-2018 03:52 AM |
11-07-2019
06:17 AM
1 Kudo
It is possible this is related to https://issues.apache.org/jira/browse/NIFI-6846. That fix has been merged into the Apache NiFi master branch but has not yet been included in a release. If you're a Cloudera-supported user, please reach out to support about this.
02-17-2019
01:51 AM
This appears to have been addressed in https://issues.apache.org/jira/browse/NIFI-5795.
10-16-2018
08:31 PM
There are a lot of great blogs/docs on the NiFi record readers/writers and associated processors. For the Hive part specifically, we'll want someone more familiar with the required Hive metastore calls to give guidance.
10-16-2018
05:24 PM
Hello. NiFi's current handling of schema changes, relative to the schema of the downstream Hive table, is that it will not send data that is not reflected in the downstream schema. So if the upstream data changes by adding a simple column, NiFi will be fine and the flow to Hive should continue. If you want the new column reflected, you need to update the schema in an out-of-band process, either directly in Hive (see the sketch below) or by establishing a flow that automates pushing schema updates to Hive.

It would be a fine feature request to add the ability to optionally automate aligning the schema of the flowing data with the Hive table schema, so that as new columns arrive we can send them. It would be good to hear the thinking on this in general as it relates to changes such as type changes, columns being removed, how many new columns would be considered odd, etc. There are definitely some problematic aspects to this idea, but for the safe cases it could be helpful. This would be good to discuss with the Hive team, as it is specific to the NiFi/Hive integration. NiFi in general has always handled such cases easily, and with the record processors we can evolve with the schemas automatically, in a Schema Registry compliant manner, without you having to change code or configuration to leverage it.
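For the out-of-band route, a minimal sketch of the schema update from the command line might look like the following (the connect string, table name, and column name are all placeholders, not from your environment):

```bash
# Hypothetical out-of-band schema update so Hive reflects a new upstream column;
# until something like this runs, NiFi simply won't send the new column's values.
beeline -u "jdbc:hive2://hive-host:10000/default" \
  -e "ALTER TABLE events ADD COLUMNS (new_col STRING);"
```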
06-06-2018
09:06 PM
What happens when you pass that to FetchS3Object? My first thought here is that ListS3 should not be producing output flowfiles for anything other than retrievable objects/files; if it is, that is either a bug or a mode we should support so that the directories/buckets themselves aren't listed, only their content.
04-13-2018
07:27 PM
1 Kudo
Since the incoming data is JSON documents concatenated together with newlines, the first thing you can do is use SplitText to split on newlines. Each resulting flowfile is then a single JSON document, at which point you can use all kinds of fun NiFi processors on it (see the example below). The suggestion above to use MergeContent would, I think, head down the opposite path from what the question was asking, but perhaps I understood it incorrectly.
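To make the input concrete, here is some made-up sample data of the kind SplitText would break apart; the file name and JSON contents are hypothetical:

```bash
# Hypothetical sample of the newline-concatenated JSON input
cat <<'EOF' > concatenated.json
{"id": 1, "event": "login"}
{"id": 2, "event": "logout"}
EOF
# With SplitText's Line Split Count set to 1, each line above
# becomes its own flowfile containing one JSON document
```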
02-27-2018
02:00 PM
@Gaurang Shah Hello there. You will really want to take a look at HDF 3.1, which launched just before you sent this post. It includes the new Apache NiFi Registry, which makes this case very easy and moves well beyond wrestling with templates.
02-27-2018
01:57 PM
@Peter Kotula The reporting task API does not have access to cluster details; it is intended as a way for a NiFi node to publish information it knows about itself. To monitor the health/status of the cluster, the HTTP-based (REST) API is the intended mechanism. It would be interesting to hear why you're ruling that out.
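For example, a quick way to see what the REST API exposes about the cluster (host and port here are placeholders; a secured instance would also need credentials or certificates):

```bash
# Returns the cluster's node list, including each node's status and heartbeat info
curl -s "http://nifi-host:8080/nifi-api/controller/cluster"
```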
02-27-2018
01:55 PM
There are no real additional reasons. The key reason for ListenSyslog is that it understands syslog message framing on top of a raw TCP socket. Otherwise, it is not much different from ListenTCP or ListenUDP.
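As a quick illustration of that framing, a message in standard syslog format can be pushed at ListenSyslog with nc; the host and port are placeholders for whatever you configured on the processor:

```bash
# RFC 3164-style syslog message: <priority>timestamp host tag: body
echo '<34>Oct 11 22:14:15 myhost myapp: test message' | nc -w 1 nifi-host 514
```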
02-27-2018
01:54 PM
It is important that the script you're invoking will always terminate on its own. NiFi cannot reliably kill threads that it has handed out to components. The community has a feature in progress that will provide a way to work around these cases, but the real solution is to ensure that the invoked stream command always terminates properly.
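One defensive pattern, assuming the GNU coreutils timeout command is available on your system, is to wrap the real command in a script that guarantees termination; the path and durations below are placeholders:

```bash
#!/bin/bash
# Wrapper invoked by the processor: force the real command to terminate.
# SIGTERM after 60s; if it still hasn't exited 10s later, SIGKILL it.
exec timeout --kill-after=10s 60s /opt/scripts/my_stream_command.sh "$@"
```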
02-27-2018
01:50 PM
I just verified on the latest build that setting that property does not result in any validation errors. It is possible there was an issue that has since been resolved. You might want to try HDF 3.1.
02-27-2018
05:01 AM
@Hemantha kumara We have a Docker container available, as you note, and it works fine as a way to launch a single-node NiFi. We don't have any published Kubernetes-specific configurations at this time, but to your question of whether we plan to support that: yes, we do. We cannot commit in this forum to any kind of timeline, or even that we'll definitely do it, but it is safe to say it is a direction of high interest.
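For reference, launching that single-node container is as simple as the following, assuming the apache/nifi image and the default unsecured HTTP port:

```bash
# Run a single-node NiFi and expose the UI at http://localhost:8080/nifi
docker run -d --name nifi -p 8080:8080 apache/nifi
```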
02-27-2018
04:55 AM
If you tried to extend an existing processor, it is possible (even likely) your nar includes many things you didn't intend or shouldn't have in it, which can effectively pollute the bundling. If you want to extend another processor or component in NiFi, do so either by copying the code rather than using Maven to pull in the same libraries plus the component library and then doing normal Java extension, or by doing the extension within the original bundle itself. Does your nar have a listed dependency on the dbcp nar?
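Since a nar is just a zip archive, a quick way to check what actually got bundled is to list its contents; the nar path here is hypothetical:

```bash
# List the jars packed inside the nar; unexpected entries suggest polluted bundling
unzip -l target/my-custom-processors-nar-1.0.nar | grep '\.jar'
```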
02-27-2018
04:48 AM
I'm not familiar with any of those format extensions, but assuming you have Java libraries to operate on them, you can integrate them nicely into NiFi. Many of the standard processors can be used as great examples to get you going. Format/schema transformation is a really common use case for NiFi.
02-27-2018
04:43 AM
1 Kudo
Network issues can certainly be a factor. However, you might also want to ensure you use the precise Kafka client for the given Kafka broker version. Since you're on Kafka 0.11, you might want NiFi 1.5.0 or HDF 3.1.0, which supports that broker version directly via ConsumeKafka_0_11.
02-27-2018
04:40 AM
Hello @Henrik Olsen. You should not need to do additional file integrity checks beyond what the transport protocols do for you. However, on your question about List/Fetch and guarantees that the data is done being written: there are no guarantees. The most reliable model for resolving race conditions in file IO between the producer (the thing writing the file) and the consumer (the thing grabbing the file) is to use file naming techniques, such as prepending the file name with a '.' while writing and removing it when the write is complete. If you cannot establish such a model, then you can resort to more complicated techniques like fetching listed files only after some intentional/artificial delay.
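A minimal sketch of the producer-side naming technique; the directory, file name, and produce_data command are all placeholders:

```bash
# Write under a dot-prefixed name, then rename when complete.
# A List/Fetch flow configured to ignore hidden/dot files never sees a partial
# file, and the rename is atomic on the same filesystem.
name="data_$(date +%s).csv"
produce_data > "/landing/.${name}"     # produce_data stands in for your real writer
mv "/landing/.${name}" "/landing/${name}"
```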
02-27-2018
04:36 AM
1 Kudo
Hello @ANDREJ KUZMIN. You'll want to look at PutElasticSearchRecord, which allows for a nice micro-batching/higher-efficiency path.
02-27-2018
04:23 AM
1 Kudo
You'll want to review the heap dump further, but it sounds highly likely that FlowFile objects are being built up in large numbers (tens or hundreds of thousands and beyond) within the ProcessSession, which can quickly take up a lot of heap. The heap usage there is not the content of the flowfiles but their attributes, and that can still add up fast. Ensure you're committing the session frequently to move those along, so the session isn't tracking too many flowfiles at once.

Having said that, you should also consider using ConsumePulsarRecord instead of plain ConsumePulsar, for example. Going through the record model will dramatically outperform the alternative unless you do your own raw record framing, such as newline delimiting, where a single flowfile represents many records at once. If each flowfile is a single record, just be sure you're committing the session frequently.
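To confirm what is actually filling the heap, a dump taken while the flow is under load is the most direct evidence; replace <nifi-pid> with the actual NiFi process id:

```bash
# Capture only live objects in binary format, for analysis in a tool like Eclipse MAT
jmap -dump:live,format=b,file=nifi-heap.hprof <nifi-pid>
```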
02-27-2018
04:18 AM
3 Kudos
You can use a ListSFTP -> FetchSFTP -> PutFile flow in NiFi to grab the files from wherever you store the master copy of the configs. This will have NiFi keep itself up to date, and you can point your Hadoop resources at the location where the PutFile writes.
02-27-2018
04:10 AM
1 Kudo
@Adrian Oprea great video, and that really helps eliminate a lot of questions. The result of this part of the command, $(ps -ef | grep -v grep | grep kibana | wc -l), does not appear to match between executing it at a bash prompt and in the NiFi environment. You might want to run only that part of the command in your script to see what ends up in the attribute in NiFi. Also, make sure you're running in the shell you expect; you might want #!/bin/bash at the top of the script, for example.
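In other words, something like this hypothetical script would isolate just that piece for comparison:

```bash
#!/bin/bash
# Print only the process count, so we can compare what NiFi's environment
# sees against what an interactive bash prompt sees.
ps -ef | grep -v grep | grep kibana | wc -l
```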
02-27-2018
03:52 AM
1 Kudo
I believe you'll want to run the HDF build of NiFi, which has libraries tailored to work with HDP Hive.
02-26-2018
02:10 PM
2 Kudos
Hello there Chris. While not a direct answer to your question, the community made the NiFi 1.x release line available in August 2016. In the latest release on the 1.x line (1.5.0), the community introduced the Apache NiFi Registry. This provides a really powerful and well-integrated way to store versioned flows in a central registry, which you can use to get good SDLC behavior from dev, to staging, to prod, and which handles things like sensitive properties and process-group-level variables well. It also lets you have nested versioned groups, which is really useful for multi-tenant/team cases.
12-26-2017
03:43 PM
2 Kudos
Yes, this solves the original issue of this thread (promptForName). What is happening is that the JDK/JRE security code is allowing a search for other methods to obtain the principal after a failure has occurred, while a retry is being blocked, most likely due to insufficient time. We've spent a considerable amount of time debugging this condition.

The documentation explaining the system property's meaning/role is here: https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/single-signon.html. Specifically, read the 'Exceptions to the Model' section, where this property is described. Setting it ensures the JDK/JRE does not attempt any methods/mechanisms other than what we've said we want. In particular, it avoids the scenario where it tries to prompt the user to supply a name at the command prompt, which would obviously never work; worse yet, when that happens our thread is stuck until a restart. So, yes, add this system property and you should be in far better shape with regard to the prompt-for-name issue.
12-26-2017
02:55 PM
5 Kudos
Hello @Tarun Kumar. Set the system property 'javax.security.auth.useSubjectCredsOnly' to true. To configure it this way in NiFi, you can add this line, for example, to your nifi/conf/bootstrap.conf file:

java.arg.101=-Djavax.security.auth.useSubjectCredsOnly=true
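One way to add it, assuming java.arg.101 isn't already taken in your bootstrap.conf (the index just needs to be unique among the java.arg entries):

```bash
# Append the JVM argument; NiFi must be restarted for it to take effect
echo 'java.arg.101=-Djavax.security.auth.useSubjectCredsOnly=true' >> nifi/conf/bootstrap.conf
```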
06-21-2017
08:07 PM
1 Kudo
Any component, custom or not, which does not respond in a timely manner to lifecycle calls, such as when it is unscheduled, will do this. I've seen it quite a bit lately as well. We should consider listing the titles of the offending threads or something similar, as that would help spot the culprit pretty quickly.
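Until something like that exists, a thread dump is the quickest way to spot the stuck component; this assumes you're in the NiFi install directory, and the processor name is a placeholder:

```bash
# Write a thread dump to a file, then look for threads tied to the suspect processor
./bin/nifi.sh dump thread-dump.txt
grep -i 'MyCustomProcessor' thread-dump.txt
```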
06-21-2017
01:58 PM
3 Kudos
At present you can consume messages from a Kafka cluster that were encoded using the Confluent Schema Registry serializers. However, we would read them in raw byte form, and the data would not be very useful unless you can handle that elsewhere in the flow.

With the HDF 3.0 release we now provide support for schema registries in general, including a built-in simple schema registry in Apache NiFi and the ability to leverage the Hortonworks Schema Registry. Rather than pushing such logic into Kafka-specific serializers and deserializers, we have a more powerful and broadly applicable reader and writer mechanism that is both format and schema aware, without processors having to worry about anything other than the Record objects that the readers and writers deserialize and serialize. So, I said all that to say that a good logical next step for us is to consider adding support for the Confluent Schema Registry. There is a ticket for this work in the community, and hopefully it will be progressed soon.
04-04-2017
08:28 PM
Can you please share the details of what was being put? Perhaps explain/show what the content of the failed flow file was?
01-26-2017
04:08 PM
2 Kudos
The stack trace shows "connection reset by peer". There are some good explanations of what this tells us on the Internet, but the moral of the story is that the connection NiFi was writing to was closed, and NiFi was notified of that. It happened while NiFi was trying to write the response, which is an exceptional condition, so you get this stack trace. I think we'd need to understand the systems involved in this web request/response cycle to diagnose much further.
01-16-2017
03:54 PM
6 Kudos
Hello @Arsalan Siddiqi. These are some excellent questions and thoughts regarding provenance. Let me try to answer them in order.

ONE: The Apache NiFi community can definitely help you with questions on the specific timing of releases and what will be included. I do know there is work underway around Apache NiFi's provenance repository so that it can index even more event data per second than it does today. Exactly when this ends up in a release is subject to the normal community process of the contribution being reviewed and merged. That said, there is a lot of interest in higher provenance indexing rates, so I'd expect it in an upcoming release.

TWO: The current limitation we generally see is related to what I mention in ONE. That is, the provenance indexing rate becomes a bottleneck on the overall processing of data, because we apply backpressure to ensure the backlog of provenance indexing doesn't grow unbounded while more and more event data is processed. We are first going to make indexing faster. There are other techniques we could try later, such as indexing less data, which would make indexing far faster at the expense of slower queries; that tradeoff might make sense.

THREE: Integration with a system such as Apache Atlas has been shown to be a very compelling combination here. The provenance that NiFi generates plays nicely with the type that Atlas ingests. If more and more provenance-enabled systems report to Apache Atlas, it can become the central place to view such data and see what other systems are doing, and thus give the system-of-systems view that people really need. To truly prove lineage across systems, some cryptographically verifiable techniques would likely need to be employed.

FOUR: The provenance data at present is prone to manipulation. In Apache NiFi we have flagged future work to adopt privacy-by-design features, such as those that would help detect manipulated data, and we're also looking at solutions for keeping distributed copies of the data to help with loss of availability.

FIVE: It is designed for extension in parts. You can, for example, create your own implementation of a provenance repository, and you can create your own reporting tasks that harvest data from the provenance repository and send it to other systems as desired. At the moment it is not open for creating additional event types; we're intentionally keeping the vocabulary small and succinct.

There are so many things left we can do with this data, in Apache NiFi and beyond, to take full advantage of what it offers the flow manager, the systems architect, the security professional, etc. There is also some great inter- and intra-system timing data that can be gleaned from it. Systems like to brag about how fast they are... provenance is the truth teller. Hope that helps a bit. Joe