Member since: 09-29-2015
Posts: 871
Kudos Received: 723
Solutions: 255
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4159 | 12-03-2018 02:26 PM |
 | 3118 | 10-16-2018 01:37 PM |
 | 4248 | 10-03-2018 06:34 PM |
 | 3072 | 09-05-2018 07:44 PM |
 | 2333 | 09-05-2018 07:31 PM |
03-02-2018
02:53 PM
1 Kudo
I've answered this on Stack Overflow: https://stackoverflow.com/questions/49059136/nifi-java-lang-nosuchmethoderror-org-apache-hadoop-conf-configuration-reloadexi
03-01-2018
07:37 PM
Is it possible that your content does not contain any newline characters like \n or \r? I'm wondering whether ReplaceText relies on newlines when it works line-by-line; if it never encounters one, it may end up loading the entire content at once, which is not what we want.
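To illustrate the failure mode, here is a standalone Java sketch (not NiFi's actual ReplaceText implementation): line-oriented reading degenerates into whole-content reading when the input contains no line breaks.

```java
import java.io.BufferedReader;
import java.io.StringReader;

public class NoNewlineDemo {
    public static void main(String[] args) throws Exception {
        // 10 MB of content with no \n or \r anywhere
        String content = "a".repeat(10_000_000);
        BufferedReader reader = new BufferedReader(new StringReader(content));
        // readLine() scans for a newline and, finding none, returns the
        // entire content as one "line" -- so a line-oriented processor
        // would have to buffer all of it at once.
        String line = reader.readLine();
        System.out.println("First 'line' length: " + line.length());
    }
}
```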
03-01-2018
05:48 PM
1 Kudo
Is ReplaceText configured as shown above with Line-By-Line and a 1MB buffer?
03-01-2018
04:33 PM
OK, GetFile -> ReplaceText -> PutFile should be fine, it will just take a long time 🙂

- GetFile streams the file from the source location to NiFi's internal content repository
- ReplaceText reads line-by-line from the content repo and writes line-by-line back to the content repo
- PutFile streams from the content repo to local disk
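The key point is that each step streams with a small fixed buffer. A minimal Java sketch of that pattern (an illustration of the streaming idea, not NiFi's repository code, with hypothetical paths):

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamCopy {
    public static void main(String[] args) throws Exception {
        Path src = Paths.get("source/big.txt");   // hypothetical paths
        Path dst = Paths.get("target/big.txt");
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dst)) {
            byte[] buffer = new byte[8192]; // fixed-size buffer: heap use stays flat
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }
}
```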
03-01-2018
02:53 PM
You don't necessarily need a heap larger than the file unless you are using a processor that reads the entire file into memory. Most processors should not do that unless absolutely necessary, and if they do, they should document it.

In your "list-->fetch-->splittext-->replacetext-->mergecontent" approach, the issue is that you are splitting a single flow file into millions of flow files. Even though the content of all these flow files won't be in memory, it's still millions of Java objects on the heap. Whenever possible you should avoid this splitting approach.

Instead, use the "record" processors to manipulate the data in place and keep your 22 GB as a single flow file. I don't know exactly what you need to do to each record, but most likely, after your fetch processor, you just need an UpdateRecord processor that streams one record in, updates a field, and streams the record out. It never loads the entire content into memory and never creates millions of flow files.
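A minimal Java sketch of the record-streaming idea behind UpdateRecord, assuming a simple one-record-per-line CSV and a hypothetical field update (NiFi's actual record readers and writers handle schemas and formats for you):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RecordStreamSketch {
    public static void main(String[] args) throws Exception {
        try (BufferedReader reader = Files.newBufferedReader(
                 Paths.get("records.csv"), StandardCharsets.UTF_8);   // hypothetical file
             BufferedWriter writer = Files.newBufferedWriter(
                 Paths.get("updated.csv"), StandardCharsets.UTF_8)) {
            String record;
            while ((record = reader.readLine()) != null) {
                // one record in memory at a time; never millions of objects at once
                String[] fields = record.split(",", -1);
                if (fields.length > 1) {
                    fields[1] = fields[1].trim().toUpperCase(); // hypothetical update
                }
                writer.write(String.join(",", fields));
                writer.newLine();
            }
        }
    }
}
```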
02-28-2018
03:18 PM
There is also a GrokReader for the record processors, and its additional-details documentation lists the default patterns it uses. See the end of this page: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.5.0/org.apache.nifi.grok.GrokReader/additionalDetails.html
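For example (a hypothetical Grok Expression, not from the original thread), configuring the GrokReader with `%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}` would parse each log line into timestamp, level, and message fields, all built from those default patterns.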
02-21-2018
01:33 PM
Have you added any additional JARs to NiFi's lib directory? This looks like something is on the classpath that shouldn't be.
02-21-2018
01:31 PM
1 Kudo
In authorizers.xml you have "Initial User Identity 1" and "Initial User Identity 2" for your two node identities; you need to add another one for your initial admin. You may need to delete users.xml and authorizations.xml before trying again, in case they were already created in a bad state.
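For illustration, the relevant userGroupProvider block in authorizers.xml might look like the sketch below, with hypothetical DNs:

```xml
<userGroupProvider>
    <identifier>file-user-group-provider</identifier>
    <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
    <property name="Users File">./conf/users.xml</property>
    <property name="Initial User Identity 1">CN=nifi-node-1, OU=NIFI</property>
    <property name="Initial User Identity 2">CN=nifi-node-2, OU=NIFI</property>
    <!-- added for the initial admin; must match the Initial Admin Identity
         configured in the accessPolicyProvider -->
    <property name="Initial User Identity 3">CN=admin, OU=NIFI</property>
</userGroupProvider>
```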
02-12-2018
04:54 PM
2 Kudos
OK, how about ExecuteStreamCommand, which accepts incoming flow files?
02-11-2018
03:40 PM
As far as I know, none of the HTTP processors in NiFi support Kerberos authentication, so I don't think you'll be able to do the first idea. For the second idea, you should be able to use the ExecuteProcess processor; its Command Arguments property supports expression language.
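As a hypothetical sketch (these property values are illustrative, not from the original thread), ExecuteProcess might be configured along these lines, with expression language in the arguments:

```
Command:            curl
Command Arguments:  --negotiate -u : http://${hostname(true)}:8080/some/endpoint
```

Here `${hostname(true)}` is a NiFi expression-language function that resolves to the node's fully qualified hostname; since ExecuteProcess takes no incoming flow file, expressions are evaluated against the environment and variable registry rather than flow-file attributes.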