Member since: 11-16-2015
Posts: 911
Kudos Received: 668
Solutions: 249
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 705 | 09-30-2025 05:23 AM |
|  | 1076 | 06-26-2025 01:21 PM |
|  | 931 | 06-19-2025 02:48 PM |
|  | 1103 | 05-30-2025 01:53 PM |
|  | 12287 | 02-22-2024 12:38 PM |
10-05-2017
06:07 PM
I'm not sure what your custom GetFile does, but the existing GetFile has a "Keep Source File" option, which defaults to false. When false, the file is deleted from the file system once GetFile has processed it. When true, the source file remains in place and will therefore be picked up again the next time the processor is triggered to run. You may want to support this same behavior in your custom processor for your use case.
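If it helps, here's a minimal sketch (Groovy, purely illustrative -- the property wiring and delete logic are my assumptions for a custom processor, not the actual GetFile source) of how such a property might look:

```groovy
import org.apache.nifi.components.PropertyDescriptor

// Hypothetical "Keep Source File" property for a custom processor,
// mirroring the standard GetFile behavior described above.
final PropertyDescriptor KEEP_SOURCE_FILE = new PropertyDescriptor.Builder()
        .name('Keep Source File')
        .description('If true, the source file is left in place after it is ingested; ' +
                'if false (the default), it is deleted once it has been processed.')
        .required(true)
        .allowableValues('true', 'false')
        .defaultValue('false')
        .build()

// Later, in onTrigger(), after the file's content has been committed
// ("file" here is a hypothetical java.io.File for the source file):
// if (!context.getProperty(KEEP_SOURCE_FILE).asBoolean()) {
//     java.nio.file.Files.delete(file.toPath())  // remove the source file
// }
```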
09-16-2017
01:29 AM
1 Kudo
You should be able to use Line-by-Line as the Evaluation Mode in the ReplaceText processor, with \n (or \\n, if the backslash needs to be escaped) as the search value and | (or \|, if it must be escaped to be a valid regex for the pipe character) as the replacement value.
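ReplaceText uses Java regular expressions, so as a quick sanity check of the pattern semantics outside NiFi (plain Groovy):

```groovy
def input = 'line1\nline2\nline3'
// literal newline character as the pattern
assert input.replaceAll('\n', '|') == 'line1|line2|line3'
// escaped form: the regex engine itself interprets \n as a newline
assert input.replaceAll('\\n', '|') == 'line1|line2|line3'
```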
09-16-2017
01:26 AM
1 Kudo
Try || instead of CONCAT or +; the former is the SQL standard and the latter are not, according to this.
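For a quick illustration (Groovy with an in-memory H2 database, both chosen here just for the example):

```groovy
import groovy.sql.Sql

// In-memory H2 database (requires the H2 driver on the classpath),
// used only to demonstrate standard SQL string concatenation with ||.
def sql = Sql.newInstance('jdbc:h2:mem:concatDemo', 'org.h2.Driver')
def row = sql.firstRow("SELECT 'Hello' || ', ' || 'World' AS greeting")
assert row.greeting == 'Hello, World'
sql.close()
```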
09-05-2017
10:42 PM
2 Kudos
There could be a couple of things going on here; there is some discussion of each in the thread you mentioned:

1) The X-Pack JAR has multiple dependencies that are not included. When you install the X-Pack plugin into an Elasticsearch node, these dependencies are extracted and added to the ES path so the ES code can find them. On a NiFi node this must be done manually. Check the other thread for the X-Pack ZIP (not JAR); you will need to unzip that somewhere and point to the elasticsearch/ folder underneath it. Your "X-Pack Transport Location" property should be set to a comma-delimited list with two items: the transport JAR, and the elasticsearch/ subfolder that contains the X-Pack JAR and all its dependencies.

2) The Elasticsearch native client (used by all the ES processors that don't end in "Http") is VERY particular about versions: there is no guarantee that the one used by NiFi will be compatible with the ES cluster unless they share the same major and minor versions (I think dot releases -- X.Y.1 vs X.Y.2 -- are ok). PutElasticsearch5 ships with the 5.0.1 client, which means it should work with all ES 5.0.x clusters, but there is no guarantee it will work with a 5.5.x cluster. In fact I believe Elastic replaced the native client in 5.5 with a Java one that wraps the REST API. You can try the 5.0.1 X-Pack and Transport JARs (as one person on the other thread did) to see if that works.

If you don't require the native client, you may be better served by using PutElasticsearchHttp and enabling TLS/SSL for your Elasticsearch cluster. This (plus setting up access controls for authorization) should give you a robust way to deal with secure Elasticsearch clusters of any version. With that approach you can also have X-Pack installed on your ES cluster but interact with it from NiFi using the Http versions of the processors; this is how you'd interact with other X-Pack capabilities such as Marvel and Watcher. In that case you shouldn't need the X-Pack plugin or the transport JAR on the NiFi node, since you won't be using the native client if you use PutElasticsearchHttp.
08-30-2017
06:03 AM
If you want to include NPM modules, check this link for more details on how to use them with Nashorn.
08-29-2017
03:31 PM
1 Kudo
Where did you get ImportSqoopFull? To my knowledge that processor is in neither Apache NiFi nor HDF NiFi. Wherever you got it, hopefully there is some documentation or code there to help you with your issues.
08-29-2017
02:27 PM
You may be better served by ExecuteStreamCommand rather than ExecuteProcess for this case. You could schedule a GenerateFlowFile at the same rate your ExecuteProcess was scheduled for, and set Ignore STDIN to true in ExecuteStreamCommand. The outgoing flow files will then have the execution.status attribute set, which you can use with RouteOnAttribute to handle failures (e.g., non-zero exit codes).

If you must use ExecuteProcess, perhaps you could run your shell command followed by || and a command that prints something you can check for later, such as: myCommand || echo "!ERROR!" I haven't tried this, so I don't know whether it would work, but if it does it would allow you to use RouteOnContent to check for that error string to indicate failure. The same technique works without the || if you know what to look for in your failed command's output.
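For the RouteOnAttribute approach, the rules might look like this (dynamic properties using NiFi Expression Language; the "success"/"failure" property names are just illustrative):

```
success: ${execution.status:equals('0')}
failure: ${execution.status:equals('0'):not()}
```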
08-24-2017
06:25 PM
In QueryDatabaseTable, you'd set the Maximum-Value Column to "id" and add a dynamic property named "initial.maxvalue.id" with a value of 50. Make sure state has been cleared before running; the first time the processor executes, it will grab all rows with id > 50. The same capability for GenerateTableFetch is not yet available (NIFI-4283) but is coming soon.
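Concretely, the relevant QueryDatabaseTable settings would be:

```
Maximum-Value Column : id
initial.maxvalue.id  : 50    (dynamic/user-defined property)
```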
08-24-2017
02:21 PM
Since you're using a script, you could feasibly replace steps 4-7 above, but my blog post (plus adding your attribute) really only covers steps 5 and 6.
08-22-2017
03:10 PM
Step 3 (SplitJson) is kind of a "barrier" processor: it sends out all the flow files only after they have been split (in order to add fragment.count to each one). Steps 4 and 5 must then be executing so quickly that each flow file gets the same query_startTime value. It sounds like you'd like to set query_startTime just as ExecuteSQL is about to execute the statement. Unfortunately I don't believe this is possible with ExecuteSQL (although please feel free to write a Jira to add this improvement). If you are comfortable with a scripting solution such as Groovy, check out my blog post on how to do SQL with Groovy in NiFi's ExecuteScript processor. You could modify that to add the query_startTime attribute just before calling the sql.rows() statement in the script.
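For reference, a rough sketch of what that modification might look like in an ExecuteScript (Groovy) body -- the connection URL, driver, and query are placeholders, and the structure loosely follows the blog post rather than reproducing it:

```groovy
import groovy.sql.Sql
import org.apache.nifi.processor.io.OutputStreamCallback

// Assumes an incoming flow file to carry the attribute downstream.
def flowFile = session.get()
if (!flowFile) return

// Placeholder connection details -- substitute your own driver/URL.
def sql = Sql.newInstance('jdbc:h2:mem:demo', 'org.h2.Driver')
try {
    // Stamp the attribute immediately before the query actually runs.
    flowFile = session.putAttribute(flowFile, 'query_startTime',
            String.valueOf(System.currentTimeMillis()))
    def rows = sql.rows('SELECT * FROM my_table')  // placeholder query
    flowFile = session.write(flowFile, { outputStream ->
        outputStream.write(groovy.json.JsonOutput.toJson(rows).bytes)
    } as OutputStreamCallback)
    session.transfer(flowFile, REL_SUCCESS)
} catch (e) {
    session.transfer(flowFile, REL_FAILURE)
} finally {
    sql.close()
}
```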