Member since
11-16-2015
911
Posts
668
Kudos Received
249
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 704 | 09-30-2025 05:23 AM |
| | 1076 | 06-26-2025 01:21 PM |
| | 931 | 06-19-2025 02:48 PM |
| | 1103 | 05-30-2025 01:53 PM |
| | 12286 | 02-22-2024 12:38 PM |
10-31-2017
01:49 PM
The join() function, from the documentation, "may be used only in conjunction with the allAttributes, allMatchingAttributes, and allDelineatedValues functions". I think you want the append() function:
temp_${now():format("yyyy-MM-dd-HH-mm-ss"):append(${random():mod(10):plus(1)})}
I tested this with my EL tester and it seems to work. However, you might want to append a dash or underscore before the random number, as the above expression puts it immediately after the seconds, unless that's what you want.
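To see the shape of the name this kind of expression produces, here is a plain-Groovy stand-in (it is not NiFi EL itself). It assumes format() follows java.text.SimpleDateFormat patterns and that random() yields a non-negative number; the helper name tempName and the underscore separator are hypothetical, added only for illustration.

```groovy
import java.text.SimpleDateFormat

// Hypothetical helper mirroring temp_${now():format(...):append(...)} outside NiFi.
// Assumption: randomValue is non-negative, as random() is expected to return.
String tempName(Date now, long randomValue) {
    String stamp = new SimpleDateFormat('yyyy-MM-dd-HH-mm-ss').format(now)
    long suffix = (randomValue % 10) + 1   // mod(10):plus(1) gives 1 through 10
    // The underscore keeps the suffix visually separate from the seconds field
    return "temp_${stamp}_${suffix}".toString()
}

println tempName(new Date(), new Random().nextInt(1000))
```

Note that the suffix can be 10 as well as 1 through 9, so the resulting names are not fixed-width.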
10-30-2017
06:23 PM
1 Kudo
In later versions of NiFi, you may also consider using the "record-aware" processors and their associated Record Readers/Writers. These were developed to avoid this multiple-split problem, as well as the volume of provenance events generated by each split flow file in the flow.
10-30-2017
06:20 PM
1 Kudo
According to this, it looks like Teradata doesn't support the DOUBLE data type, and instead supports FLOAT as the JDBC type yet returns a Double object when getObject is called. When the Double is inserted into an Avro record whose field type is "float", you get the above error. This IMO is an issue with Teradata being out of spec with JDBC. Although NiFi could demote a Double into a float field, that is not prudent as there can be data/precision loss or other issues of which the user would never be made aware. A workaround might be to use a DECIMAL column instead of DOUBLE, or to cast the column(s) as @Shu suggested.
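As a small illustration of why silently demoting a Double into a float field is risky, the sketch below round-trips a value through float and checks whether anything was lost. The helper name fitsInFloat and the sample values are hypothetical; this is plain JVM arithmetic, not NiFi or Avro code.

```groovy
// Hypothetical check: does a double survive a round trip through float unchanged?
boolean fitsInFloat(double value) {
    float demoted = (float) value            // narrowing keeps roughly 7 significant digits
    return ((double) demoted) == value
}

double fromDatabase = 123456789.123d         // made-up value standing in for a column value
println "lossless as float? ${fitsInFloat(fromDatabase)}"
println "after demotion:    ${(double) ((float) fromDatabase)}"
```

A value such as 0.5 survives the round trip, while 0.1 or the sample above does not, and nothing at runtime would tell the user which case they hit.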
10-30-2017
05:05 PM
I don't currently see a processor that would do this for you, but if you are comfortable with a scripting language such as Groovy, JavaScript, or Clojure, you could use ExecuteScript with any of these libraries to infer the character set for an incoming stream. For more information on including these library JAR(s) in your ExecuteScript configuration, see my ExecuteScript Cookbook (part 3) and/or my separate blog post. Since this seems like a good feature to have in NiFi proper, I have created a Jira (NIFI-4550) to add an InferCharacterSet processor.
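If pulling in a detection library is not an option, a much cruder substitute is trial decoding with the JDK alone: try each candidate character set with a strict decoder and keep the first one that accepts the bytes. This is a different technique from statistical detection, it only works when you already have a short list of likely charsets, and the helper name firstCharsetThatDecodes is hypothetical. A minimal sketch:

```groovy
import java.nio.ByteBuffer
import java.nio.charset.CharacterCodingException
import java.nio.charset.Charset
import java.nio.charset.CodingErrorAction

// Returns the first candidate charset that decodes the bytes without error, or null.
// Order matters: single-byte charsets such as ISO-8859-1 accept any input,
// so stricter charsets like UTF-8 should be listed first.
String firstCharsetThatDecodes(byte[] bytes, List<String> candidates) {
    for (String name : candidates) {
        def decoder = Charset.forName(name).newDecoder()
        decoder.onMalformedInput(CodingErrorAction.REPORT)
        decoder.onUnmappableCharacter(CodingErrorAction.REPORT)
        try {
            decoder.decode(ByteBuffer.wrap(bytes))
            return name
        } catch (CharacterCodingException ignored) {
            // Not valid in this charset; try the next candidate
        }
    }
    return null
}

byte[] sample = 'caf\u00e9'.getBytes('ISO-8859-1')
println firstCharsetThatDecodes(sample, ['UTF-8', 'ISO-8859-1'])
```

Inside ExecuteScript you would feed this the flow file's content bytes instead of the sample array.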
10-26-2017
05:04 PM
2 Kudos
"Referencing Components" is done by the framework when you explicitly refer to a component from another, for example by selecting that DBCPConnectionPool from a drop-down list in PutSQL. With a scripting processor, the framework does not know what the script is doing, including whether it references a particular component or not. Also, DBCPConnectionPool does not log that it executes the Validation Query; that is performed by Apache DBCP "under the hood" when a connection is requested from the pool. What is very weird about your situation is that the Controller Service (DBCPConnectionPool) becomes disabled. That should not be related to whether the connection pool gives back good/bad connections. Is there an error bulletin or something in the logs from the DBCPConnectionPool itself or the framework (not the ExecuteScript)?
10-25-2017
08:57 PM
6 Kudos
Some NiFi Expression Language (EL) expressions can be fairly complex, or used in a large flow, or both, which can make it difficult to test an EL expression on a running NiFi system. Although an excellent feature of NiFi is being able to adapt the flow while the system is running, it may not be prudent to stop a downstream processor, reroute a connection to something like UpdateAttribute, then list the queue in order to see attributes, content, etc. To make EL testing easier, I wrote a Groovy script called testEL.groovy that uses the same EL library that NiFi does, so all functions present in the specified NiFi version are available to the test tool. The following is the usage:
usage: groovy testEL.groovy [options] [expressions]
Options:
 -D <attribute=value>   set value for given attribute
 -help                  print this message
As an example, the following tests an expression that appends "_world" to the "filename" attribute:
> groovy testEL.groovy -D filename=hello '${filename:append("_world")}'
hello_world
Note that it accepts multiple attribute definitions and multiple expressions, so you can test more than one expression using a single set of attributes:
> groovy testEL.groovy -D filename=hello -D size=10 '${filename:append("_world")}' '${filename:prepend("I say "):append(" ${size} times")}'
hello_world
I say hello 10 times
In order to attach the script to this post, I had to add a .txt extension (and the upload lowercased the name, giving testelgroovy.txt); simply rename it to testEL.groovy before running the above. Hopefully you find this script helpful. If you try it, please let me know how/if it works for you, and as always I welcome any questions, comments and suggestions on how to make things better 🙂 Cheers!
10-25-2017
08:30 PM
On the Scheduling tab, set Run Schedule to something like 30 seconds. You can then start the processor and stop it immediately, and it will run only once. I think GetFTP might be the better of the two options, unless there's some reason it doesn't work with your system.
10-25-2017
07:36 PM
Are you saying your FTP server does not support listing the top-level directory? It shouldn't matter if you don't have a directory structure, as long as the FTP server itself can respond to the list commands (MLSD and/or NLST, I think). Alternatively, if you know the filename(s) you want to fetch, you can do one of two things:
1) Use GetFTP rather than ListFTP -> FetchFTP, setting the File Filter Regex to include the files you want (perhaps .* for all).
2) Use GenerateFlowFile in place of ListFTP, setting the "filename" attribute to the file you want to fetch. This runs at the rate scheduled for GenerateFlowFile, and will generate the same filenames over and over, unless you are using Expression Language to set the filename at each execution.
Basically, FetchFTP needs an incoming connection to provide the "filename" attribute so it knows which file to fetch. GetFTP is kind of a combination of ListFTP -> FetchFTP.
10-25-2017
07:10 PM
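If you are waiting for X number of flow files to be received, you can use something like this (assuming you want 10 flow files):
// Ask the session for up to 10 flow files from the incoming queue(s)
def flowfileList = session.get(10)
if (flowfileList.size() < 10) {
    // Not enough yet: roll back so the flow files go back on the queue, then wait for the next run
    session.rollback()
    return
}
// If you get here, you have 10 flow files in flowfileList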
10-23-2017
05:06 PM
1 Kudo
You can use the Run Schedule property on the Scheduling tab of the processor to set the interval at which it will be scheduled to run. Assuming one event per run, 10k events per second means one run every 100 microseconds, so you can set it to "100000 nanos".
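As a rough check of the interval arithmetic, the sketch below converts a target rate into a run interval in nanoseconds. It assumes each scheduled run produces exactly one event, which is a simplification rather than something NiFi guarantees, and the helper name runScheduleNanos is hypothetical.

```groovy
// Hypothetical helper: nanoseconds between runs for a target events-per-second rate,
// assuming one event per scheduled run.
long runScheduleNanos(long eventsPerSecond) {
    long nanosPerSecond = 1_000_000_000L
    return nanosPerSecond.intdiv(eventsPerSecond) as long
}

println "${runScheduleNanos(10_000)} nanos"
```

Whether a processor can actually sustain a rate that high depends on the system and on how much work each run does, so treat the computed interval as a starting point.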