Member since: 11-16-2015
Posts: 911
Kudos Received: 668
Solutions: 249
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 703 | 09-30-2025 05:23 AM |
| | 1076 | 06-26-2025 01:21 PM |
| | 931 | 06-19-2025 02:48 PM |
| | 1102 | 05-30-2025 01:53 PM |
| | 12283 | 02-22-2024 12:38 PM |
11-28-2017
06:42 PM
What version of NiFi are you using? Is the "value" column in your database table a String or a Float/Double? What processor(s) are you using to read from the database? If using ExecuteSQL, could you do something like the following?

```sql
SELECT metric, CAST(value AS DOUBLE) AS value, timestamp, tags FROM myTable
```

Alternatively, as of NiFi 1.2.0 (HDF 3.0) you can use the JoltTransformJSON processor to do type conversion (see an example here). Also, if you know what the schema is supposed to be, you could use ConvertRecord with a JsonRecordSetWriter associated with the "correct" schema. The reader can be an AvroReader which uses the Embedded Schema.
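For the JoltTransformJSON approach, a minimal spec sketch (assuming the field is named "value" as above) using the modify-overwrite-beta operation's toDouble function:

```json
[
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "value": "=toDouble"
    }
  }
]
```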
11-28-2017
06:37 PM
Is it possible to share your nifi-app.log on this question? Also, does this driver work from other utilities (Squirrel SQL, e.g.)?
11-27-2017
04:21 PM
Is there anything else underneath that stack trace in nifi-app.log? There is usually a "Caused by" section with a ClassNotFoundException or something like that.
11-16-2017
04:53 PM
You can do it without a schema registry if your readers and writers use the "Use 'Schema Text' Property" strategy and you hardcode the schema into the Schema Text property. Since you're using the same schema for both reader and writer, it's easier to maintain in a registry, but it's only a simple copy-paste if you'd rather not use one.
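As an illustration (the record and field names here are hypothetical), the same Avro schema text could be pasted into both the reader's and the writer's Schema Text property:

```json
{
  "type": "record",
  "name": "example",
  "namespace": "nifi",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "amount", "type": "double"}
  ]
}
```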
11-16-2017
02:49 PM
3 Kudos
It appears you want to set the destination path to the value of type, followed by the value of id, followed by data.txt, and in the content of that file you want the single-element JSON array containing the object that provided the values. If that is the case: as of NiFi 1.3.0, there is a PartitionRecord processor which will do most of what you want. You can create a JsonReader using the following example schema:

```json
{
  "type": "record",
  "name": "test",
  "namespace": "nifi",
  "fields": [
    {"name": "type", "type": "string"},
    {"name": "id", "type": "string"},
    {"name": "content", "type": "string"}
  ]
}
```

You can also create a JsonRecordSetWriter that inherits the schema (as of NiFi 1.4.0) or uses the same one (prior to NiFi 1.4.0). Then in PartitionRecord you would create two user-defined properties, say record.type and record.id, configured with the corresponding RecordPath expressions. Given your example data, you will get 4 flow files, each containing the data from the 4 groups you mention above. Additionally, you will have record.type and record.id attributes on those flow files. You can route them to UpdateAttribute, where you set filename to data.txt and absolute.path to /${record.type}/${record.id}. Then you can send them to PutHDFS, where you set the Directory to ${absolute.path}.
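A likely configuration for the two user-defined properties in PartitionRecord, assuming the field names from the schema above, maps each property to a RecordPath:

```
record.type  ->  /type
record.id    ->  /id
```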
11-16-2017
02:17 PM
1 Kudo
Thanks very much! I hope to write another series for InvokeScriptedProcessor, ScriptedReportingTask, ScriptedReader, and ScriptedRecordSetWriter someday 🙂
11-14-2017
11:36 PM
You don't need your own sys.path.append calls; you can just put the directories in a comma-separated list in the Module Directory property of ExecuteScript, and it will call sys.path.append for you. However, because it is Jython, if any of the imports (or any of their dependencies) are native CPython modules, then you won't be able to use them in ExecuteScript. All scripts and modules (and their dependencies) must be pure Python. For your exact error, I'd have to see the script (where is "module" defined?), but I suspect that one of these libraries is not pure Python.
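In effect, the Module Directory property does the equivalent of the following (the directory names here are hypothetical):

```python
import sys

# What ExecuteScript's Module Directory property does for you, given
# a comma-separated property value like "/opt/pylibs/core,/opt/pylibs/extra":
module_directory = "/opt/pylibs/core,/opt/pylibs/extra"
for directory in module_directory.split(","):
    sys.path.append(directory.strip())
```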
11-14-2017
11:32 PM
Also, depending on what your stored procedure looks like, you may be able to use ExecuteSQL or PutSQL. However, they do not support setting output parameters, and I'm not sure whether they support input parameters. If your procedure call is hard-coded, then ExecuteSQL should work if it returns a ResultSet, and PutSQL should work if it doesn't. Otherwise the above answer is the best bet.
11-14-2017
02:06 PM
1 Kudo
I'm not familiar with the innards of either Groovy or Jython, but I am guessing that Jython is slower for the following reasons:

1) Groovy was built "for the JVM" and leverages/integrates with Java more cleanly.
2) Jython is an implementation of Python for the JVM. Looking briefly at the code, it appears to go back and forth between the Java and Python idioms, so it is more "emulated" than Groovy.
3) Apache Groovy has a large, very active community that consistently works to improve the performance of the code, both compiled and interpreted.

In my own experience, Groovy and JavaScript (Nashorn) perform much better in the scripted processors than Jython or JRuby. If you choose Jython, there are still a couple of things you can do to improve performance:

- Use InvokeScriptedProcessor (ISP) instead of ExecuteScript. ISP is faster because it loads the script once and then invokes methods on it, whereas ExecuteScript evaluates the script each time. I have an ISP template in Jython which should make porting your ExecuteScript code easier.
- Use ExecuteStreamCommand with command-line Python instead. You won't have the flexibility of accessing attributes, processor state, etc., but if you're just transforming content, you should find ExecuteStreamCommand with Python faster.
- No matter which language you choose, you can often improve performance by using session.get(int) instead of session.get(). That way, if there are a lot of flow files in the queue, you could call session.get(1000) or something, and process up to 1000 flow files per execution. If your script has a lot of overhead, handling multiple flow files per execution can significantly improve performance.
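The session.get(int) batching idea can be sketched in plain Python (the queue and function here are illustrative stand-ins, not the NiFi API):

```python
def on_trigger(queue, batch_size=1000):
    """Pull up to batch_size items per execution, like session.get(1000),
    amortizing the per-execution script overhead across many flow files."""
    batch = queue[:batch_size]
    del queue[:batch_size]
    return batch

# With 2500 queued flow files, three executions drain the queue,
# instead of 2500 single-item executions with session.get().
queue = list(range(2500))
first = on_trigger(queue)   # 1000 items
second = on_trigger(queue)  # 1000 items
third = on_trigger(queue)   # 500 items
```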
11-13-2017
03:23 PM
What do you mean by "add to this JSON a file that I get from FetchFTP"? Is the file you're fetching a JSON file, and you want to add fields to it? Are you Base64-encoding just the JSON from the attributes, or the entire file after adding to it?

If the incoming file (from FTP) is JSON and you can get your attributes added to that flow file, then (as of NiFi 1.2.0 / HDF 3.0) you can use JoltTransformJSON to inject your individual attributes as fields into your JSON document (instead of AttributesToJSON).

If you have too many attributes for that, your options are a bit more limited. In NiFi 1.3.0, you can use UpdateRecord to add the JSON from an attribute into a field in the other JSON document. You can also do this manually with ReplaceText. However, one of the two JSON objects must be in an attribute: whichever of the two (from AttributesToJSON or FetchFTP) is smaller, you can get that object first and use ExtractText to put the whole thing into an attribute. Note that attributes have limited size and introduce more memory usage, so beware of large JSON objects in attributes. If one of them fits in an attribute, though, you can use the UpdateRecord or ReplaceText processor as described.

If you need to just encode one of the JSON objects: if it is in an attribute, you can use UpdateAttribute with the base64Encode Expression Language function; if it is in the content, you can use the Base64EncodeContent processor.
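For the JoltTransformJSON route, a minimal sketch of a spec that injects a single attribute into the document as a field (the attribute name my.attribute and field name injectedField are hypothetical; JoltTransformJSON evaluates Expression Language in the spec):

```json
[
  {
    "operation": "default",
    "spec": {
      "injectedField": "${my.attribute}"
    }
  }
]
```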