Member since: 11-16-2015
Posts: 911
Kudos Received: 668
Solutions: 249

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 706 | 09-30-2025 05:23 AM |
| | 1076 | 06-26-2025 01:21 PM |
| | 932 | 06-19-2025 02:48 PM |
| | 1103 | 05-30-2025 01:53 PM |
| | 12290 | 02-22-2024 12:38 PM |
06-20-2017
07:47 PM
2 Kudos
What version of NiFi/HDF are you using? As of NiFi 1.2.0 / HDF 3.0.0, PutHiveQL can accept multiple statements in one flow file, so if you are currently dealing with one INSERT statement per flow file, try MergeContent to batch them into a single flow file. This should increase performance, but since Hive is an auto-commit database, PutHiveQL is probably not the best choice for large/fast ingest needs; you may be better off putting the data into HDFS and creating/loading a table from it. For PutHiveStreaming, there is a known issue that can reduce performance; it was also fixed in NiFi 1.2.0 / HDF 3.0.0.
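The batching step can be sketched in plain Python as a stand-in for what MergeContent does (assuming a `;` statement delimiter for PutHiveQL; the statements themselves are illustrative):

```python
# Each incoming flow file originally carries one INSERT statement.
statements = [
    "INSERT INTO logs VALUES (1, 'a')",
    "INSERT INTO logs VALUES (2, 'b')",
    "INSERT INTO logs VALUES (3, 'c')",
]

# MergeContent with a ';\n' demarcator would produce one flow file like this,
# which PutHiveQL (NiFi 1.2.0+) can then execute as multiple statements.
merged = ";\n".join(statements) + ";"
```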
06-20-2017
07:34 PM
That sounds like a different issue altogether; you might get more/better responses if you create a new question in HCC for it. The short answer is: your ControllerService implementation can have a method annotated with @OnEnabled (it looks like yours is called onConfigured()). That method can take a ConfigurationContext:

```java
@OnEnabled
public void onConfigured(final ConfigurationContext context) throws InitializationException {
    // Do stuff here
}
```

You can then call getProperty(), getProperties(), etc. on the context.
06-20-2017
07:14 PM
In addition to the approach (sqlplus with ExecuteProcess / ExecuteStreamCommand) mentioned in the other post, you could also connect via a scripting language using ExecuteScript, calling Connection.prepareCall() and such as described here. I have an example on how to interact with SQL using Groovy and ExecuteScript here.
06-20-2017
05:46 PM
Usually (as is the case for the nifi-hadoop-bundle), the NAR depends on nifi-hadoop-libraries-nar, which provides the Hadoop libraries (such as the provided dependencies in your processor POM, like hadoop-common) and whose parent NAR is nifi-standard-services-api-nar (which you have in your NAR POM). Currently, a NAR can have only one parent, so you wouldn't be able to depend on both the hadoop-libraries and standard-services-api NARs at the same time. Since the former depends on the latter, this still works for your processor. Try replacing the NAR POM dependency on nifi-standard-services-api-nar with nifi-hadoop-libraries-nar; this should provide all the classes/JARs/dependencies you need.
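A sketch of what that swap looks like in the NAR module's pom.xml (the version shown is illustrative; use the one matching your NiFi build):

```xml
<!-- Replace the nifi-standard-services-api-nar dependency with this one.
     nifi-hadoop-libraries-nar already has nifi-standard-services-api-nar
     as its parent, so those classes remain available. -->
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-hadoop-libraries-nar</artifactId>
    <version>1.2.0</version>
    <type>nar</type>
</dependency>
```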
06-20-2017
05:37 PM
1 Kudo
I think you're running into this issue; perhaps try an explicit module unload at the end of your script? That will probably have a performance impact, but if it works, we can file an improvement Jira to look at adding this to the Jython support in NiFi, so modules are unloaded when the Module Directory property (or the files it points to) changes.
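A minimal sketch of the explicit-unload workaround, assuming your script's helpers live in sys.modules under known names (the module name "mymodule" here is hypothetical):

```python
import sys

def unload_modules(prefixes):
    """Remove cached modules whose names match any of the given prefixes,
    so the next import re-reads them from the Module Directory."""
    stale = [name for name in sys.modules
             if any(name == p or name.startswith(p + ".") for p in prefixes)]
    for name in stale:
        del sys.modules[name]
    return stale

# At the end of the script: force 'mymodule' (a hypothetical helper from the
# Module Directory) to be re-imported on the next run.
unload_modules(["mymodule"])
```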
06-20-2017
05:30 PM
I use bytearray() in my examples, but I haven't been able to figure out exactly when it's needed and when it isn't. I suspect it's when the type is 'unicode' or 'java.lang.String' instead of Jython's 'str' type. The following two lines worked for me:

```python
insertquery = "insert into Tweets_test values ('" + str(obj['id']) + "','" + obj['text'] + "','" + str(obj['id_str']) + "');"
outputStream.write(insertquery)
```

This page says that a Jython String will be coerced to byte[] when necessary, and that seems to be what's going on above.
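For cases where the value is a unicode string, an explicit conversion avoids relying on implicit coercion. A plain-Python sketch (the obj record and table name mirror the snippet above and are illustrative):

```python
# Illustrative tweet-like record; field names mirror the snippet above.
obj = {'id': 123, 'text': u"hello world", 'id_str': u"123"}

insertquery = ("insert into Tweets_test values ('" + str(obj['id']) + "','" +
               obj['text'] + "','" + str(obj['id_str']) + "');")

# Explicitly encode to bytes before writing, rather than relying on
# Jython's implicit String -> byte[] coercion.
payload = bytearray(insertquery.encode('utf-8'))
```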
06-19-2017
07:57 PM
1 Kudo
The following configuration of ReplaceText should work for you, with the Search Value set to:

```
[\[\]](\{|\})
```

This matches `[{` or `]}` (and also `[}` and `]{`, which shouldn't show up if your input is valid JSON) and replaces it with whichever curly brace it found. Note that this is a fairly specific solution, where the array is the last element of an object (i.e. the end pattern is array-end followed by object-end, not the reverse). A more forgiving solution (for your input JSON) might be to use the following Chain spec in a JoltTransformJSON processor:

```json
[
  {
    "operation": "shift",
    "spec": {
      "result": {
        "curves": {
          "*": {
            "@": "result.curves"
          }
        },
        "*": "result.&"
      },
      "*": "&"
    }
  }
]
```

This "hoists" the object in the 1-element array up one level, achieving the same result as the ReplaceText pattern above.
06-19-2017
06:24 PM
You can annotate a method (taking zero or one arguments, the one being a ProcessContext) with @OnStopped; it will be called when the processor is stopped. See the Component Lifecycle section of the Developer Guide for more details.
06-19-2017
01:03 AM
Is SMPPServerSimulator.run() an asynchronous method, meaning it returns immediately? If so, the framework may have already committed your session (after returning from the onTrigger() call in AbstractProcessor), in which case the transfer() either didn't work (but it was too late to throw a runtime exception up to the framework) or it did work but the session now needs another call to commit(). Processors that use a separate thread/lifecycle to manage I/O are tricky to integrate into the NiFi architecture; it takes great care to ensure operations are performed in an order consistent with the behavior of both the separate entity and the NiFi framework. If the run() method is synchronous, then something else is going on, but it still seems related to the session not being committed. Can you attach a debugger and see if session.commit() gets called after your session.transfer()?
06-15-2017
10:41 PM
What version of NiFi/HDF are you using? As of NiFi 1.2.0 / HDF 3.0.0, PutHiveQL supports multiple statements (via NIFI-3031), and there is also an EnforceOrder processor (via NIFI-3414), which could be configured with the fragment.index attribute as the Order Attribute and ${fragment.identifier} as the Group Identifier. Prior to NiFi 1.2.0, you can try adding an UpdateAttribute processor between SplitContent and PutHiveQL, setting a "priority" attribute to ${fragment.index}, then using a PriorityAttributePrioritizer on the SplitContent -> UpdateAttribute -> PutHiveQL connections. I'm not sure this works as-is, because the documentation suggests the priority comparison is lexicographical rather than numeric. If so, you'd need some Expression Language functions or an ExecuteScript processor to left-pad the fragment.index values with zeros so they are all the same length.
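The left-padding step can be sketched as follows (a plain-Python illustration of what an ExecuteScript body or EL expression would compute; the width of 10 is an arbitrary assumption):

```python
def pad_fragment_index(index, width=10):
    """Zero-pad a fragment.index value so lexicographic order matches numeric order."""
    return str(index).zfill(width)

# Without padding, lexicographic order would put "10" before "2";
# with padding, sorting the strings matches the numeric order.
padded = [pad_fragment_index(i) for i in (2, 10, 1)]
ordered = sorted(padded)
```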