Member since: 11-16-2015
Posts: 902
Kudos Received: 664
Solutions: 249
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 114 | 09-30-2025 05:23 AM |
| 578 | 06-26-2025 01:21 PM |
| 423 | 06-19-2025 02:48 PM |
| 672 | 05-30-2025 01:53 PM |
| 9632 | 02-22-2024 12:38 PM |
09-29-2016
01:51 PM
1 Kudo
session.create() creates a new flow file; it won't use an incoming one. For that you will want session.get(), which returns a flow file (or None). If you require an input flow file, be sure to check for None and only continue processing if a flow file was actually returned. There is an example of this on my blog (same site as above but a different post).
09-26-2016
06:19 PM
1 Kudo
You could use ListFile in a separate flow; it keeps track of the files it has listed so far, so if your ExecuteStreamCommand generates more files in the specified location(s), only the new files will be listed the next time ListFile runs. ListFile can then be routed to FetchFile to get the contents of the new files, etc.
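The state-tracking behavior ListFile provides can be sketched in plain Python. The `list_new_files` helper and the `seen` set are illustrative stand-ins for ListFile's persisted state, not NiFi code:

```python
import os

def list_new_files(directory, seen):
    """Return files in `directory` that have not been listed before.

    Mimics the idea behind NiFi's ListFile: `seen` plays the role of the
    processor's persisted state, so repeated runs only emit new files.
    Hypothetical helper for illustration only.
    """
    current = set(os.listdir(directory))
    new = sorted(current - seen)
    seen.update(new)
    return new
```

Calling it twice with the same `seen` set returns each file only once, which is why only newly generated files reach FetchFile.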
09-22-2016
08:31 PM
3 Kudos
The Hive JDBC driver is included with the Hive processors. It appears your connection URL has "!connect" at the front when it should instead start with the "jdbc:hive2" prefix; removing "!connect" should fix the issue.
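For example (host, port, and database below are placeholders, not values from your setup):

```
!connect jdbc:hive2://myhost:10000/default   <- Beeline command form, not a JDBC URL
jdbc:hive2://myhost:10000/default            <- what the connection URL property expects
```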
09-22-2016
07:25 PM
1 Kudo
I'm working on a processor to do this kind of thing: https://issues.apache.org/jira/browse/NIFI-2735
09-22-2016
05:29 PM
2 Kudos
The RouteOnAttribute processor is what you're looking for. You can match on an attribute value (for example) and route only to a "matched" relationship; auto-terminating the "unmatched" relationship causes those FlowFiles to be discarded/ignored. Alternatively, if you want to do any error handling, you could route unmatched flow files to another processor to log or otherwise handle them.
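As a sketch, a dynamic property on RouteOnAttribute pairs a relationship name with an Expression Language condition (the property name "matched" and the filename check here are just illustrative):

```
matched  =>  ${filename:endsWith('.csv')}
```

FlowFiles for which the expression is true go to the "matched" relationship; the rest go to "unmatched".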
09-22-2016
02:55 PM
3 Kudos
There are a couple of options:

1) If you want one SQL query per parameter, you can use ListFile/FetchFile (or GetFile if you want to repeatedly get the config file) to retrieve the configuration file, then SplitText to split it into lines (so one parameter per flow file), then ExtractText to get the name and value of the parameter, then ReplaceText to build a SQL query using Expression Language and the name of the parameter (which will fill in the value), such as the example statement you have above.

2) If you want to build a single statement with possibly multiple parameters, you could use ExecuteScript (if you are comfortable writing code in Groovy, Jython, JRuby, JavaScript, or Lua) to read in the configuration file, split the lines to build a map of parameter names to values, then write out a SQL statement with the names and/or values as you have done above.
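The core of option 2 (parse the config lines into a map, then emit one statement) can be sketched in plain Python, leaving out the NiFi session plumbing. The name=value line format, the table name, and the WHERE-clause shape are assumptions for illustration:

```python
def build_sql(config_text, table="my_table"):
    """Parse name=value config lines and build a single SQL statement.

    Sketch of option 2's core logic only; in ExecuteScript you would read
    `config_text` from the flow file content and write the result back out.
    """
    params = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blanks and malformed lines
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    where = " AND ".join(
        "%s = '%s'" % (name, params[name]) for name in sorted(params)
    )
    return "SELECT * FROM %s WHERE %s" % (table, where)
```

(A real implementation should bind the values as parameters rather than interpolating them, to avoid SQL injection.)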
09-19-2016
09:55 PM
1 Kudo
This is a known issue with the version of Hive (1.2.1) currently packaged with NiFi: https://issues.apache.org/jira/browse/NIFI-2575, caused by https://issues.apache.org/jira/browse/HIVE-11581. The workaround is to not use ZooKeeper for service discovery.
09-09-2016
07:21 PM
The PutElasticsearch processor uses the Transport API for Elasticsearch, not the HTTP API. This means your port should be 9300, not 9200. The "Identifier Attribute" property is the name of a flow file attribute that contains a unique identifier for the document. If you don't have an identifier you want to use, you can put "uuid"; this will use the flow file's UUID as the identifier for the Elasticsearch document. If you do have an identifier for the document, put its value (using UpdateAttribute, EvaluateJsonPath, etc.) into a flow file attribute, and put that attribute's name in the "Identifier Attribute" property. Note that you don't use Expression Language here, so if your attribute's name is "doc_id", you put "doc_id" in the Identifier Attribute property, not "${doc_id}".
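Putting that together, a hypothetical configuration might look like this (host and attribute name are placeholders):

```
ElasticSearch Hosts:   es-host:9300     <- Transport API port, not 9200
Identifier Attribute:  doc_id           <- literal attribute name, no ${...}
```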
09-06-2016
12:47 AM
2 Kudos
Yes, you can use ListFile -> FetchFile. ListFile keeps track of which files it has seen, so it will only list files it has not listed before, and FetchFile gets the contents of the files passed in. Together they work like GetFile, but with state so files are not fetched more than once.
09-01-2016
09:50 PM
Here's the example for "prefix soup" (a kind of flattening):

Input:

{
  "Rating": 1,
  "SecondaryRatings": {
    "Design": 4,
    "Price": 2,
    "RatingDimension3": 1
  }
}

Spec:

[
  {
    "operation": "shift",
    "spec": {
      "Rating": "rating-primary",
      // Turn all the SecondaryRatings into prefixed data
      // like "rating-Design": 4
      "SecondaryRatings": {
        // the "&" in "rating-&" means go up the tree 0 levels,
        // grab what is there and substitute it in
        "*": "rating-&"
      }
    }
  }
]
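The effect of that shift spec can be sketched in plain Python (this is an illustration of what the spec produces for this input, not the Jolt engine itself):

```python
def prefix_soup(doc, prefix="rating-"):
    """Flatten the example document the way the Jolt shift spec does:
    the top-level "Rating" becomes "rating-primary", and every key under
    "SecondaryRatings" is hoisted to the top level with a "rating-" prefix.
    """
    out = {prefix + "primary": doc["Rating"]}
    for name, value in doc["SecondaryRatings"].items():
        out[prefix + name] = value
    return out
```

For the input above this yields {"rating-primary": 1, "rating-Design": 4, "rating-Price": 2, "rating-RatingDimension3": 1}.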