Member since 11-16-2015
905 Posts
665 Kudos Received
249 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 425 | 09-30-2025 05:23 AM |
| | 754 | 06-26-2025 01:21 PM |
| | 646 | 06-19-2025 02:48 PM |
| | 841 | 05-30-2025 01:53 PM |
| | 11355 | 02-22-2024 12:38 PM |
09-09-2016 07:21 PM
The PutElasticsearch processor uses the Transport API for Elasticsearch, not the HTTP API, so your port should be 9300, not 9200. The "Identifier Attribute" property is the name of a flow file attribute that contains a unique identifier for the document. If you don't have an identifier you want to use, you can put "uuid"; this will use the flow file's UUID as the identifier for the Elasticsearch document. If you do have an identifier for the document, put its value (using UpdateAttribute, EvaluateJsonPath, etc.) into a flow file attribute, and put that attribute's name in the "Identifier Attribute" property. Note that you don't use Expression Language here, so if your attribute's name is "doc_id", you put "doc_id" in the Identifier Attribute property, not "${doc_id}".
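A minimal sketch of the two processors' settings (the attribute name "doc_id", the JsonPath, and the host/index values are hypothetical):

EvaluateJsonPath:
  Destination: flowfile-attribute
  doc_id: $.id    (dynamic property mapping the attribute name to a JsonPath)

PutElasticsearch:
  ElasticSearch Hosts: es-host:9300
  Identifier Attribute: doc_id
  Index: myindex
  Type: mydoc

Here EvaluateJsonPath pulls the document's "id" field into the "doc_id" attribute, and PutElasticsearch uses that attribute's value as the Elasticsearch document ID.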
09-06-2016 12:47 AM
2 Kudos
Yes, you can use ListFile -> FetchFile. ListFile keeps track of which files it has already listed, so it only emits files it has not seen before. FetchFile then fetches the contents of each file passed to it. Together they work like GetFile, except that ListFile maintains state about the files it has already read (see the sketch below).
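A minimal sketch of the key properties (the directory is hypothetical; the File to Fetch expression shown is FetchFile's default):

ListFile:
  Input Directory: /data/incoming

FetchFile:
  File to Fetch: ${absolute.path}/${filename}

ListFile emits one content-less flow file per newly seen file, carrying attributes such as filename and absolute.path; FetchFile reads that path and writes the file's bytes into the flow file content.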
09-01-2016 09:50 PM
Here's the example for "prefix soup" (a kind of flattening):

Input:

{
  "Rating": 1,
  "SecondaryRatings": {
    "Design": 4,
    "Price": 2,
    "RatingDimension3": 1
  }
}

Spec:

[
  {
    "operation": "shift",
    "spec": {
      "Rating": "rating-primary",
      // Turn all the SecondaryRatings into prefixed data
      // like "rating-Design": 4
      "SecondaryRatings": {
        // the "&" in "rating-&" means go up the tree 0 levels,
        // grab what is there, and substitute it in
        "*": "rating-&"
      }
    }
  }
]
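Running that spec over the input should produce output like:

{
  "rating-primary": 1,
  "rating-Design": 4,
  "rating-Price": 2,
  "rating-RatingDimension3": 1
}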
09-01-2016 06:23 PM
1 Kudo
If you click on any of the examples, it should fill in the Input and Spec boxes.
09-01-2016 06:21 PM
1 Kudo
After your FetchSFTP, the bar-delimited content will be in the content of the flow file, not in the attributes. You then have an AttributesToJSON processor, which will overwrite the flow file content with a JSON document containing attributes such as sftp.remote.host, sftp.remote.port, etc. (see the documentation for AttributesToJSON). I think you may want a SplitText processor after your FetchSFTP processor instead, to create one flow file for each line in your file. Then you could have an ExtractText processor that uses a regular expression (with capture groups) to create attributes such as column.1, column.2, etc., which your ReplaceText can then use (see the sketch below).
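A rough sketch, assuming each line has three bar-delimited fields (the property name "column" and the regex are hypothetical): in ExtractText, add a dynamic property named "column" with a value such as

^([^|]*)\|([^|]*)\|([^|]*)$

Each capture group becomes an attribute (column.1, column.2, column.3), which ReplaceText can reference in its Replacement Value, e.g. ${column.1},${column.2},${column.3}.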
09-01-2016 06:15 PM
1 Kudo
That class is part of the write-ahead log implementation in nifi-commons; I believe there was a recent update to its structure. Now that Apache NiFi 1.0.0 has been released, you should use that instead of the 1.0.0-BETA (which I see in your logs is the version you're running).
09-01-2016 02:06 PM
4 Kudos
The output from one processor to another is a flow file, which consists of a map of attributes (key/value pairs) and a payload of bytes as the flow file content. The content could be raw binary data (an image, for example) or text in any format (JSON, XML, and CSV, to name a few).

The content and attributes of a flow file are manipulated by processors in different ways; the documentation for each processor describes which attributes it reads and/or writes, as well as what operations it may perform on the flow file. For example, the UpdateAttribute processor allows you to add attributes to (or delete them from) incoming flow files. Another example is SplitJson, which expects incoming flow files to have a JSON object as the flow file content; you configure the processor with a JsonPath expression pointing at an array within the object. The processor then splits the original JSON object and sends a flow file for each element of the array to the "split" relationship. It also sends the original incoming flow file to the "original" relationship. You can add connections between processors for the relationship(s) defined by the source processor; a concrete sketch follows below.

I encourage you to read the Overview and Getting Started guides for more information on the concepts of NiFi (flow files, processors, connections, etc.). If you are looking for working examples, there is a set of templates available on the NiFi Wiki.
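To make the SplitJson case concrete (the content and JsonPath are hypothetical): given flow file content

{
  "orders": [
    { "id": 1 },
    { "id": 2 }
  ]
}

and a JsonPath Expression of $.orders, SplitJson routes two flow files to "split" (with content { "id": 1 } and { "id": 2 }, respectively) and the untouched incoming flow file to "original".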
08-31-2016 12:42 PM
The FetchElasticsearch processor uses the native transport, whose default port is 9300. In NiFi 0.7.0 (and in the upcoming HDF 2.0), there are FetchElasticsearchHttp and PutElasticsearchHttp processors, which use the REST API (whose default port is 9200).
08-27-2016 02:40 PM
1 Kudo
Although the script engine reports its name as "python", it is actually Jython, which can only use pure Python modules, not native modules like numpy/scipy. If you need those, consider ExecuteProcess or (if you have incoming flow files) ExecuteStreamCommand, both of which can execute the command-line Python interpreter.
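For example (a minimal sketch; the script path and the numeric-content assumption are hypothetical), ExecuteStreamCommand pipes the flow file content to the process's stdin and replaces the content with the process's stdout, so a script like this could use numpy:

#!/usr/bin/env python
# transform.py -- run via ExecuteStreamCommand, e.g. with
#   Command Path: /usr/bin/python
#   Command Arguments: /path/to/transform.py
import sys
import numpy as np  # native module; works here because this is CPython, not Jython

data = np.loadtxt(sys.stdin)         # parse numeric flow file content from stdin
sys.stdout.write(str(data.mean()))   # stdout becomes the new flow file content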
08-25-2016 06:44 PM
2 Kudos
To add to Scott's answer, you can use QueryDatabaseTable (for a one-time export, or for incremental fetches if you choose a "maximum value column" such as the primary key column) into a ConvertAvroToORC processor (available in the 1.0 GA release), then a PutHDFS processor to get the data into Hadoop. If the table has not been created, ConvertAvroToORC generates partial Hive DDL in an attribute (CREATE TABLE IF NOT EXISTS ...), so after PutHDFS you could have a ReplaceText processor to put the DDL statement (along with the file's HDFS location) into the flow file content, then send that to a PutHiveQL processor, which would execute the DDL statement, creating the table atop the directory containing your file(s) in HDFS. That might sound a bit complicated, but it is flexible and powerful. I will post a template to the NiFi wiki after 1.0 is released, showing how such a flow would work.
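A rough sketch of the ReplaceText step, assuming the attribute names your versions emit match the docs (ConvertAvroToORC putting the partial DDL in "hive.ddl" and PutHDFS recording the target directory in "absolute.hdfs.path"): set Replacement Strategy to "Always Replace" and the Replacement Value to something like

${hive.ddl} LOCATION '${absolute.hdfs.path}'

so the flow file content becomes a complete CREATE TABLE statement for PutHiveQL to execute.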