Member since: 11-16-2015
Posts: 905
Kudos Received: 665
Solutions: 249

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 423 | 09-30-2025 05:23 AM |
|  | 744 | 06-26-2025 01:21 PM |
|  | 637 | 06-19-2025 02:48 PM |
|  | 841 | 05-30-2025 01:53 PM |
|  | 11348 | 02-22-2024 12:38 PM |
12-22-2017
03:30 PM
1 Kudo
Rather than GetFile, you could use ListenHTTP -> FetchFile -> PutHDFS. You would POST to the configured endpoint with a header containing the filename, and in ListenHTTP set that header name as the value of the "HTTP Headers to receive as Attributes (Regex)" property. FetchFile would then be configured to use the resulting attribute to fetch the file and pass it to PutHDFS.
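As a rough sketch of the relevant settings (the port, base path, and header name here are assumptions for the example, not requirements):

```
ListenHTTP
  Listening Port                                  8081
  Base Path                                       contentListener
  HTTP Headers to receive as Attributes (Regex)   filename

FetchFile
  File to Fetch                                   ${filename}
```

A client could then POST with something like curl -X POST -H "filename: /data/incoming/file1.csv" http://nifi-host:8081/contentListener, and the matched header arrives on the flow file as a "filename" attribute for FetchFile to use.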
12-14-2017
11:04 PM
1 Kudo
This is a known "feature" of the DB2 driver: it closes the result set after the last row is retrieved, but our code doesn't know whether the last row has been retrieved yet, which is why we call ResultSet.next() to see if there is anything left. According to this, you should be able to add "allowNextOnExhaustedResultSet=1" either to your JDBC URL or as a user-defined property on your DBCPConnectionPool (via NIFI-3426, assuming you have NiFi 1.2.0 or later); then ResultSet.next() should not throw an exception and should instead return normally.
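For illustration, the JDBC URL form might look like this (host, port, and database name are placeholders; DB2 URL properties follow the database name, and each must end with a semicolon):

```
jdbc:db2://db2-host:50000/MYDB:allowNextOnExhaustedResultSet=1;
```

Equivalently, on the DBCPConnectionPool you would add a user-defined property named allowNextOnExhaustedResultSet with a value of 1.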
12-13-2017
03:05 PM
1 Kudo
As of NiFi 1.2.0 (via NIFI-3658), you can use ConvertRecord to achieve this. You would configure a CSVReader to match your CSV format (delimiter, header line, quoting, etc.) and a JsonRecordSetWriter to output JSON. Prior to NiFi 1.2.0, you could use ConvertCSVToAvro -> ConvertAvroToJSON.
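As a small illustration (assuming the CSVReader treats the first line as a header and the schema declares "id" as an int), an input of:

```
id,name,joined
1,Alice,2017-12-01
2,Bob,2017-12-05
```

would come out of the JsonRecordSetWriter as:

```
[ {"id": 1, "name": "Alice", "joined": "2017-12-01"},
  {"id": 2, "name": "Bob", "joined": "2017-12-05"} ]
```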
12-13-2017
01:00 PM
Can you share an example or two of the incoming JSON data, your configuration for EvaluateJsonPath, and an example of the flow file after MergeContent (perhaps with the number of entries set much lower so it fits here)?
12-09-2017
06:21 PM
This is great, thanks! Also, before too long there should be a NiFi processor and controller service to help with some of the session management (NIFI-4683).
12-09-2017
02:28 AM
You could consider a non-materialized view that does all the joins. Also, there is a Jira case (NIFI-1706) and a GitHub pull request to allow custom queries with QueryDatabaseTable.
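As a sketch of the view approach (the table and column names are hypothetical), something like this gives QueryDatabaseTable a single "table" to poll:

```sql
-- A plain (non-materialized) view that performs the joins up front;
-- QueryDatabaseTable can then poll it like any single table.
CREATE VIEW order_details_v AS
SELECT o.order_id,
       o.updated_at,      -- candidate for the Maximum-value Columns property
       c.customer_name,
       o.total_amount
FROM   orders o
JOIN   customers c ON c.customer_id = o.customer_id;
```

You would then point QueryDatabaseTable's Table Name property at order_details_v and set Maximum-value Columns to updated_at (or whatever column increases with each change).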
12-09-2017
02:21 AM
1 Kudo
What version of NiFi are you using? Check NIFI-3000 for a history of what has and hasn't been done. Depending on your version, you will likely want to switch to record-aware processors such as ConvertRecord, as they support logical types as of NIFI-2624, whereas some other processors may not. You may also be able to leverage PartitionRecord to group records with the same values, or QueryRecord (with LIMIT 1, perhaps) to help with duplicate detection/elimination.
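As a sketch of the QueryRecord approach (the relationship name "deduped" is just an example), you add a user-defined property whose name becomes an outbound relationship and whose value is the SQL to run against the flow file's records, which QueryRecord exposes as a table named FLOWFILE:

```
deduped    SELECT DISTINCT * FROM FLOWFILE
```

A LIMIT 1 variant (SELECT * FROM FLOWFILE LIMIT 1) would instead keep only the first record from each flow file.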
12-06-2017
06:05 PM
It is highly recommended that you don't place JARs in NiFi's lib/ folder; as you can see, it can affect the behavior of the whole instance. Instead, you can put the Hive driver JAR(s) into some other folder and add the path to that folder to the Database Driver Location(s) property of DBCPConnectionPool. There is also a HiveConnectionPool controller service, which is similar to DBCPConnectionPool but comes packaged with a Hive 1.2 driver. Depending on your Hive version, you may be able to use that built-in driver via HiveConnectionPool. If your Hive server is from HDP 2.4+, you will want to use the HDF version of NiFi, as that packages the HDP Hive driver rather than the Apache Hive driver.
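For illustration, a DBCPConnectionPool pointed at an external Hive driver might be configured like this (the folder path, host, and port are assumptions for the example; the class name shown is the standard Apache Hive JDBC driver class):

```
Database Connection URL        jdbc:hive2://hive-host:10000/default
Database Driver Class Name     org.apache.hive.jdbc.HiveDriver
Database Driver Location(s)    /opt/nifi/drivers/hive
```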
12-02-2017
03:50 PM
1 Kudo
What version of Hortonworks Data Flow (HDF) NiFi are you using? As of Apache NiFi 1.1.0 (I'm not sure offhand which HDF NiFi release that corresponds to, possibly HDF 3.0), the Elasticsearch 5 processors are available. Having said that, the ElasticsearchHttp processors (e.g., PutElasticsearchHttp in your screenshot above) use the REST API rather than the Transport Client, so they are much more flexible in terms of which Elasticsearch versions they support. I believe the ElasticsearchHttp processors support at least 2.x through 5.0.1 (and they may work across all of 5.x). They are not quite as performant as the Transport Client versions (PutElasticsearch and PutElasticsearch5), but for your version of NiFi, PutElasticsearchHttp should allow you to connect to your ES 5 cluster.
12-02-2017
03:42 PM
In this case, you have two different schemas, one for the input and one for the output. The one you list above is the output schema, but you will also need an input schema such as:

    {
      "type": "record",
      "name": "employeeInfo",
      "fields": [
        { "name": "ID", "type": "long", "default": 0 },
        { "name": "Name", "type": "string", "default": "defaultName" },
        { "name": "Age", "type": ["null", "long"], "default": null },
        { "name": "JoinedOn", "type": "string", "default": "defaultDate" }
      ]
    }

(Note: for a union type like ["null", "long"], Avro requires the default to match the first branch of the union, hence null rather than 0.)

Because your input and output schemas are not the same, ConvertRecord is not the right choice for this operation; the doc says "The Reader and Writer must be configured with "matching" schemas. By this, we mean the schemas must have the same field names". I believe you will need two UpdateRecords in a row:

1) Move the employee info from the top level into the array, using a Replacement Value Strategy of "Record Path Value"; for example, for the ID field you would add a user-defined property with the name "/columns/ID" and the value "/ID".

2) I'm not sure where you are getting your values for timestamp, database, and table_name, but you can probably set those with another UpdateRecord, this time with a Replacement Value Strategy of "Literal Value", possibly using Expression Language to pull values from flow file attributes ("/table_name" = "${table.name}", for example) or from functions ("/timestamp" = "${now():toNumber()}", for example); see the sketch below.
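As a sketch of the two UpdateRecord configurations (the field names come from the schemas above; the attribute names in the second processor are examples, not requirements):

```
UpdateRecord #1 (Replacement Value Strategy: Record Path Value)
  /columns/ID          /ID
  /columns/Name        /Name
  /columns/Age         /Age
  /columns/JoinedOn    /JoinedOn

UpdateRecord #2 (Replacement Value Strategy: Literal Value)
  /table_name          ${table.name}
  /timestamp           ${now():toNumber()}
  /database            ${database.name}
```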