Member since: 11-16-2015
Posts: 905
Kudos Received: 665
Solutions: 249

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 423 | 09-30-2025 05:23 AM |
|  | 744 | 06-26-2025 01:21 PM |
|  | 637 | 06-19-2025 02:48 PM |
|  | 841 | 05-30-2025 01:53 PM |
|  | 11348 | 02-22-2024 12:38 PM |
12-22-2017
03:30 PM
1 Kudo
Rather than GetFile, you could use ListenHTTP -> FetchFile -> PutHDFS. You would POST to the configured endpoint with a header containing the filename, and in ListenHTTP set that header name as the value of the "HTTP Headers to receive as Attributes (Regex)" property. FetchFile would then be configured to use the resulting attribute to fetch the file and pass it to PutHDFS.
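As a rough sketch of the relevant settings (the port, base path, and header name here are assumptions for the example, not requirements):

```
ListenHTTP
  Listening Port                                  8081
  Base Path                                       contentListener
  HTTP Headers to receive as Attributes (Regex)   filename

FetchFile
  File to Fetch                                   ${filename}
```

A client could then POST with something like curl -X POST -H "filename: /data/incoming/file1.csv" http://nifi-host:8081/contentListener, and the matched header arrives on the flow file as a "filename" attribute for FetchFile to use.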
12-14-2017
11:04 PM
1 Kudo
This is a known "feature" of the DB2 driver: it closes the result set after the last row is retrieved, but our code doesn't know whether the last row has been retrieved yet, which is why we call ResultSet.next() to see if there is anything left. According to this, you should be able to add "allowNextOnExhaustedResultSet=1" either to your JDBC URL or as a user-defined property on your DBCPConnectionPool (via NIFI-3426, assuming you have NiFi 1.2.0 or later); then ResultSet.next() should not throw an exception and should instead return normally.
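For illustration, the JDBC URL form might look like this (host, port, and database name are placeholders; DB2 URL properties follow the database name, and each must end with a semicolon):

```
jdbc:db2://db2-host:50000/MYDB:allowNextOnExhaustedResultSet=1;
```

Equivalently, on the DBCPConnectionPool you would add a user-defined property named allowNextOnExhaustedResultSet with a value of 1.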
12-13-2017
03:05 PM
1 Kudo
As of NiFi 1.2.0 (via NIFI-3658), you can use ConvertRecord to achieve this. You would configure a CSVReader to match your CSV format (delimiter, header line, quoting, etc.) and a JsonRecordSetWriter to output JSON. Prior to NiFi 1.2.0, you could use ConvertCSVToAvro -> ConvertAvroToJSON.
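As a small illustration (assuming the CSVReader treats the first line as a header and the schema declares "id" as an int), an input of:

```
id,name,joined
1,Alice,2017-12-01
2,Bob,2017-12-05
```

would come out of the JsonRecordSetWriter as:

```
[ {"id": 1, "name": "Alice", "joined": "2017-12-01"},
  {"id": 2, "name": "Bob", "joined": "2017-12-05"} ]
```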
12-13-2017
01:00 PM
Can you share an example or two of the incoming JSON data, your configuration for EvaluateJsonPath, and an example of the flow file after MergeContent (perhaps with the number of entries set much lower so it fits here)?
12-09-2017
06:21 PM
This is great, thanks! Also, before too long there should be a NiFi processor and controller service to help with some of the session management (NIFI-4683).
12-09-2017
02:28 AM
You could consider a non-materialized view that does all the joins. Also, there is a Jira case (NIFI-1706) and a GitHub pull request to allow custom queries with QueryDatabaseTable.
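As a sketch of the view approach (the table and column names are hypothetical), something like this gives QueryDatabaseTable a single "table" to poll:

```sql
-- A plain (non-materialized) view that performs the joins up front;
-- QueryDatabaseTable can then poll it like any single table.
CREATE VIEW order_details_v AS
SELECT o.order_id,
       o.updated_at,      -- candidate for the Maximum-value Columns property
       c.customer_name,
       o.total_amount
FROM   orders o
JOIN   customers c ON c.customer_id = o.customer_id;
```

You would then point QueryDatabaseTable's Table Name property at order_details_v and set Maximum-value Columns to updated_at (or whatever column increases with each change).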
12-09-2017
02:21 AM
1 Kudo
What version of NiFi are you using? Check NIFI-3000 for a history of what has and hasn't been done. Depending on your version, you will likely want to switch to record-aware processors such as ConvertRecord, as they support logical types as of NIFI-2624, whereas some other processors may not. You may also be able to leverage PartitionRecord to group records with the same values, or QueryRecord (with LIMIT 1, perhaps) to help with duplicate detection/elimination.
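As a sketch of the QueryRecord approach (the relationship name "deduped" is just an example), you add a user-defined property whose name becomes an outbound relationship and whose value is the SQL to run against the flow file's records, which QueryRecord exposes as a table named FLOWFILE:

```
deduped    SELECT DISTINCT * FROM FLOWFILE
```

A LIMIT 1 variant (SELECT * FROM FLOWFILE LIMIT 1) would instead keep only the first record from each flow file.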
12-06-2017
06:05 PM
It is highly recommended that you don't place JARs in NiFi's lib/ folder; as you can see, it can affect the behavior of the whole instance. Instead, you can put the Hive driver JAR(s) into some other folder and add the path to that folder to the Database Driver Location(s) property of DBCPConnectionPool. There is also a HiveConnectionPool controller service, which is similar to DBCPConnectionPool but comes packaged with a Hive 1.2 driver. Depending on your Hive version, you may be able to use that built-in driver via HiveConnectionPool. If your Hive server is from HDP 2.4+, you will want to use the HDF version of NiFi, as that packages the HDP Hive driver rather than the Apache Hive driver.
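For illustration, a DBCPConnectionPool pointed at an external Hive driver might be configured like this (the folder path, host, and port are assumptions for the example; the class name shown is the standard Apache Hive JDBC driver class):

```
Database Connection URL        jdbc:hive2://hive-host:10000/default
Database Driver Class Name     org.apache.hive.jdbc.HiveDriver
Database Driver Location(s)    /opt/nifi/drivers/hive
```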
12-02-2017
03:50 PM
1 Kudo
What version of Hortonworks Data Flow (HDF) NiFi are you using? As of Apache NiFi 1.1.0 (I'm not sure offhand which HDF NiFi release that corresponds to, possibly HDF 3.0), the Elasticsearch 5 processors are available. Having said that, the ElasticsearchHttp processors (e.g., PutElasticsearchHttp in your screenshot above) use the REST API rather than the Transport Client, so they are much more flexible in terms of which Elasticsearch versions they support. I believe the ElasticsearchHttp processors support at least 2.x through 5.0.1 (and they may work across all of 5.x). They are not quite as performant as the Transport Client versions (PutElasticsearch and PutElasticsearch5), but for your version of NiFi, PutElasticsearchHttp should allow you to connect to your ES 5 cluster.
12-02-2017
03:42 PM
In this case, you have two different schemas, one for the input and one for the output. The one you list above is the output schema, but you will also need an input schema such as:

    {
      "type": "record",
      "name": "employeeInfo",
      "fields": [
        { "name": "ID", "type": "long", "default": 0 },
        { "name": "Name", "type": "string", "default": "defaultName" },
        { "name": "Age", "type": ["null", "long"], "default": null },
        { "name": "JoinedOn", "type": "string", "default": "defaultDate" }
      ]
    }

(Note: for a union type like ["null", "long"], Avro requires the default to match the first branch of the union, hence null rather than 0.)

Because your input and output schemas are not the same, ConvertRecord is not the right choice for this operation; the doc says "The Reader and Writer must be configured with "matching" schemas. By this, we mean the schemas must have the same field names". I believe you will need two UpdateRecords in a row:

1) Move the employee info from the top level into the array, using a Replacement Value Strategy of "Record Path Value"; for example, for the ID field you would add a user-defined property with the name "/columns/ID" and the value "/ID".

2) I'm not sure where you are getting your values for timestamp, database, and table_name, but you can probably set those with another UpdateRecord, this time with a Replacement Value Strategy of "Literal Value", possibly using Expression Language to pull values from flow file attributes ("/table_name" = "${table.name}", for example) or from functions ("/timestamp" = "${now():toNumber()}", for example); see the sketch below.
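As a sketch of the two UpdateRecord configurations (the field names come from the schemas above; the attribute names in the second processor are examples, not requirements):

```
UpdateRecord #1 (Replacement Value Strategy: Record Path Value)
  /columns/ID          /ID
  /columns/Name        /Name
  /columns/Age         /Age
  /columns/JoinedOn    /JoinedOn

UpdateRecord #2 (Replacement Value Strategy: Literal Value)
  /table_name          ${table.name}
  /timestamp           ${now():toNumber()}
  /database            ${database.name}
```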