Member since: 11-16-2015
Posts: 892
Kudos Received: 649
Solutions: 245
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5063 | 02-22-2024 12:38 PM |
 | 1326 | 02-02-2023 07:07 AM |
 | 2974 | 12-07-2021 09:19 AM |
 | 4137 | 03-20-2020 12:34 PM |
 | 13870 | 01-27-2020 07:57 AM |
05-23-2023
09:15 AM
As far as I know, InferAvroSchema is not a supported processor. However, there are record-based processors and an AvroReader controller service you can use with them. AvroReader has a (default) option to Infer Schema. This should achieve the same effect, and record-based processors are often more performant.
03-16-2023
10:34 AM
In Oracle an UPSERT is done by a MERGE, so alternatively you could store your data in a new temporary table and then run ExecuteSQL/PutSQL with a MERGE command to merge from the temp table into the target table.
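As a rough sketch of what that MERGE could look like, here is a small Python helper that builds the statement text you would paste into PutSQL/ExecuteSQL. The table and column names (`cars`, `cars_tmp`, `car_id`, `car_type`) and the helper itself are hypothetical, not something from your flow. Key columns are kept separate from value columns because Oracle does not allow updating a column that is referenced in the ON clause.

```python
def build_merge_sql(target, staging, key_cols, value_cols):
    """Build an Oracle MERGE that upserts rows from a staging table into a target table.

    All names passed in are placeholders for illustration only.
    """
    # Rows are matched on the key columns.
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    # Only non-key columns are updated (Oracle rejects updates to ON-clause columns).
    update_clause = ", ".join(f"t.{c} = s.{c}" for c in value_cols)
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} t "
        f"USING {staging} s ON ({on_clause}) "
        f"WHEN MATCHED THEN UPDATE SET {update_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


# Hypothetical target table "cars" and temp table "cars_tmp".
sql = build_merge_sql("cars", "cars_tmp", ["car_id"], ["car_type"])
print(sql)
```

The temp table would be loaded first (for example with a record-based put processor), and the MERGE run afterwards as a separate step.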
02-02-2023
07:07 AM
I'm not a Hive expert, but I did author the original PutHive3Streaming processor for NiFi. My recommendation is to set Records Per Transaction greater than the number of records in a FlowFile (unless we are talking about super-huge files) and Transactions Per Batch to 1. This makes the transaction semantics similar to how NiFi FlowFile sessions work (e.g. rollback, failure, success). If the number of records is huge and is causing throughput problems, try dividing that number by 100 and setting Transactions Per Batch to 100. The product of the two settings should be greater than the total number of records in the FlowFile; otherwise the processor has to request a large number of batches/transactions, which adds overhead on the Hive Metastore.
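The arithmetic above can be sketched roughly like this. The function name and the `huge_threshold` cutoff are my own assumptions for illustration (the processor has no such setting); the only rule taken from the advice is that the product of the two values must exceed the record count.

```python
import math


def hive_streaming_settings(record_count, huge_threshold=1_000_000, split=100):
    """Suggest (records_per_transaction, transactions_per_batch) for one FlowFile.

    huge_threshold is a hypothetical cutoff for "super-huge" files;
    split=100 follows the "divide by 100" suggestion.
    """
    if record_count <= huge_threshold:
        # One transaction covers the whole FlowFile.
        return record_count + 1, 1
    # Spread the records over `split` transactions; the +1 keeps the
    # product strictly greater than the record count.
    per_transaction = math.ceil(record_count / split) + 1
    return per_transaction, split


for count in (5_000, 25_000_000):
    records_per_txn, txns_per_batch = hive_streaming_settings(count)
    # The product must exceed the number of records in the FlowFile.
    assert records_per_txn * txns_per_batch > count
    print(count, records_per_txn, txns_per_batch)
```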
12-20-2022
05:52 AM
I wasn't able to reproduce this. I remember trying your example and the UPSERT worked for me, so I'm not sure what's going on.
11-02-2022
05:37 AM
1 Kudo
Agreed, you do not have access to the fields in either the incoming or outgoing JSON objects using Expression Language in the spec.
10-05-2022
08:31 AM
I believe the type checking for logical types is stricter as of https://issues.apache.org/jira/browse/AVRO-2493 and NiFi 1.17.0 (when we upgraded to Avro 1.11.1). Are you using "int" or "string" as the underlying Avro type? According to the spec (https://avro.apache.org/docs/1.11.1/specification/#timestamp-millisecond-precision) it must be "long".
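For reference, here is a minimal sketch of a schema that declares the timestamp on a "long", plus a small standalone check for the mismatch described above. The record name `Event`, the field names, and the helper function are hypothetical, and the check only uses the Python standard library; it is not how Avro or NiFi validates schemas.

```python
import json

# Hypothetical schema: the timestamp-millis logical type annotates a "long".
schema_json = """
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "created", "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}
"""

# Underlying types the Avro spec requires for the timestamp logical types.
REQUIRED_BASE_TYPE = {"timestamp-millis": "long", "timestamp-micros": "long"}


def mismatched_logical_types(schema):
    """Return (field, logicalType, actual_type) for each timestamp field on the wrong base type."""
    problems = []
    for field in schema.get("fields", []):
        field_type = field["type"]
        # A union is a list of branches; treat a single type as a one-branch list.
        branches = field_type if isinstance(field_type, list) else [field_type]
        for branch in branches:
            if isinstance(branch, dict) and "logicalType" in branch:
                expected = REQUIRED_BASE_TYPE.get(branch["logicalType"])
                if expected is not None and branch.get("type") != expected:
                    problems.append((field["name"], branch["logicalType"], branch.get("type")))
    return problems


print(mismatched_logical_types(json.loads(schema_json)))
```

If the check reports a field, changing its underlying type to "long" is the fix the spec calls for.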
12-07-2021
09:19 AM
The operation to add an attribute to a FlowFile is on the ProcessSession object, not the FlowFile itself (so the session can keep track of changes). Both calls return the updated FlowFile reference, so keep using the returned value. Try the following instead:
destFlowFile = session.putAttribute(destFlowFile, "logMsg", "Testing Msg")
destFlowFile = session.putAllAttributes(destFlowFile, backupAttributes)
03-24-2021
03:23 PM
1 Kudo
What are the column names in your table? Assuming "carId" and "carType", you can use JoltTransformJson or JoltTransformRecord with the following spec:
[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "$": "carId",
        "@": "carType"
      }
    }
  },
  {
    "operation": "shift",
    "spec": {
      "carId": {
        "*": {
          "@": "[&0].carId"
        }
      },
      "carType": {
        "*": {
          "@": "[&0].carType"
        }
      }
    }
  }
]
02-09-2021
02:12 PM
1 Kudo
If you use GrokReader you can use the same kv filter from Logstash: https://community.cloudera.com/t5/Support-Questions/Grok-Patterns-Expressions-for-capturing-comma-separated-key/td-p/311126
01-29-2021
04:54 PM
Is there anything in the logs before/after the "already marked for transfer" entry? I'm trying to figure out how a FlowFile can get transferred and then something goes wrong (where we'd also try to send it to failure).