Member since: 11-16-2015
Posts: 905
Kudos Received: 665
Solutions: 249

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 364 | 09-30-2025 05:23 AM |
|  | 709 | 06-26-2025 01:21 PM |
|  | 585 | 06-19-2025 02:48 PM |
|  | 813 | 05-30-2025 01:53 PM |
|  | 11215 | 02-22-2024 12:38 PM |
07-28-2017
01:03 AM
What are some sample values for those parameters? Could they have spaces in them? Perhaps try putting quotes around each of the arguments like "${to}"?
07-24-2017
08:38 PM
Try three slashes in the Database Driver Jar Url property: file:///post/postgresql-42.1.1.jar
07-11-2017
03:18 PM
1 Kudo
Koji is suggesting the use of a GrokReader in a record-aware processor (such as QueryRecord or PartitionRecord), rather than the ExtractGrok processor. With a GrokReader, you can do your split using SQL (with QueryRecord), perhaps something like SELECT * FROM FLOWFILE WHERE tstamp < ${now():toNumber():minus(1000)} and SELECT * FROM FLOWFILE WHERE tstamp >= ${now():toNumber():minus(1000)}, to route each line depending on whether its timestamp (in a "tstamp" field) falls before or after one second ago. Alternatively, you can use PartitionRecord to group records into individual flow files, with each flow file containing the records that share the same values for the specified fields.
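If it helps to see it concretely, here is a minimal sketch of the QueryRecord side, using the two queries above as user-defined properties. The property names "older" and "recent" are illustrative, and the GrokReader plus a record writer are assumed to be configured on the processor:

```sql
-- User-defined property "older": records whose timestamp is more than a second old
SELECT * FROM FLOWFILE WHERE tstamp < ${now():toNumber():minus(1000)}

-- User-defined property "recent": records from the last second
SELECT * FROM FLOWFILE WHERE tstamp >= ${now():toNumber():minus(1000)}
```

Each user-defined property on QueryRecord becomes a relationship of the same name, so the matching records are routed to the corresponding connection.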
06-30-2017
08:04 PM
SplitText, for some reason, starts the index at 1, while the other Split processors start at 0. Sorry, I had forgotten that difference; good catch!
06-30-2017
05:11 PM
@Alvin Jin To answer your question about which processors to use: it depends on what you want to do with the whole CSV file. Your question only mentions splitting and ignoring the header, and the CSVReader takes care of that. The record-aware processors in NiFi 1.3.0 include:

- ConsumeKafkaRecord_0_10: Gets messages from a Kafka topic, bundling them into a single flow file instead of one flow file per message
- ConvertRecord: Converts records from one data format to another (e.g., Avro to JSON)
- LookupRecord: Uses fields from a record to look up a value, which can be added back to the record
- PartitionRecord: Groups "like" records (based on user-provided criteria) into individual flow files
- PublishKafkaRecord_0_10: Posts messages to a Kafka topic
- PutDatabaseRecord: Executes a specified operation (e.g., INSERT, UPDATE, DELETE) on a database for each record in a flow file
- PutElasticsearchHttpRecord: Executes a specified operation (e.g., "index") on an Elasticsearch cluster for each record in a flow file
- QueryRecord: Executes SQL queries against fields from the records; this can be used to filter, aggregate, etc.
- SplitRecord: Splits records into smaller flow files; usually only used when downstream processors are not record-aware
- UpdateRecord: Updates field(s) in each record of a flow file

Also, I wanted to mention: if for some reason all your CSV columns are strings, you can set "Schema Access Strategy" to "Use String Fields From Header", and then you don't need a schema or schema registry. Otherwise, if you want to provide a schema, you're not required to use a schema registry; you can just paste your schema into the Schema Text property and set "Schema Access Strategy" to "Use Schema Text Property".
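For reference, here is a minimal sketch of what you might paste into the Schema Text property. It is a hypothetical Avro schema for a three-column CSV; the record name, field names, and types are illustrative, not taken from your file:

```json
{
  "type": "record",
  "name": "csvRow",
  "fields": [
    { "name": "id", "type": "int" },
    { "name": "name", "type": "string" },
    { "name": "createdOn", "type": "string" }
  ]
}
```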
06-29-2017
06:54 PM
In addition to @Wynner's answer, if you'd like to keep using ExecuteScript, you can pass in arguments as user-defined properties (aka dynamic properties) or flow file attributes and use them in ExecuteScript. For examples of leveraging user-defined properties in ExecuteScript, check out Part 3 of my ExecuteScript Cookbook article series on HCC; it has examples in Jython.
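As a rough sketch (not taken from the cookbook itself), a Jython body in ExecuteScript might read a dynamic property like this. The property name "greeting" is hypothetical; NiFi binds each dynamic property to a script variable of the same name:

```python
# ExecuteScript (Jython) sketch: use a hypothetical dynamic property named "greeting".
# NiFi binds each user-defined (dynamic) property to a PropertyValue variable of the same name.
flowFile = session.get()
if flowFile is not None:
    # Evaluate any Expression Language in the property against this flow file
    message = greeting.evaluateAttributeExpressions(flowFile).getValue()
    # Flow file attributes are available as well
    filename = flowFile.getAttribute('filename')
    # Write the result back as a new attribute and route to success
    flowFile = session.putAttribute(flowFile, 'greeting.message', message + ' ' + filename)
    session.transfer(flowFile, REL_SUCCESS)
```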
06-28-2017
08:32 PM
1 Kudo
You could set the Header Line Count to 0, then send the flow files to a RouteOnAttribute processor where you can "skip" the first line by routing on the following Expression Language statement: ${fragment.index:gt(0)}. The first line will be routed to "unmatched" and the rest to "matched" or to the user-defined property name (depending on the value of the Routing Strategy property). Note that this requires that the Line Split Count property be set to 1 in SplitText. Alternatively, if you are using (or can upgrade to) NiFi 1.3.0, you can use a record-aware processor with a CSVReader. This reader can be configured to (among other things) skip the header line. The record-aware processors also offer better performance when working with flow files that contain many "records" (such as a CSV file where each row is a record).
06-28-2017
08:16 PM
4 Kudos
As of NiFi 1.3.0, you can use UpdateRecord for this. If your incoming field name is "createdOn", you can add a user-defined property named "/createdOn" whose value is the following: ${field.value:toDate('yyyy-MM-dd HH:mm:ss.SSS'):toNumber()}. Note that you may need to change the type of createdOn from String (in the Reader's schema) to Long (in the Writer's schema).
06-23-2017
05:23 PM
You can try a thread dump (with jstack or nifi.sh dump) while it is waiting to shut down; you may be able to spot the culprit in the output.
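For example (the process id and output file names here are placeholders):

```sh
# jstack against the NiFi JVM's process id
jstack <nifi_pid> > nifi-threaddump.txt

# or NiFi's built-in dump command, which writes the thread dump to the given file
bin/nifi.sh dump nifi-threaddump.txt
```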
06-21-2017
06:24 PM
I tested this with Arabic characters in my text field, and it worked fine. You're saying you still get the error when using my suggested lines?