Member since: 11-16-2015
Posts: 892
Kudos Received: 650
Solutions: 245
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 5672 | 02-22-2024 12:38 PM |
| | 1389 | 02-02-2023 07:07 AM |
| | 3091 | 12-07-2021 09:19 AM |
| | 4208 | 03-20-2020 12:34 PM |
| | 14168 | 01-27-2020 07:57 AM |
07-27-2018
04:51 AM
Thanks a lot, @Matt Burgess, for the details on the current limitations and the Jira. I will look at the solution you provided on the other thread. I appreciate your help. Regards, Vish
07-27-2018
01:40 PM
Sorry for my ignorance. I resolved this issue by removing one extra space in " list" while writing the flowfile content generated by the Groovy script. Now I am able to fetch the data from InvokeHTTP by passing the flowfile content from ExecuteScript. Thanks for the support. Regards, Vish
07-30-2018
06:26 PM
ValidateRecord is more about validating the individual records than it is about validating the entire flow file. If some records are valid and some are invalid, each type will be routed to the corresponding relationship. However, for invalid records, we can't use the same record writer as for valid records, or else we know it will fail (because we know they're invalid), so there is a second RecordWriter for invalid records (you might use this to try to record the field names or something). By the time ValidateRecord knows the individual record is invalid, it doesn't know that it came in as Avro (for example), nor does it know that you might want it to go out as Avro. That's the flexibility and power of the Record Reader/Writer paradigm, but in this case the tradeoff is that you can't currently treat the entire flow file as valid or invalid. It may make sense to have an "Invalid Record Strategy" property, to choose between "Individual Records" using the RecordWriters (the current behavior), or "Original FlowFile", which would ignore the RecordWriters and instead transfer the entire incoming flow file as-is to the 'invalid' relationship. Please feel free to file an improvement Jira for this capability.
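To make the difference between the two strategies concrete, here is a minimal sketch in plain Python, not NiFi code. The `validate` and `route` helpers, the type-based schema check, and the `"original"` strategy name are all hypothetical stand-ins: the first strategy mimics the per-record routing described above, and the second mimics the proposed "Original FlowFile" behavior, which does not exist in ValidateRecord as described here.

```python
def validate(record, schema):
    """Hypothetical check: every schema field is present with the expected type."""
    return all(isinstance(record.get(field), typ) for field, typ in schema.items())


def route(records, schema, strategy="individual"):
    """Route records to 'valid' / 'invalid' buckets.

    strategy="individual": each record is routed on its own (current behavior).
    strategy="original":   hypothetical proposal; if any record is invalid,
                           the whole incoming batch goes to 'invalid' untouched.
    """
    valid = [r for r in records if validate(r, schema)]
    invalid = [r for r in records if not validate(r, schema)]
    if strategy == "original" and invalid:
        return {"valid": [], "invalid": list(records)}
    return {"valid": valid, "invalid": invalid}


schema = {"id": int, "name": str}
records = [{"id": 1, "name": "a"}, {"id": "x", "name": "b"}]

per_record = route(records, schema, strategy="individual")
whole_file = route(records, schema, strategy="original")
```

With the sample input, `per_record` splits the batch one and one, while `whole_file` sends both records to `invalid`.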
07-23-2018
12:49 PM
1 Kudo
You can use a JOIN clause in the select statement, but it will only work for a single RDBMS. You may find you can join two tables from two databases/schemas in the same RDBMS (if that system lets you), but you can't currently join two tables from totally separate database systems. You could investigate Presto: it allows joining tables across multiple systems, and you could have a single connection to it from NiFi in ExecuteSQL. That way it will look like a single RDBMS to NiFi, but Presto can be configured to do the cross-DB join.
07-15-2018
04:38 PM
If you use large attributes, you will have serious issues with the "snapshot" file in the flow content repository. I killed my PROD this way just last week: the snapshot was too big to fit in memory at startup, and my data was lost.
05-29-2019
03:18 PM
How do I split a complex JSON array into individual JSON objects with the SplitJson processor in NiFi? I don't know how to configure the original, split, and failure relationships. The JSON is below:

{
  "scrollId1": "xyz",
  "data": [
    {
      "id": "app-server-dev-glacier",
      "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
      "name": "app-server-dev-glacier",
      "type": "archiveStorage",
      "provider": "aws",
      "region": "ap-southeast-1",
      "account": "164110977718"
    },
    {
      "id": "abc.company.archive.mboi",
      "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
      "name": "abc.company.archive.mboi",
      "type": "archiveStorage",
      "provider": "aws",
      "region": "us-east-1",
      "account": "852631421774"
    }
  ]
}

I need to split it into:

{
  "id": "app-server-dev-glacier",
  "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
  "name": "app-server-dev-glacier",
  "type": "archiveStorage",
  "provider": "aws",
  "region": "ap-southeast-1",
  "account": "164110977718"
},
{
  "id": "abc.company.archive.mboi",
  "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
  "name": "abc.company.archive.mboi",
  "type": "archiveStorage",
  "provider": "aws",
  "region": "us-east-1",
  "account": "852631421774"
}

Next, I need to insert another field, "time", in front of "id", the first field of each individual object. I used the SplitJson processor with the JSON Path Expression $.data.id.*, but the relationship reports an error. I don't know how to configure the original, split, and failure relationship branches. Does anyone have any advice? @Shu
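For reference, the transformation being asked for can be sketched outside NiFi in plain Python. This is only an illustration of the desired result, not a SplitJson configuration. It assumes the split should target the "data" array itself (in JSONPath terms, something like `$.data` rather than a field inside each element), which is not confirmed in this thread. The helper names `split_records` and `with_leading_time` are hypothetical.

```python
import json
from datetime import datetime, timezone

# The document from the question, unchanged.
response = json.loads("""
{
  "scrollId1": "xyz",
  "data": [
    {"id": "app-server-dev-glacier",
     "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
     "name": "app-server-dev-glacier",
     "type": "archiveStorage", "provider": "aws",
     "region": "ap-southeast-1", "account": "164110977718"},
    {"id": "abc.company.archive.mboi",
     "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
     "name": "abc.company.archive.mboi",
     "type": "archiveStorage", "provider": "aws",
     "region": "us-east-1", "account": "852631421774"}
  ]
}
""")


def split_records(doc, array_field="data"):
    """Return one object per element of the array stored under array_field."""
    return list(doc[array_field])


def with_leading_time(record, timestamp):
    """Return a copy of record with "time" as its first key.

    Python dicts keep insertion order, so "time" is serialized before "id".
    """
    return {"time": timestamp, **record}


now = datetime.now(timezone.utc).isoformat()
splits = [with_leading_time(r, now) for r in split_records(response)]

for s in splits:
    print(json.dumps(s))
```

Each printed line corresponds to what one flow file on the split relationship would contain after the "time" field has been added.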
07-09-2018
01:22 PM
1 Kudo
@Derek Calderon

Short answer is no. The ExecuteSQL processor is written to write the output to the FlowFile's content.

There is an alternative solution. You have some processor currently feeding FlowFiles to your ExecuteSQL processor via a connection. My suggestion would be to feed that same connection to two different paths. The first connection feeds a "MergeContent" processor via a funnel, and the second feeds your "ExecuteSQL" processor. The ExecuteSQL processor performs the query and retrieves the data you are looking for, writing it to the content of the FlowFile. You then use a processor like "ExtractText" to extract that FlowFile's new content into FlowFile attributes. Next, you use a processor like "ModifyBytes" to remove all content from this FlowFile. Finally, you feed this processor to the same funnel as the other path. The MergeContent processor could then merge these two FlowFiles using the "Correlation Attribute Name" property (assuming "filename" is unique, that could be used), with min/max entries set to 2 and "Attribute Strategy" set to "Keep All Unique Attributes". The result should be what you are looking for.

The flow would look something like the following:

Having multiple identical connections does not trigger NiFi to write the 200 MB of content twice to the content repository. A new FlowFile is created, but it points to the same content claim. New content is only generated when ExecuteSQL is run against one of the FlowFiles. So this flow does not produce any additional write load on the content repo other than when ExecuteSQL writes its output, which I am assuming is relatively small?

Thank you, Matt
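The merge step above can be sketched in plain Python as a rough model of what the flow is meant to achieve. This is not NiFi code. The function name `merge_keep_all_unique`, the dict-based FlowFile representation, and the `sql.result` attribute are hypothetical. The attribute handling assumes "Keep All Unique Attributes" keeps every attribute unless its value conflicts between the merged FlowFiles.

```python
from collections import defaultdict


def merge_keep_all_unique(flowfiles, correlation_attr="filename"):
    """Pair FlowFiles by a correlation attribute and merge each pair.

    Attributes are combined; an attribute whose value differs between the
    two FlowFiles is dropped. Content is concatenated, so an emptied
    FlowFile contributes nothing to the merged content.
    """
    bins = defaultdict(list)
    for ff in flowfiles:
        bins[ff["attributes"][correlation_attr]].append(ff)

    merged = []
    for group in bins.values():
        if len(group) != 2:
            continue  # min/max entries = 2: an unpaired FlowFile keeps waiting
        attrs, conflicting = {}, set()
        for ff in group:
            for name, value in ff["attributes"].items():
                if name in attrs and attrs[name] != value:
                    conflicting.add(name)
                attrs.setdefault(name, value)
        for name in conflicting:
            del attrs[name]
        content = b"".join(ff["content"] for ff in group)
        merged.append({"attributes": attrs, "content": content})
    return merged


# Path 1: the untouched original. Path 2: the ExecuteSQL result moved into
# an attribute (ExtractText) with its content removed (ModifyBytes).
original = {"attributes": {"filename": "big.dat"},
            "content": b"<large original content>"}
sql_side = {"attributes": {"filename": "big.dat", "sql.result": "42"},
            "content": b""}

result = merge_keep_all_unique([original, sql_side])
```

Here `result` holds a single merged FlowFile that carries the original content plus the query result as an attribute.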
07-03-2018
09:21 PM
Jeez, I would hope not. I'm not aware of any platform differences for Jayway (the underlying library used to do JSONPath stuff in NiFi).
06-29-2018
03:06 PM
As of NiFi 1.5.0 (via NIFI-4522), you can issue a SQL query in PutSQL while still retaining the incoming flow file contents. For your case, you could send the CSV file to PutSQL and execute a "CREATE TABLE IF NOT EXISTS" statement, which will create the table the first time but allow the CSV to proceed to the "real" destination processor, likely PutDatabaseRecord.
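The reason this pattern is safe to run for every incoming file is that the statement is idempotent. A minimal sketch, using Python's built-in `sqlite3` as a stand-in for the real target database (an assumption; the exact DDL syntax supported varies by RDBMS) and a hypothetical `readings` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the target database

# Hypothetical table definition, for illustration only.
ddl = "CREATE TABLE IF NOT EXISTS readings (id INTEGER, value TEXT)"

# Issuing the statement once per incoming file is harmless: only the first
# execution creates the table, later ones are no-ops.
for _ in range(3):
    conn.execute(ddl)

conn.execute("INSERT INTO readings VALUES (1, 'a')")

# Running the DDL again does not drop or alter existing data.
conn.execute(ddl)
rows = conn.execute("SELECT id, value FROM readings").fetchall()
```

After the final statement, `rows` still contains the single inserted row, which is the property the PutSQL step relies on.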
06-13-2018
03:18 PM
https://community.hortonworks.com/content/kbentry/109629/how-to-achieve-better-load-balancing-using-nifis-s.html