Member since: 08-01-2021
Posts: 48
Kudos Received: 10
Solutions: 7
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 1795 | 11-18-2022 09:06 AM |
|  | 2267 | 11-15-2022 05:46 PM |
|  | 1737 | 10-12-2022 03:18 AM |
|  | 1176 | 10-11-2022 08:52 AM |
|  | 3092 | 10-08-2022 08:23 AM |
07-31-2024
11:38 AM
@Green_ I had the same issue. Can you please share the detailed steps of how you resolved it? I'm new to NiFi.
09-21-2023
12:42 AM
I am also experiencing this issue when attempting to write data to Redis in cluster mode. Did you find a solution or workaround, @sofronic?
12-12-2022
05:49 AM
Hello Eyal, thanks once again for your interest in helping out. I had not expressed myself correctly: I believe there is no backpressure, as I confirmed visually in the NiFi UI. In the upper banner I see there are globally about 5,000 flowfiles circulating at any given time, and some of the queues have 20, 50, 100, or 200 flowfiles waiting at an instant. The queue with the most queued flowfiles feeds the Kafka producer processor, with about 2,000. I believe these quantities are not sufficient to trigger backpressure-induced rescheduling of tasks.

Apache Benchmark sends concurrent HTTP requests; in this case, I set each of the 5 client instances to send 1,000 requests concurrently. I have the thread pool sized to 96 and have assigned 16 concurrent tasks to all of the critical-path processors. Despite this, I get a low core load average (between 2 and 7), and the disks' IOPS are in the dozens per second (the maximum the disks allow is 3,000). I'm all out of ideas at this point. 😄 Thanks for the support!
12-09-2022
02:00 PM
@F_Amini @Green_ is absolutely correct here. You should be careful when increasing concurrent tasks, as blindly increasing them everywhere can have the opposite effect on throughput. I recommend stopping and setting the concurrent tasks back to 1 (or maybe 2) on every processor where you have adjusted away from the default of 1 concurrent task.

Then take a look at the processor further downstream in your dataflow that has a red (backpressure) input connection but black (no backpressure) outbound connections. That processor, as @Green_ mentioned, is the one causing all your upstream backlog. You'll want to monitor your CPU usage as you make small incremental adjustments to that processor's concurrent tasks, until you see the upstream backlog start to come down.

If, while monitoring CPU, you see it spike fairly consistently at 100% usage across all your cores, then your dataflow has pretty much reached the maximum throughput it can handle for your specific dataflow design. At that point you need to look at other options, like setting up a NiFi cluster where this workload can be spread across multiple servers, or designing your dataflow differently with different processors to accomplish the same use case with a lesser impact on CPU (not always a possibility).

Thanks,
Matt
12-01-2022
02:33 AM
Hi, thanks for the details. Unfortunately it is not working: I get an empty array [] as output. I have tried it in both extract and split modes, and I applied the Schema Text property as suggested, with "NestedKey" and "nestedValue" as names. Neither gives me any output. Meanwhile, I have achieved what I wanted using SplitContent and then another Jolt processor. Of course, it would be more elegant if I could make it work with ForkRecord.
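For reference, the per-element output I was hoping ForkRecord's extract mode would emit can be sketched in plain Python. The input shape and the "NestedKey" field name here are just illustrative assumptions, not my real schema:

```python
# Sketch of ForkRecord-style extraction: one output record per element
# of a nested array field. Field names here are assumptions.

def fork_records(record, array_field):
    """Return one dict per element of record[array_field]."""
    return [elem for elem in record.get(array_field, [])]

parent = {
    "id": 1,
    "NestedKey": [
        {"name": "a", "value": 10},
        {"name": "b", "value": 20},
    ],
}

forked = fork_records(parent, "NestedKey")
# forked == [{"name": "a", "value": 10}, {"name": "b", "value": 20}]
```

SplitContent plus a second Jolt achieves the same effect, just with more moving parts.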
11-25-2022
07:43 AM
Hi @ripo, I've tried it out in my own environment and it does not seem like ports have an 'ENABLED' state. If a port is DISABLED, you can only update it to be STOPPED. From a STOPPED state, you can switch the port either to RUNNING or back to DISABLED.
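The transitions I observed can be encoded as a tiny state table; the RUNNING → STOPPED edge is my assumption about the usual lifecycle rather than something I explicitly tested, and note there is no direct DISABLED → RUNNING edge:

```python
# Port states as observed: DISABLED <-> STOPPED -> RUNNING.
# RUNNING -> STOPPED is an assumed (untested) edge.
ALLOWED = {
    "DISABLED": {"STOPPED"},
    "STOPPED": {"RUNNING", "DISABLED"},
    "RUNNING": {"STOPPED"},
}

def can_transition(current, target):
    """True if the port may move from `current` state to `target`."""
    return target in ALLOWED.get(current, set())
```

So enabling a disabled port and starting it is two calls, not one: DISABLED → STOPPED, then STOPPED → RUNNING.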
11-22-2022
09:43 PM
@Green_ Thank you so much! It worked as expected.
11-19-2022
01:35 PM
I am bumping this question in hopes someone might know of a better solution.
11-18-2022
08:25 AM
A couple of reasons:

At the base level, operations in the UI all use the REST API. If you use Chrome's devtools and go to the Network tab, you can see the exact REST API route used for every click you make on the canvas. If you ever want to automate something you know you can do manually in the UI, this trick lets you replicate it perfectly with RESTful calls.

In the same vein, the REST API has far more routes/operations available. The CLI tool seems to have around ~70 operations, whilst the REST API seems to have over 200.

The CLI commands are part of the NiFi Toolkit. Whilst the toolkit does get updated, I believe it does not get the same level of attention as NiFi's main codebase, and as such it'd be better not to rely on it completely.

This is not to say you can't use the CLI tool; rather, this is just my opinion on the matter and some insight into how my team writes automations on NiFi 🙂
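As a concrete example of the devtools trick, a call the UI makes when loading the root canvas can be replicated like this. This is only a sketch: the base URL assumes an unsecured NiFi on localhost:8080, and a secured instance would additionally need authentication headers:

```python
import json
import urllib.request

BASE = "http://localhost:8080/nifi-api"  # assumed unsecured local instance

# Same GET the UI issues to load the root canvas, as seen in the
# Network tab of Chrome devtools.
req = urllib.request.Request(f"{BASE}/flow/process-groups/root", method="GET")

# Uncomment against a live NiFi instance:
# with urllib.request.urlopen(req) as resp:
#     flow = json.load(resp)
#     print(flow["processGroupFlow"]["id"])
```

Any other canvas action (starting a processor, emptying a queue, and so on) can be captured and replayed the same way.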
10-15-2022
07:09 PM
Hi Green, really appreciate the assistance. If anything, it's probably me not clearly articulating what I'm trying to achieve; I'm two weeks in with NiFi.

The raw content from the API is received as follows (paste truncated):

"{ "ret_code" : 0, "ret_msg" : "OK", "ext_code" : "", "ext_info" : "", "result" : [ { "id" : "187715622692", "symbol" : "BTCUSDT", "price" : 19109.5, "qty" : 0.004, "side" : "Buy", "time" : "2022-10-16T01:25:31.000Z", "trade_time_ms" : 1665883531832, "is_block_trade" : false }, { "id" : "187715618142", "symbol" : "BTCUSDT", "price" : 19109.5, "qty" : 0.882, "side" : "Buy", "time" : "2022-10-16T01:25:31.000Z", "trade_time_ms" : 1665883531123, "is_block_trade" : false }, { "id" : "187715614682", "symbol" : "BTCUSDT", "price" : 19109.5, "qty" : 0.001, "side" : "Buy", "time" : "2022-10-16T01:25:30.000Z", "trade_time_ms" : 1665883530414, "is_block_trade" : false"

Likely I was trying to over-engineer with the Jolt, and ended up with:

{ "id" : [ 1.84310970522E11, 1.84310967802E11 ], "symbol" : [ "BTCUSDT", "BTCUSDT" ], "price" : [ 19241.5, 19241.0 ], "qty" : [ 1.896, 0.002 ], "side" : [ "Buy", "Sell" ], "time" : [ "2022-10-11T12:26:21.000Z", "2022-10-11T12:26:21.000Z" ], "trade_time_ms" : [ 1.665491181666E12, 1.665491181604E12 ], "is_block_trade" : [ false, false ] }

Whereas if I just used a JSON split, it gave the following:

{ "id" : "187715622692", "symbol" : "BTCUSDT", "price" : 19109.5, "qty" : 0.004, "side" : "Buy", "time" : "2022-10-16T01:25:31.000Z", "trade_time_ms" : 1665883531832, "is_block_trade" : false }

The format above was accepted by PutDatabaseRecord, where the issue moved to duplicate IDs being written. Setting ID as the primary key seems to have stopped the duplicates being written, but it is probably not the cleanest solution... The other API I'm dealing with is where I think the solution you suggested would be the ideal use case:
{ "lastUpdateId" : 579582125552, "E" : 1665885750096, "T" : 1665885750088, "symbol" : "BTCUSD_PERP", "pair" : "BTCUSD", "bids" : [ [ "19139.5", "8824" ], [ "19139.4", "757" ] "asks" : [ [ "19139.6", "3165" ], [ "19139.7", "812" ] }

Where the requirement would be to extract, for example, "bids" : [ [ "19139.5", "8824" ], [ "19139.4", "757" ] into two separate records, maintaining the fields in each.
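A rough sketch in plain Python of the split I'm after (field names taken from the sample above; the actual NiFi processor configuration would of course differ): each [price, qty] pair in "bids" becomes its own record that keeps the shared top-level fields.

```python
def explode_bids(depth):
    """One output record per [price, qty] entry in 'bids',
    carrying the shared top-level fields along."""
    shared = {k: depth[k] for k in ("lastUpdateId", "symbol", "pair")}
    return [
        {**shared, "price": float(price), "qty": float(qty)}
        for price, qty in depth["bids"]
    ]

depth = {
    "lastUpdateId": 579582125552,
    "symbol": "BTCUSD_PERP",
    "pair": "BTCUSD",
    "bids": [["19139.5", "8824"], ["19139.4", "757"]],
}

records = explode_bids(depth)
# records[0] == {"lastUpdateId": 579582125552, "symbol": "BTCUSD_PERP",
#                "pair": "BTCUSD", "price": 19139.5, "qty": 8824.0}
```

The same approach would apply to "asks" with a second pass over that array.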