Created 03-22-2016 07:41 AM
Hi All!
We're seeing the error below from the PutKafka processor on our NiFi 0.4.0 instance.
See our config
Mode: Synchronous
Memory Max Buffer: 2GB
Batch: 50
2016-03-21 23:00:08,269 ERROR [Timer-Driven Process Thread-10] o.apache.nifi.processors.kafka.PutKafka PutKafka[id=7709ebdc-07df-4469-a5a3-2b2ec9b35c26] Successfully sent 2 messages to Kafka but failed to send 1 messages; the last error received was org.apache.kafka.clients.producer.BufferExhaustedException: You have exhausted the 4294967296 bytes of memory you configured for the client and the client is configured to error rather than block when memory is exhausted.: org.apache.kafka.clients.producer.BufferExhaustedException: You have exhausted the 4294967296 bytes of memory you configured for the client and the client is configured to error rather than block when memory is exhausted.
Any help is highly appreciated! Thanks in advance!
Created 03-22-2016 02:25 PM
This ERROR message is telling you that the buffer configured in your PutKafka processor was not large enough to accommodate the batch of files it wanted to transfer to Kafka. The log above shows that a batch of 3 files was created: 2 of the files transferred successfully, and 1 file was routed to PutKafka's failure relationship. The 4294967296 bytes (4 GB) in the message is the memory the Kafka client had reserved for buffering, and this batch exhausted it, so these are very large files for Kafka. The failure relationship should be looped back onto the PutKafka processor so that, after a short penalization, the failed file gets retransmitted.

There are four settings at play in the PutKafka processor that you will want to experiment with (a sketch of the underlying producer configuration follows the list):
Max Buffer Size: <-- max amount of reserved buffer space
Max Record Size: <-- max size of any one record
Batch Size: <-- max number of records per batch
Queue Buffering Max Time: <-- max amount of time spent batching before transmitting
*** The batch is transmitted when either the Batch Size is satisfied or the Queue Buffering Max Time is reached.
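For context on where that exception originates: PutKafka in NiFi 0.4.0 drives the Kafka 0.8.x producer client, and "Max Buffer Size" maps to the producer's buffer.memory setting. Below is a minimal, illustrative Java sketch of that producer configuration; the broker address, topic name, class name, and payload sizes are placeholder assumptions, not values from this thread.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.BufferExhaustedException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PutKafkaBufferSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        // Roughly what PutKafka's "Max Buffer Size" controls: total memory
        // the producer may use to hold records that have not been sent yet.
        props.put("buffer.memory", "2147483648"); // 2 GB
        // The 0.8.x producer threw BufferExhaustedException (the error in
        // the log above) instead of blocking when this was set to false.
        props.put("block.on.buffer.full", "false");

        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
        byte[] payload = new byte[64 * 1024 * 1024]; // one large record
        try {
            // Queuing records faster than the broker drains them eventually
            // exhausts buffer.memory, at which point send() throws.
            for (int i = 0; i < 100; i++) {
                producer.send(new ProducerRecord<>("my-topic", payload));
            }
        } catch (BufferExhaustedException e) {
            System.err.println("Buffer exhausted: " + e.getMessage());
        } finally {
            producer.close();
        }
    }
}
```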
Considering the size of the messages you are trying to send to your Kafka topic, I would recommend the following settings:
Max Buffer Size: 2 GB
Max Record Size: 2 GB
Batch Size: 1
Queue Buffering Max Time: 100 ms
Since you will be sending one file at a time, you may want to increase the number of Concurrent Tasks configured on the "Scheduling" tab of the PutKafka processor. Only do this if the processor cannot keep up with the flow of data: start with the default of 1 and increase by only 1 at a time if needed. Keep in mind that the buffered records live in your JVM heap, so the more concurrent tasks and the larger the Max Buffer Size, the more heap this processor will use. Thanks,
Matt
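To make that heap point concrete: the NiFi JVM heap is set in conf/bootstrap.conf and should comfortably exceed Max Buffer Size multiplied by the number of concurrent tasks. The sizes below are illustrative assumptions only, not recommendations from this thread.

```
# conf/bootstrap.conf -- NiFi JVM heap (values are illustrative assumptions)
java.arg.2=-Xms4g
java.arg.3=-Xmx8g
```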
Created 03-22-2016 02:27 PM
There have also been many improvements to the underlying code for the Kafka processors in newer releases of NiFi. I recommend upgrading.
Created 03-23-2016 12:31 AM
Thank you @mclark for the very detailed answer and recommendations. We'll try them today and post the results.
Kudos!
Created 03-27-2016 11:31 PM
As the answer suggests, you are out of memory. You could try breaking up your data before it reaches PutKafka using one of the processors designed for splitting large FlowFiles, such as SplitText or SplitContent.
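To sketch what that split step accomplishes, here is a plain-Java illustration (not NiFi code; the 1 MB chunk size, payload size, and class name are arbitrary assumptions) of cutting one oversized payload into records small enough that no single batch can approach the producer's buffer limit:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkSketch {
    // Split a large payload into fixed-size chunks, the way a
    // content-splitting processor would before records reach PutKafka.
    static List<byte[]> chunk(byte[] payload, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < payload.length; offset += chunkSize) {
            int end = Math.min(payload.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(payload, offset, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] big = new byte[10 * 1024 * 1024];       // one 10 MB payload
        List<byte[]> pieces = chunk(big, 1024 * 1024); // 1 MB records
        // Many small records buffer and drain smoothly, where one huge
        // record can exhaust the producer's reserved memory on its own.
        System.out.println(pieces.size() + " records instead of 1");
    }
}
```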