Location on disk where Kafka flushes data

New Contributor

I have a basic Kafka 2.13_3.10 cluster with one broker, one consumer, and one producer.

I was testing whether "log.flush.interval.messages = 10" works properly. It is supposed to flush a topic's messages to disk after every 10 messages, but where on disk are they saved by default?

Also, is there any way to specify the directory where the data is flushed, overriding the default configuration?

Thanks in advance.

3 REPLIES

Super Guru

@PabloO,

The logs are written to the directories configured in the log.dirs and/or log.dir properties of the Kafka broker. You can modify those properties to configure the broker to use the directory that you want.
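
As a rough illustration of how those two properties interact, the sketch below parses the text of a broker properties file and returns the data directories the broker would use. It assumes the documented behaviour that log.dirs takes precedence over log.dir, and that log.dir falls back to /tmp/kafka-logs when neither is set. The function name and the sample file contents are made up for this example.

```python
DEFAULT_LOG_DIR = "/tmp/kafka-logs"  # Kafka's documented default for log.dir


def resolve_data_dirs(properties_text):
    """Return the list of directories where the broker stores topic data.

    Hypothetical helper: handles only simple `key=value` lines, not the
    full Java properties syntax (no escapes or line continuations).
    """
    props = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()

    # log.dirs (comma-separated) wins over log.dir when both are present.
    configured = props.get("log.dirs") or props.get("log.dir") or DEFAULT_LOG_DIR
    return [d.strip() for d in configured.split(",") if d.strip()]


# Made-up sample configuration, for illustration only.
sample = """
# server.properties (excerpt)
log.dir=/var/lib/kafka
log.dirs=/data1,/data2
log.flush.interval.messages=10
"""

print(resolve_data_dirs(sample))
```

Messages flushed because of log.flush.interval.messages land in these same directories. The flush setting controls when data is forced to disk, not where it goes.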

 

Cheers,

André


New Contributor

@araujo Thanks for the answer. I am not asking where the logs are saved, but where the data is saved when it is flushed. The documentation for log.flush.interval.messages says: "The number of messages accumulated on a log partition before messages are flushed to disk". My question is about the location where the messages are flushed to disk after 10 messages have been received (as in the example above). At first I thought the properties you mention could also be used to specify the directory where I want the messages to be flushed, but apparently they cannot.



 

Super Guru

@PabloO,

In Kafka's terminology, a topic is a "distributed log". The data for each of a topic's partitions is saved in what are called "log segment files".

So, the "log.dirs" and "log.dir" parameters point to the directories where the actual messages are saved, *not* the "error logs".

For example, if your "log.dirs" is set to "/data1" and you have a topic named "mytopic", the data for partition 0 of that topic will be saved in files under the directory "/data1/mytopic-0".
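
To make the layout concrete, here is a minimal sketch that builds the partition directory path from the example above and lists any segment files found there. The helper names are hypothetical. It assumes the usual naming in which each segment's .log file is named after its base offset, zero-padded to 20 digits.

```python
from pathlib import Path


def partition_dir(log_dir, topic, partition):
    """Directory holding one partition: <log_dir>/<topic>-<partition>."""
    return Path(log_dir) / f"{topic}-{partition}"


def segment_log_name(base_offset):
    """Segment file name: base offset zero-padded to 20 digits, plus .log."""
    return f"{base_offset:020d}.log"


def list_segment_files(log_dir, topic, partition):
    """Return sorted names of the .log segment files, or [] if the directory is missing."""
    directory = partition_dir(log_dir, topic, partition)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.log"))


print(partition_dir("/data1", "mytopic", 0).as_posix())  # /data1/mytopic-0
print(segment_log_name(0))                               # 00000000000000000000.log
print(list_segment_files("/data1", "mytopic", 0))        # depends on what is on disk
```

Running list_segment_files against your own log.dirs value after producing a few messages is a quick way to confirm where the flushed data ends up.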

 

Cheers,

André
