Spark History Server event log question


Our Spark job writes its event log to hdfs://namenode:8021/spark-history, and the job generates many events within 10 to 12 minutes.

According to the documentation, spark.eventLog.buffer.kb defaults to 100k.

Does that mean events are written to /spark-history/application_xxxxxx_xx/xx only when the buffer reaches 100 KB? And does every such write call fsync?

 

Property: spark.eventLog.buffer.kb
Default: 100k
Meaning: Buffer size to use when writing to output streams, in KiB unless otherwise specified.


Hello @mokkan

Yes. Spark buffers the events until the buffer reaches 100 KB and then writes them to the event log location on HDFS.
If you need the events to show up sooner, you can decrease the buffer size.
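
To illustrate the buffering described above, here is a minimal sketch using the JDK's BufferedOutputStream with an in-memory sink standing in for the HDFS stream. This is not Spark's own code: the class name EventBufferDemo, the method bytesAtSink, and the 100-byte buffer with 10-byte "events" are assumptions chosen only to keep the numbers small.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; it only mimics the buffering idea, not Spark's event logger.
class EventBufferDemo {

    // Writes eventCount events of eventBytes each through a buffer of bufferBytes
    // and returns how many bytes have actually reached the underlying sink.
    static int bytesAtSink(int bufferBytes, int eventBytes, int eventCount, boolean flushAtEnd) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the HDFS stream
        BufferedOutputStream out = new BufferedOutputStream(sink, bufferBytes);
        byte[] event = new byte[eventBytes];
        try {
            for (int i = 0; i < eventCount; i++) {
                out.write(event, 0, event.length);
            }
            if (flushAtEnd) {
                out.flush(); // push whatever is still buffered down to the sink
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        // 50 bytes written into a 100-byte buffer: nothing has reached the sink yet.
        System.out.println(bytesAtSink(100, 10, 5, false));
        // 250 bytes written without a final flush: part is at the sink, the tail is still buffered.
        System.out.println(bytesAtSink(100, 10, 25, false));
        // Same writes followed by flush(): all 250 bytes are at the sink.
        System.out.println(bytesAtSink(100, 10, 25, true));
    }
}
```

A smaller buffer makes data reach the sink in smaller, more frequent writes, which is the trade-off behind lowering spark.eventLog.buffer.kb.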

Filling the buffer does not always trigger an fsync.
From what I found, Spark uses OutputStream.flush(), which hands the buffered bytes to the underlying stream but does not by itself guarantee that an fsync() is performed each time.
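
The difference between flush() and an fsync-style call can be sketched with plain JDK file streams. This is a local-file analogy under stated assumptions, not Spark or HDFS code: the class FlushVsSyncDemo, the method sizesAroundFlush, and the temp file are hypothetical. On HDFS the roughly corresponding calls are hflush() and hsync() on the output stream.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: a local-file analogy for flush versus fsync.
class FlushVsSyncDemo {

    // Returns {file size before flush(), file size after flush()} for one small buffered write.
    static long[] sizesAroundFlush(String line) {
        try {
            Path file = Files.createTempFile("eventlog-demo", ".txt"); // hypothetical temp file
            try (FileOutputStream raw = new FileOutputStream(file.toFile());
                 BufferedOutputStream out = new BufferedOutputStream(raw, 100 * 1024)) {
                out.write(line.getBytes(StandardCharsets.UTF_8));
                long before = Files.size(file); // bytes are still in the in-process buffer
                out.flush();                    // hands the bytes to the OS; no durability guarantee
                long after = Files.size(file);
                raw.getFD().sync();             // explicit request to force the data to the device
                return new long[] {before, after};
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesAroundFlush("SparkListenerJobStart");
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

The point of the sketch is that flush() and sync() are separate steps: a writer that only calls flush() makes the data visible to the layer below without asking for it to be forced to disk.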

Regards, 
Andrés Fallas
Cloudera Customer Operations Engineer

