
Enable logging of Hadoop logs to HDFS mounted as NFS on the data nodes

Contributor

We are constantly running out of space on the Hadoop nodes.

Is it recommended to enable logging of Hadoop logs to HDFS mounted as NFS on the data nodes?

Or is it better to mount a NAS drive to the nodes for storing log files?

Are there any challenges?

1 ACCEPTED SOLUTION

Hi @S Roy, using HDFS mounted as NFS for logs would be a bad idea. An HDFS service writing its own logs to HDFS could deadlock on itself.

As @Neeraj Sabharwal suggested, a local disk is best to make sure the logging store does not become a performance bottleneck. You can change the log4j settings to limit the size and number of the log files, thus capping the total space used by logs. You can also write a separate daemon that periodically copies log files to HDFS for long-term archival.
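
For reference, those size and count limits live in the Hadoop log4j settings (the hdfs-log4j section in Ambari, or /etc/hadoop/conf/log4j.properties). A minimal sketch, assuming the stock RFA (RollingFileAppender) appender; exact property names can vary between versions:

    # cap each daemon log at 256 MB and keep at most 20 rolled files
    # (worst case roughly 5 GB per daemon log with these values)
    hadoop.log.maxfilesize=256MB
    hadoop.log.maxbackupindex=20

    log4j.appender.RFA=org.apache.log4j.RollingFileAppender
    log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
    log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
    log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}

For the long-term archival, a simple cron job on each node that runs hdfs dfs -put -f on the rolled-over files into a dated HDFS directory is usually enough in place of a full custom daemon.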

5 REPLIES

Master Mentor

You can zip those logs once in a while, @S Roy.
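
A sketch of that, assuming the logs live under /var/log/hadoop and that anything rolled over more than 7 days ago can be compressed (adjust the path and retention for your layout):

    # compress rolled-over Hadoop logs older than 7 days; run nightly from cron
    find /var/log/hadoop -type f -name '*.log.*' ! -name '*.gz' -mtime +7 -exec gzip {} \;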

Master Mentor

@S Roy

I always suggest having a dedicated log disk or mount of around 200 GB on each node:

/usr/hdp for binaries: around 50 GB

/var/log: 150 to 200 GB

Contributor

What about mounted NAS storage?

Master Mentor

@S Roy

I try to stay away from NAS because, when it comes to performance, it can become a roadblock or give misleading results.

Lab or demo: sure.

But not for prod.
