Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Enable logging of Hadoop logs to HDFS mounted as NFS on the data nodes

Contributor

We are constantly running out of disk space on the Hadoop nodes.

Is it recommended to write Hadoop logs to HDFS mounted as NFS on the data nodes?

Or is it better to mount a NAS drive on the nodes for storing log files?

Are there any challenges?

1 ACCEPTED SOLUTION


Hi @S Roy, using HDFS mounted as NFS would be a bad idea. An HDFS service writing its own logs to HDFS could deadlock on itself.

As @Neeraj Sabharwal suggested, a local disk is best, so that the logging store does not become a performance bottleneck. You can change the log4j settings to limit the size and number of the log files, which caps the total space they use. You can also write a separate daemon that periodically copies log files to HDFS for long-term archival.
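
A rough sketch of one pass of such an archival daemon, in Python. It assumes rotated files have `.log.` in their name (for example `something.log.1`), that the `hdfs` command is on the PATH, and that the target HDFS directory already exists. The function names, the age threshold, and the paths are made up for illustration; adjust them to your own layout.

```python
import subprocess
import time
from pathlib import Path


def rotated_logs(log_dir, min_age_seconds, now=None):
    """Return rotated log files (e.g. foo.log.1) last modified at least min_age_seconds ago."""
    now = time.time() if now is None else now
    selected = []
    for path in sorted(Path(log_dir).iterdir()):
        # Assumption: rotated files contain ".log." in the name; the active
        # file ends in ".log" and is skipped so we never move a file in use.
        is_rotated = path.is_file() and ".log." in path.name
        if is_rotated and now - path.stat().st_mtime >= min_age_seconds:
            selected.append(path)
    return selected


def archive_to_hdfs(paths, hdfs_dir, run=subprocess.run):
    """Copy each file to hdfs_dir with `hdfs dfs -put -f`; delete the local copy only on success."""
    archived = []
    for path in paths:
        result = run(["hdfs", "dfs", "-put", "-f", str(path), hdfs_dir])
        if result.returncode == 0:
            path.unlink()
            archived.append(path)
    return archived
```

Something like `archive_to_hdfs(rotated_logs("/var/log/hadoop", 24 * 3600), "/archive/logs")` could then run from cron or a simple loop. Because only rotated files are moved, the service keeps writing its active log to local disk and never depends on HDFS to log.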


5 REPLIES

Master Mentor

You can zip those logs once in a while, @S Roy.
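
A minimal sketch of that kind of periodic compression, in Python. It uses gzip rather than zip, and it assumes rotated files have `.log.` in their name and that anything older than the given age is safe to compress. The function name and threshold are only illustrative.

```python
import gzip
import shutil
import time
from pathlib import Path


def compress_old_logs(log_dir, min_age_seconds, now=None):
    """Gzip rotated log files older than min_age_seconds and remove the originals."""
    now = time.time() if now is None else now
    compressed = []
    for path in sorted(Path(log_dir).iterdir()):
        # Skip directories, files that are already compressed, and the
        # active log (assumption: only rotated files contain ".log.").
        if not path.is_file() or path.suffix == ".gz" or ".log." not in path.name:
            continue
        if now - path.stat().st_mtime < min_age_seconds:
            continue
        target = path.with_name(path.name + ".gz")
        with path.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        path.unlink()
        compressed.append(target)
    return compressed
```

For example, `compress_old_logs("/var/log/hadoop", 7 * 24 * 3600)` run weekly from cron would compress everything rotated more than a week ago.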

Master Mentor

@S Roy

I always suggest having a dedicated log disk or mount of around 200 GB on each node:

/usr/hdp for binaries: around 50 GB

/var/log: 150 to 200 GB
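
To check whether a log4j size and count limit fits a partition of this size, a quick calculation helps. With log4j 1.x's RollingFileAppender, each appender keeps the active file plus up to MaxBackupIndex backups, each at most MaxFileSize. The numbers below (256 MB files, 20 backups, 8 logging daemons per node) are assumptions for illustration, not recommendations.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def max_log_bytes(max_file_size_bytes, max_backup_index, appenders=1):
    """Upper bound on disk used by rolling appenders: active file plus backups, per appender."""
    return max_file_size_bytes * (max_backup_index + 1) * appenders


# Hypothetical node: 8 daemons, each with MaxFileSize=256MB and MaxBackupIndex=20.
total = max_log_bytes(256 * MB, 20, appenders=8)
print(total / GB)          # 42.0
print(total <= 150 * GB)   # True: fits in a 150 GB /var/log
```

This bound only covers files written through the rolling appender. Anything else that lands in /var/log needs its own allowance.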

Contributor

What about mounted NAS storage?

Master Mentor

@S Roy

I try to stay away from NAS because, when it comes to performance, it can be a roadblock or be misleading.

For a lab or demo, sure.

But not for prod.
