YARN local and log storage discussion

New Contributor

Hi,

Currently we use a dedicated large disk on each DataNode/NodeManager to host the log and local files of running containers.

I have read that it is recommended to put the YARN local and log files on multiple mount points, and more precisely on all the HDFS disks (to prevent an I/O bottleneck and to avoid losing the whole NodeManager in case of a disk failure).

I wonder whether this is dangerous for HDFS: if the application logs fill up multiple HDFS mount points, what is the expected behavior of the HDFS service?

Thanks for your opinion and/or feedback,

Micaël

1 REPLY

Re: YARN local and log storage discussion

Cloudera Employee

@Micaël Dias

Yes, it is definitely recommended to put the YARN local and log directories on multiple disks for resiliency. If they all sit on a single disk, then when that disk fails the whole node becomes unusable for scheduling any more containers.

You are right in general that container local/log data can interfere with HDFS reads and writes, but in practice the impact tends to be minimal because the container local/log data is very small compared to the HDFS data being read and written.
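
As an illustration of the multi-disk layout described above, here is a minimal sketch of the relevant yarn-site.xml entries. The mount points /data/1 through /data/3 and the subdirectory names are hypothetical; substitute the paths of your own HDFS disks. The utilization threshold shown is, as far as I know, the stock default and is included only to show where the NodeManager's per-disk fullness check is configured.

```xml
<!-- yarn-site.xml fragment (sketch; paths are hypothetical examples) -->
<property>
  <!-- Container working directories, spread over several disks -->
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/1/yarn/local,/data/2/yarn/local,/data/3/yarn/local</value>
</property>
<property>
  <!-- Container log directories, spread over the same disks -->
  <name>yarn.nodemanager.log-dirs</name>
  <value>/data/1/yarn/logs,/data/2/yarn/logs,/data/3/yarn/logs</value>
</property>
<property>
  <!-- A disk above this utilization is marked bad by the NodeManager
       disk health checker and is no longer used for new containers -->
  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
  <value>90.0</value>
</property>
```

With a layout like this, losing one disk removes only that disk's local and log directories from use, rather than taking the whole NodeManager out of scheduling.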
