dfs.data.dir question

New Contributor

If I have several slave nodes with a varying number of drives mounted on each, how does the dfs.data.dir property work? If the property lists a directory that exists on one datanode but not on another, does Hadoop skip that directory on the datanode where it is missing?

Thanks in advance!

Accepted Solutions

Re: dfs.data.dir question

Explorer

No, that isn't how it works. If every node uses the same configuration:

/mnt, /mnt2, /mnt3, /mnt4, /mnt5

then a host that has only 3 drives (/mnt, /mnt2, /mnt3) will fail to start unless dfs.datanode.failed.volumes.tolerated (default 0) is set high enough to cover the missing volumes. You will need to configure each server with the correct value of dfs.data.dir for its own drives. For this reason, among others, a homogeneous cluster setup is usually preferred.
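
As a rough illustration of per-host configuration, here is a minimal Python sketch that picks, from the cluster-wide list in this post, only the directories that exist on the local host and prints a matching dfs.data.dir property for that host's hdfs-site.xml. The function names (data_dirs_for_host, would_start, render_property) are hypothetical helpers, not part of Hadoop, and would_start is only a simplified model of the rule described here (start only if the number of missing volumes does not exceed dfs.datanode.failed.volumes.tolerated), not Hadoop's actual startup logic. On newer Hadoop releases the property is named dfs.datanode.data.dir, so adjust the name if that applies to your version.

```python
import os
from xml.sax.saxutils import escape

# Superset of data directories used across the cluster (taken from this post).
CANDIDATE_DIRS = ["/mnt", "/mnt2", "/mnt3", "/mnt4", "/mnt5"]


def data_dirs_for_host(candidates, is_present=os.path.isdir):
    """Return only the candidate directories that exist on this host.

    is_present is injectable so the logic can be checked without real mounts.
    """
    return [d for d in candidates if is_present(d)]


def would_start(configured, present, tolerated=0):
    """Simplified model (an assumption, not Hadoop code): the DataNode
    starts only if the count of missing volumes is <= tolerated."""
    missing = [d for d in configured if d not in present]
    return len(missing) <= tolerated


def render_property(dirs, name="dfs.data.dir"):
    """Render a <property> element holding a comma-separated directory list."""
    value = ",".join(dirs)
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # Run on each host to get the value that fits that host's drives.
    print(render_property(data_dirs_for_host(CANDIDATE_DIRS)))
```

Run on the 3-drive host, this would emit a value listing only /mnt, /mnt2 and /mnt3, assuming those directories exist there and the other two do not. A configuration management tool can do the same job; the point is only that the value is derived per host rather than shared.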

Bryan Beaudreault
Senior Technical Lead, Data Ops
HubSpot, Inc