
4 data drives for hdfs - how to setup



Hi,

 

I have 5 slave machines and 1 master machine, all running CentOS.

I want to install the YARN ResourceManager on the master. The master only has 16 GB of memory. Is this enough?

My installation option is Hadoop + Spark with Cloudera Manager 5.13.

 

On each slave there are 4 data drives, each a 2 TB disk with its own mount point.

Will the Cloudera Manager installation allow me to put HDFS data on them? I cannot recall any step of the install that asks for the HDFS data location.

 

On each machine, I want log files to go to another drive (i.e. not the root or data drives).

Again, will the Cloudera Manager install allow me to specify where the logs should go?

 

Are there any other precautions I should take before installing on a medium-sized cluster? [It is my first time on a medium-sized cluster!]

 

Thank you.

1 REPLY

Re: 4 data drives for hdfs - how to setup

Memory capacity depends on your use case. Keep in mind that the NodeManagers on the slaves do all the real work, so the master mainly needs memory for the management roles.
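
As a rough sizing sketch, assuming the master hosts only management roles (ResourceManager, NameNode, Cloudera Manager services) and no DataNode or NodeManager, and using illustrative heap sizes rather than measured values:

    ResourceManager heap            ~1-2 GB
    NameNode heap                   ~1-2 GB   (roughly 1 GB per million HDFS blocks)
    SecondaryNameNode heap          ~1-2 GB
    Cloudera Manager + monitoring   ~4-6 GB
    OS, agent, page cache           ~2 GB
    -----------------------------------------
    Total                           ~9-14 GB

So 16 GB is workable for a 5-slave cluster, but it gets tight if you also put a DataNode or NodeManager on the master.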

For the HDFS data folders, search for 'dfs.data.dir' in the HDFS service configuration and list all four mount points there.
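
For example, assuming the four disks are mounted at /data/1 through /data/4 (placeholder mount points, adjust to yours), the DataNode Data Directory property would be a comma-separated list:

    dfs.datanode.data.dir = /data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn

The DataNode then spreads block writes across all four directories, so every disk gets used.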

For log files, search for 'log.dir' in each service's configuration; you need to configure the log directory separately for each role (DataNode, NameNode, JobHistory Server, ...).
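
For example, assuming the dedicated log disk is mounted at /log (a placeholder path), you would set each role's Log Directory in Cloudera Manager along these lines:

    NameNode Log Directory            = /log/hadoop-hdfs
    SecondaryNameNode Log Directory   = /log/hadoop-hdfs
    DataNode Log Directory            = /log/hadoop-hdfs
    ResourceManager Log Directory     = /log/hadoop-yarn
    NodeManager Log Directory         = /log/hadoop-yarn
    JobHistory Server Log Directory   = /log/hadoop-mapreduce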

If you want to use Spark on YARN, you need to make a good resource plan; have a look at YARN Dynamic Resource Pools.
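
As a starting point, and assuming (purely for illustration) each slave has 32 GB RAM and 8 cores, you would first cap what YARN may hand out per NodeManager and then split that capacity between pools:

    yarn.nodemanager.resource.memory-mb  = 24576    (leave ~8 GB for the OS and DataNode)
    yarn.nodemanager.resource.cpu-vcores = 8
    yarn.scheduler.maximum-allocation-mb = 8192     (largest single container)

    Dynamic Resource Pools (Fair Scheduler), e.g.:
      pool  spark    weight 2    for Spark on YARN jobs
      pool  default  weight 1    for everything else

Adjust the numbers to your real slave hardware before running anything heavy.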