Explorer
Posts: 8
Registered: ‎08-14-2017

4 data drives for hdfs - how to setup


Hi,

 

I have 5 slave machines and 1 master machine, all running CentOS.

I want to install the YARN ResourceManager on the master. The master only has 16 GB of memory. Is this enough?

My installation option is Hadoop + Spark with Cloudera Manager 5.13.

 

On each slave, there are 4 data drives, each a 2 TB disk with its own mount point.

Will the Cloudera Manager installation let me put HDFS on them? I cannot recall any step of the install where it asks me for the HDFS data location.

 

On each machine, I want log files to go to another drive (i.e. not the root or data drives).

Again, will the Cloudera Manager installation let me specify where logs should go?

 

Are there any other precautions I should take before installing on a medium-sized cluster? [My first time on a medium-sized cluster!]

 

Thank you.

Explorer
Posts: 11
Registered: ‎06-06-2017

Re: 4 data drives for hdfs - how to setup

Memory capacity depends on your use case. Keep in mind that the NodeManagers on the slaves do the real work, so 16 GB on a dedicated master is generally workable for the ResourceManager and the other master roles.

For the HDFS data folders, search for 'dfs.data.dir' in the HDFS service configuration and list all four mount points there.
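For example, assuming the four disks are mounted at /data/1 through /data/4 (hypothetical mount points; substitute your own), dfs.data.dir ends up as a comma-separated list with one directory per disk. In hdfs-site.xml terms it would look like:

```
<property>
  <name>dfs.data.dir</name>
  <!-- one directory per physical disk; HDFS round-robins blocks across them -->
  <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn</value>
</property>
```

Cloudera Manager writes this file for you; you only enter the directory list in the HDFS service configuration page.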

For log files, search for 'log.dir' in the service configuration; you need to configure the log directory for each role (DataNode/NameNode/history/...).
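As an illustration, assuming the dedicated log disk is mounted at /data/logs (a hypothetical mount point), the role-level "Log Directory" settings would all point onto it, something like:

```
# Illustrative values only -- /data/logs is a made-up mount point.
# Each role has its own "Log Directory" property in Cloudera Manager:
DataNode Log Directory        = /data/logs/hadoop-hdfs
NameNode Log Directory        = /data/logs/hadoop-hdfs
ResourceManager Log Directory = /data/logs/hadoop-yarn
NodeManager Log Directory     = /data/logs/hadoop-yarn
```

Set these per role (not per service), and make sure the directories exist and are writable by the service users before restarting.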

If you want to use Spark on YARN, you need to make a good resource plan; look into YARN Resource Pools.
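As a rough sketch of such a plan (the numbers below are pure assumptions for a hypothetical 64 GB slave; size them to your actual hardware), the two YARN properties that matter most are the per-node container budget and the per-container cap:

```
<!-- Illustrative only: assumes a hypothetical 64 GB slave. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <!-- ~48 GB for containers, leaving headroom for the OS and daemons -->
  <value>49152</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <!-- cap any single container (e.g. a Spark executor) at 16 GB -->
  <value>16384</value>
</property>
```

Both are exposed in the YARN service configuration in Cloudera Manager, so you can set them there rather than editing yarn-site.xml by hand.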