Explorer
Posts: 13
Registered: 01-24-2018

HDFS setup with Cloudera Manager on multiple disks


Hi,

I have 5 computers, each with four 2 TB disks, each mapped to a different drive.

I want to use Cloudera Manager to install Spark on my 5-machine cluster.

I want each computer to have 4 × 2 = 8 TB of HDFS storage.

How can I do this? I have used Cloudera Manager before, but only on machines with a single data drive.

What should I do now that I have 4 data drives on every machine?

Thank you.

Contributor
Posts: 29
Registered: 03-07-2017

Re: HDFS setup with Cloudera Manager on multiple disks

Assuming you will have the HDFS service deployed on all 5 nodes, you can configure the data directories through the HDFS service during the install, or post-install through the HDFS configuration. For a post-install change, go to Cloudera Manager -> HDFS -> Instances -> DataNode -> Configuration -> DataNode Data Directory and add your mount points there.
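For example, if your four disks are mounted at /data1 through /data4 (those mount point names are just placeholders, substitute whatever your drives are actually mounted as), the DataNode Data Directory setting would get one entry per disk, along the lines of:

/data1/dfs/dn
/data2/dfs/dn
/data3/dfs/dn
/data4/dfs/dn

Cloudera Manager writes these out as the comma-separated dfs.datanode.data.dir property in hdfs-site.xml, and the DataNode then spreads its blocks across all four disks, which is how each machine ends up contributing the full 4 x 2 TB.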

Explorer
Posts: 13
Registered: 01-24-2018

Re: HDFS setup with Cloudera Manager on multiple disks

Thank you, RobertM, for replying.

So you are saying that I can install Hadoop and Spark from Cloudera Manager, using just one of the data drives as my HDFS storage.

After the install, I can simply add more local drives as HDFS drives from the Cloudera Manager UI.

Is my understanding correct?

Contributor
Posts: 29
Registered: 03-07-2017

Re: HDFS setup with Cloudera Manager on multiple disks

Yes, that's correct. If you are doing a fresh install, then you can select which services to activate after the packages have been distributed. If the hosts have already joined the cluster and are under management with CM, then you can add your additional drives through the HDFS configuration.
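Once the new directories are in place and the DataNodes have been restarted, one quick sanity check (just one way to verify, not the only one) is to look at the configured capacity, e.g.:

hdfs dfsadmin -report

Each DataNode should show a configured capacity of roughly 8 TB (a bit less in practice, after filesystem overhead and any reserved space), and the cluster total should come out around 40 TB across the 5 nodes.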

Explorer
Posts: 13
Registered: 01-24-2018

Re: HDFS setup with Cloudera Manager on multiple disks

Thank you, I will try it out.
