01-24-2018 12:17 PM - last edited on 01-24-2018 03:53 PM by cjervis
I have 5 computers, each with four 2TB disks, and each disk is mounted as a different drive.
I want to use Cloudera Manager to install Spark on my 5-machine cluster.
I want each computer to contribute 4 * 2 = 8 TB of HDFS storage.
How can I do this? I have used Cloudera Manager before, but only on machines with a single data drive.
What should I do now that I have 4 data drives on every machine?
01-24-2018 03:15 PM
Assuming you will have the HDFS service deployed on all 5 nodes, you can configure the data directories through the HDFS service during install, or afterwards through the HDFS configuration. For a post-install change, go to Cloudera Manager -> HDFS -> Instances -> DataNode -> Configuration -> DataNode Data Directory, and add your mount points there.
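For reference, the "DataNode Data Directory" setting in Cloudera Manager corresponds to the dfs.datanode.data.dir property in hdfs-site.xml. A sketch of what the resulting configuration would look like, assuming your four disks are mounted at /data1 through /data4 (substitute your actual mount points):

```xml
<!-- hdfs-site.xml fragment (managed by Cloudera Manager; shown for illustration) -->
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Comma-separated list: the DataNode stripes blocks across all listed directories.
       /data1../data4 are assumed mount points; replace with your own. -->
  <value>/data1/dfs/dn,/data2/dfs/dn,/data3/dfs/dn,/data4/dfs/dn</value>
</property>
```

With all four mount points listed, each DataNode can use the full 4 x 2 TB = 8 TB of raw capacity (before replication).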
01-24-2018 03:27 PM
Thank you, RobertM, for replying.
So you are saying that I can install Hadoop and Spark from Cloudera Manager, using just one of the data drives as my HDFS storage.
After the install, I can simply add the other local drives as HDFS data directories from the Cloudera Manager UI.
Is my understanding correct?
01-24-2018 03:30 PM
Yes, that's correct. If you are doing a fresh install, you can select which services to activate after the packages have been distributed. If the hosts have already joined the cluster and are under management with CM, then you can add your additional drives through the HDFS configuration.