
Configuring Volumes for DFS Metadata and Zookeeper Data

New Contributor

I am using Cloudera Director and Cloudera Manager to create a new CDH cluster in AWS. I was looking over the document "Cloudera Enterprise Reference Architecture for AWS Deployments", which states that "Cloudera requires using GP2 volumes when deploying to EBS-backed masters, one each dedicated for DFS metadata and ZooKeeper data." I see that I can attach additional EBS volumes when creating the cluster in Cloudera Director, but once I have added two volumes, how do I dedicate one to DFS metadata and the other to ZooKeeper data?

Thanks

Re: Configuring Volumes for DFS Metadata and Zookeeper Data

Champion
I don't know Director well enough but maybe this will get you to where you need to go.

Each EBS volume should be mounted at its own path. As an example, say you mount them as /data1 and /data2. In the HDFS configuration you would set dfs.namenode.name.dir to /data1/dfs/nn. In the ZooKeeper configuration you would set dataDir and dataLogDir to /data2/zk, so that ZooKeeper data lands on the second volume.

The paths and names are up to you. You could also use the defaults, which are /hadoop/dfs/nn and /var/lib/zookeeper, and mount the volumes directly at those paths.
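
Once the directories are configured, it may be worth checking that they really sit on separate volumes. The following is only a minimal sketch in Python, assuming the /data1 and /data2 mount points from the example above; the helper names are made up for illustration and are not part of Cloudera Manager or Director. It compares the device IDs that the operating system reports for each path.

```python
import os


def nearest_existing(path):
    """Walk up from path until a directory that exists is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def on_separate_devices(path_a, path_b):
    """Return True when the two paths live on different devices."""
    dev_a = os.stat(nearest_existing(path_a)).st_dev
    dev_b = os.stat(nearest_existing(path_b)).st_dev
    return dev_a != dev_b


if __name__ == "__main__":
    # Example paths only (assumption): substitute your own mount points.
    nn_dir = "/data1/dfs/nn"
    zk_dir = "/data2/zk"
    if on_separate_devices(nn_dir, zk_dir):
        print("NameNode and ZooKeeper directories are on separate devices")
    else:
        print("Both directories resolve to the same device; check the mounts")
```

If a directory does not exist yet, the check falls back to its nearest existing parent, so it can be run before the services create anything.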

Re: Configuring Volumes for DFS Metadata and Zookeeper Data

New Contributor

Thanks, that worked perfectly. I was able to easily change the directories for those configurations in Cloudera Manager. As a note for anyone else looking into this: the /data/dfs/nn directory needs to be owned by hdfs:hadoop (user:group), and the /data/zookeeper directory needs to be owned by zookeeper:zookeeper. I had to create the zookeeper directory manually, and I also had to create a "version-2" directory inside it.
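
For anyone who wants to script the manual steps above, here is a rough sketch in Python. It assumes the /data/dfs/nn and /data/zookeeper paths and the hdfs:hadoop and zookeeper:zookeeper owners mentioned in this post; the function names are hypothetical. Changing ownership requires root and existing accounts on the host, so the demo at the bottom runs in a scratch directory with the chown step skipped.

```python
import os
import shutil
import tempfile


def make_owned_dir(path, user=None, group=None):
    """Create path (with parents) and, if requested, chown the leaf directory."""
    os.makedirs(path, exist_ok=True)
    if user is not None or group is not None:
        # Needs root, and the user and group must already exist on the host.
        shutil.chown(path, user=user, group=group)
    return path


def prepare_master_dirs(nn_dir, zk_dir, set_owner=True):
    """Create the NameNode dir, the ZooKeeper dir, and its version-2 subdir."""
    nn_owner = ("hdfs", "hadoop") if set_owner else (None, None)
    zk_owner = ("zookeeper", "zookeeper") if set_owner else (None, None)
    created = [make_owned_dir(nn_dir, *nn_owner)]
    created.append(make_owned_dir(zk_dir, *zk_owner))
    # The "version-2" subdirectory had to be created by hand in this thread.
    created.append(make_owned_dir(os.path.join(zk_dir, "version-2"), *zk_owner))
    return created


if __name__ == "__main__":
    # Dry run in a temporary directory; no root needed because chown is skipped.
    with tempfile.TemporaryDirectory() as scratch:
        nn = os.path.join(scratch, "data", "dfs", "nn")
        zk = os.path.join(scratch, "data", "zookeeper")
        for d in prepare_master_dirs(nn, zk, set_owner=False):
            print("created", d)
```

On a real master node you would call prepare_master_dirs("/data/dfs/nn", "/data/zookeeper") as root, after the volumes are mounted and before starting the roles.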