
Configure Storage capacity of Hadoop cluster

Solved

Re: Configure Storage capacity of Hadoop cluster

Mentor

You need to pick just one directory there. In particular, don't choose /tmp as your parent directory; that's asking for trouble.

Re: Configure Storage capacity of Hadoop cluster

Contributor

As you suggested, I removed /tmp from that list of directories, and the capacity of all the nodes dropped to 40 GB or less, including slave 4.

Re: Configure Storage capacity of Hadoop cluster

Expert Contributor

@vinay kumar Can you help me understand where you found the 'dfs.datanode.data.dir' property? In my Ambari installation, I did not find it under the 'Advanced hdfs-site' configuration.
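
In the meantime, the effective value can be read from the command line on any node (this only reflects the local client configuration on that node, so treat it as a quick sanity check rather than the Ambari-managed source of truth):

# Print the effective value of dfs.datanode.data.dir from the local Hadoop config
hdfs getconf -confKey dfs.datanode.data.dir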

Re: Configure Storage capacity of Hadoop cluster

@vinay kumar

I have never seen the same number on all the slave nodes, because of how data is distributed.

To overcome an uneven block distribution across the cluster, HDFS provides a utility program called the balancer:

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
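
For example, a typical invocation looks like the one below; the 10% threshold is just the common default, not something specific to this cluster. The balancer moves blocks until every datanode's utilization is within the threshold of the cluster average.

# Rebalance HDFS blocks across datanodes; -threshold is the allowed
# deviation (in percent) from the average cluster utilization
hdfs balancer -threshold 10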

Re: Configure Storage capacity of Hadoop cluster

Contributor

Let me make it clear. The cluster is new and doesn't have much data in it yet. As I understand it, the available capacity is the storage available to the DataNode (HDFS), if I'm not wrong. The actual hard disk size of each node is 500 GB, yet the available capacity on five of them is far less than on slave 4. The root filesystem has more than 400 GB allocated, and the same should be available to HDFS. My concern is: where did the rest of the space go? And how does data distribution come into it, when my only question is about HDFS capacity? Please find the attached screenshots.

Attachments: 2649-df.png, 2637-df-h.png
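
Besides df, the capacity HDFS itself reports for each datanode can be listed with dfsadmin (typically run as the hdfs superuser), which should make it easy to see which node is contributing what:

# Show configured capacity, DFS used, and remaining space, per datanode
hdfs dfsadmin -report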

Re: Configure Storage capacity of Hadoop cluster

@vinay kumar What do you have for dfs.datanode.data.dir?

Is slave 4 using / while the rest of the nodes are using other mounts?

Re: Configure Storage capacity of Hadoop cluster

Contributor

dfs.datanode.data.dir has: /opt/hadoop/hdfs/data,/tmp/hadoop/hdfs/data,/usr/hadoop/hdfs/data,/usr/local/hadoop/hdfs/data

All the nodes have the same mounts.
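
One quick way to confirm which filesystem actually backs each of those directories is to run df against the paths themselves, for example:

# df on a path reports the mount (filesystem) that contains it
df -h /opt/hadoop/hdfs/data /tmp/hadoop/hdfs/data /usr/hadoop/hdfs/data /usr/local/hadoop/hdfs/data

Any directory that sits on / rather than on a dedicated mount shows up immediately in the output.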

Re: Configure Storage capacity of Hadoop cluster

Contributor

@Neeraj Sabharwal Did I say anything wrong? So the capacity is the space allocated to HDFS, right?

Re: Configure Storage capacity of Hadoop cluster

@vinay kumar

As expected, the problem is with the disks allocated in the DataNode settings.

Ambari picks up all the mounts except /boot and /mnt.

You were supposed to modify these settings during the install. As you can see, data is going to /opt and the other mounts, whereas you should have given it only /hadoop (/ has 400 GB).

Also, there is no way we want to store data on /tmp:

/opt/hadoop/hdfs/data,/tmp/hadoop/hdfs/data,/usr/hadoop/hdfs/data,/usr/local/hadoop/hdfs/data

You need to create a directory such as /hadoop and modify the settings so HDFS stores its data under /hadoop.
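
For example, something along these lines on every datanode; the hdfs:hadoop ownership below is the usual HDP service account, so adjust it if your cluster uses different accounts. After that, set dfs.datanode.data.dir to /hadoop/hdfs/data in Ambari and restart the DataNodes.

# Create the new datanode directory on the / filesystem
mkdir -p /hadoop/hdfs/data
# Hand it to the HDFS service account (hdfs:hadoop is the usual HDP default)
chown -R hdfs:hadoop /hadoop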


Re: Configure Storage capacity of Hadoop cluster

Contributor
@Neeraj Sabharwal

Does this mean I should re-install everything? I am still wondering how only slave 4 got a capacity of 435 GB when every node has the same configuration and the same mounts.

Attachment: 2654-slave4.png