Created 03-07-2016 11:36 AM
I am posting this question after searching the internet for a good explanation. Currently the total physical hard disk space (4 nodes) is 720 GB. The dashboard currently shows that only 119 GB is configured for DFS. I want to increase this space to at least 300 GB. I didn't find anything straightforward on the Ambari dashboard to do this. The only information I found on the internet is to modify the core-site.xml file to have a property, hadoop.tmp.dir, that points to another directory. I do not want to do it blindly, without understanding what it means to expand HDFS capacity and how to do it through the Ambari dashboard.
Created 03-07-2016 11:50 AM
You add capacity by giving dfs.datanode.data.dir more mount points or directories. In Ambari that section of configs is, I believe, under the Advanced section, depending on the version of Ambari; the property lives in hdfs-site.xml. The more new disks you provide through the comma-separated list, the more capacity you will have. Preferably, every machine should have the same disk and mount-point structure.
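For example, the relevant stanza in hdfs-site.xml (which Ambari manages for you) looks like the sketch below; the /grid/0 and /grid/1 mount paths are placeholders for illustration, not paths from this cluster:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/grid/0/hadoop/hdfs/data,/grid/1/hadoop/hdfs/data</value>
</property>

Each comma-separated entry should sit on its own physical disk or mount point; the DataNode adds up the free space across all of them.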
Created 03-07-2016 01:41 PM
@Artem Ervits Can you please elaborate on what you mean by "right spending of version of ambari"? I checked the "Advanced hdfs-site" section, but I don't see any "dfs.datanode.data.dir".
Created 03-07-2016 01:49 PM
Sorry, auto-correct on my tablet. @Pradeep kumar I updated the answer with the correct spelling.
Created 03-07-2016 01:56 PM
@Artem Ervits Thanks, but I still could not find this property under the "Advanced hdfs-site" section. I was reading the link provided by Neeraj Sabharwal in his answer below, which also talks about using /hadoop as the folder in the property 'dfs.datanode.data.dir'. But, like I said, I could not find this property.
Created 03-08-2016 11:04 AM
@Artem Ervits I found "Data Node Directories" under the "Data Node" section of the "Settings" tab. The "Data Node Directories" field has the folder name /hadoop/hdfs/data. However, when I run df -h, I do not see this folder in the mount information. Following is the output of df -h on the master server:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_item70288-lv_root 50G 41G 6.2G 87% /
tmpfs 3.8G 0 3.8G 0% /dev/shm
/dev/sda1 477M 67M 385M 15% /boot
/dev/mapper/vg_item70288-lv_home 172G 21G 143G 13% /home
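A quick way to check which filesystem a DataNode directory actually lives on is to run df -h against the path itself (the path here is taken from the Ambari setting above):

df -h /hadoop/hdfs/data

If that reports the root filesystem (/dev/mapper/vg_item70288-lv_root in this output), the DataNode is writing into the 50 GB root volume rather than a dedicated data disk, which would explain why the configured DFS capacity is so small.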
Created 03-08-2016 11:12 AM
That's the problem: you need to replace that path with a path that actually exists; otherwise data is written to the root filesystem and you run out of space quickly.
Created 03-08-2016 11:34 AM
@Artem Ervits I am having several issues now. 1) Ambari doesn't allow me to remove the folder name "/hadoop/hdfs/data", so I cannot completely replace it with a new folder. 2) If I give /hadoop/hdfs/data,/home then it shows me the error Can't start with "home(s)". I am pretty sure something is wrong.
Created 03-08-2016 01:37 PM
Create a mount point /hadoop pointing to your large disk.
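On an LVM layout like the one in the df -h output above, one way to do that is to carve a new logical volume out of the volume group and mount it at /hadoop. This is only a sketch under assumptions: the volume group name is taken from the output above, the 100G size and ext4 filesystem are arbitrary choices, and it presumes the group still has free extents (check with vgs); if not, you would first have to shrink lv_home or add a disk:

vgs                                          # check for free extents in the volume group
lvcreate -L 100G -n lv_hadoop vg_item70288   # new logical volume (size is an assumption)
mkfs.ext4 /dev/vg_item70288/lv_hadoop        # format it
mkdir -p /hadoop
mount /dev/vg_item70288/lv_hadoop /hadoop
echo '/dev/vg_item70288/lv_hadoop /hadoop ext4 defaults 0 0' >> /etc/fstab   # persist across reboots
chown -R hdfs:hadoop /hadoop                 # hdfs user/group as typically created by Ambari

After that, /hadoop/hdfs/data in dfs.datanode.data.dir resolves to the new volume instead of the root filesystem.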
Created 03-09-2016 08:49 AM
@Artem Ervits Okay, I have finally got what I wanted and have increased the DFS capacity. Thanks for your help; I learned a lot through this exercise :). I am accepting your answer and also providing the steps I followed in another answer post, so that it will be helpful to other users.
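For anyone following along: once the DataNodes have been restarted from Ambari, the new capacity can be confirmed from the command line (assuming the hdfs client is on the node's PATH):

hdfs dfsadmin -report | grep 'Configured Capacity'

The reported figure should now include the newly mounted space on every DataNode.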