Created 03-07-2016 11:36 AM
I am posting this after searching the internet for a good explanation. Currently the total physical hard disk space across the 4 nodes is 720 GB. The dashboard shows that only 119 GB is configured for DFS. I want to increase this space to at least 300 GB. I didn't find anything straightforward on the Ambari dashboard to do this. The only information I found on the internet is to modify the core-site.xml file to have a hadoop.tmp.dir property that points to another directory. I do not want to do this blindly, without understanding what it means to expand HDFS capacity and how to do it through the Ambari dashboard.
Created 03-07-2016 11:50 AM
You add capacity by giving dfs.datanode.data.dir more mount points or directories. The property lives in hdfs-site.xml; in Ambari it appears in the HDFS configs, either in the Settings tab or under the Advanced section depending on your Ambari version. The more new disks you provide through the comma-separated list, the more capacity you will have. Preferably, every machine should have the same disk and mount point structure.
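For illustration, a minimal sketch of how to see what the DataNodes are currently using and what the comma-separated value looks like (the second path below is just a placeholder for whatever new mount point you add):

# Print the directories currently configured for the DataNodes
# (reads dfs.datanode.data.dir from the client-side hdfs-site.xml)
hdfs getconf -confKey dfs.datanode.data.dir

# In Ambari, the "DataNode directories" field takes the same comma-separated form,
# for example (placeholder second path):
#   /hadoop/hdfs/data,/data/1/hdfs/data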
Created 03-09-2016 08:52 AM
Excellent, glad it worked
Created 03-07-2016 12:00 PM
@Pradeep kumar See this thread https://community.hortonworks.com/questions/21212/configure-storage-capacity-of-hadoop-cluster.html
Read the comments under the best answer.
Created 03-08-2016 12:55 PM
@Neeraj Sabharwal I deleted my previous comment as it didn't make any sense. What I currently don't understand is that the "DataNode directories" field shows /hadoop/hdfs/data, and I am not able to change this. If I edit the field to remove this folder name, the "Save" button gets disabled. It is not taking /home as a valid folder. The /home mount has the most space, and I am not able to mention it in the "DataNode directories" field. Any ideas? Thanks.
Created 03-09-2016 09:01 AM
I am posting this so that it will be helpful for users who want to understand how DFS capacity can be increased. The details are in the steps below.
1) The "HDFS Disk Usage" section (a box) on the dashboard shows the current DFS usage. However, the total DFS capacity is not shown there.
2) To view the total capacity, use the NameNode Web UI, e.g. http://172.26.180.6:50070/. This will show you the total DFS capacity (the same information is also available from the command line, as sketched after these steps).
3) It is helpful to see the file system information by executing "df -h", which tells you the size of each file system. In my case the root file system had very little space allocated to it (50 GB) compared to the file system mounted on /home (750 GB).
4) The straightforward way to increase the DFS capacity is to add an additional folder to the "DataNode directories" field under HDFS -> Configs -> Settings, as a comma-separated value. This new folder should exist on a file system that has more disk capacity.
5) Ambari for some reason does not accept /home as the folder name for storing file blocks. By default it shows "/hadoop/hdfs/data", and you cannot delete it completely to replace it with a new folder path.
6) The best way is to create a new mount point and point it to a folder under /home. For example, create a mount point /hdfsdata and point it to a folder under /home, e.g. /home/hdfsdata. The steps to create the new mount point are sketched below:
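A minimal sketch of those mount-point steps, assuming a bind mount of /home/hdfsdata onto a new /hdfsdata mount point and the default hdfs:hadoop service user and group (adjust paths and ownership for your environment):

# Create the folder on the large /home file system and an empty mount point for it
sudo mkdir -p /home/hdfsdata
sudo mkdir -p /hdfsdata

# Bind-mount the /home folder onto the new mount point
sudo mount --bind /home/hdfsdata /hdfsdata

# Make the bind mount survive reboots
echo '/home/hdfsdata /hdfsdata none bind 0 0' | sudo tee -a /etc/fstab

# Give the HDFS service user ownership of the new location
sudo chown -R hdfs:hadoop /hdfsdata

# Then add /hdfsdata to the "DataNode directories" field in Ambari,
# comma separated, after the existing /hadoop/hdfs/data entry.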
After the above steps, restart the HDFS service and your capacity is increased.
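To confirm the extra space is actually visible after the restart, a couple of standard commands are worth running (exact output varies by cluster):

# Configured capacity, per DataNode and in total, as reported by the NameNode
sudo -u hdfs hdfs dfsadmin -report

# Used and remaining DFS space from the file-system point of view
hdfs dfs -df -h /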
Created 10-07-2016 05:59 AM
I have done all the steps you have given above and I am now facing an issue while restarting the HDFS service. The log is attached below.
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
    DataNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
    self.start(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 62, in start
    datanode(action="start")
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py", line 72, in datanode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1.
starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-hadoop1ind1.india.out
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000bc800000, 864026624, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 864026624 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /var/log/hadoop/hdfs/hs_err_pid51884.log
Can you please take a look and tell me what exactly went wrong?
Created 10-07-2016 06:05 AM
The previous comment was for the NameNode restart, and here I am showing you the DataNode restart after allocating more memory to Java.
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
    DataNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
    self.start(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 62, in start
    datanode(action="start")
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py", line 72, in datanode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1.
starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-hadoop1ind1.india.out
@Pradeep kumar: Can you please have a look at both logs and help me out with extending my current HDFS storage?
Created 04-19-2017 01:12 AM
I had the same alert. The capacity of the DataNode (DN) had somehow been set to a very small value. After reading the threads here, I was about to create a partition and mount it on a new DN directory. However, since I did not have any issue using /hadoop/hdfs/data, which is under the / (root) directory, I looked for another way around it and found that the amount of "Reserved space for HDFS" under the Advanced tab was huge, taking up almost all of the unused space of the root directory. After reducing "Reserved space for HDFS", every alert was resolved.
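For anyone who wants to check this from the command line: the Ambari "Reserved space for HDFS" field maps to the dfs.datanode.du.reserved property in hdfs-site.xml (a per-volume value in bytes). A quick way to inspect it, assuming the HDFS client configs are deployed on the node:

# Bytes per volume held back from HDFS on each DataNode disk
hdfs getconf -confKey dfs.datanode.du.reserved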
Created 04-19-2017 03:28 AM
That is a good point, David Hwang. Thanks for sharing 🙂