About sysadmin

sysadmin · ‎07-19-2017

I am trying to figure out the difference between df -h in linux and hdfs dfs -df -h / output. dfs output showing 44Tb across all cluster but root partition is not having the data it is always showing very less usage on each data node. hdfs@hdfs-xxxxx-xxxxxxx:~$ hdfs dfs -df -h / Filesystem Size Used Available Use% hdfs://hdfs-xxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:8020 44.8 T 20.6 T 6.8 T 46% hdfs@hdfs-xxxxxx-xxxxxx:~$ where as df -h hdfs@hdfs-XXXXX-XXXX:~$ df -h Filesystem Size Used Avail Use% Mounted on udev 32G 12K 32G 1% /dev tmpfs 6.3G 320K 6.3G 1% /run /dev/sda1 2.0T 8.8G 1.9T 1% / XXXXXXXXXXXXXXXXXXXXXXX 2.0T 1.8T 142G 93% /hadoop hdfs@hdfs-XXXXX-XXXXX:~$ dfsadmin report ================== Decommission Status : Normal Configured Capacity: 3785823326208 (3.44 TB) DFS Used: 1893513980859 (1.72 TB) Non DFS Used: 1370250273861 (1.25 TB) DFS Remaining: 302107447808 (281.36 GB) DFS Used%: 50.02% DFS Remaining%: 7.98% Configured Cache Capacity: 0 (0 B) Cache Used: 0 (0 B) Cache Remaining: 0 (0 B) May I know where is the utilization is going on and why it is not showing on / partition.

sysadmin · ‎07-19-2017

@dsun I went through the above url already, came to know that we have configured 270G for dfs reserved space but it is taking Tb's on some of the servers, After analyzing the server we realized that most of the space is going to Map Reduce jobs. Is there any best tool/process to analyse to more.

sysadmin · ‎07-18-2017

@dsun Thanks for your suggestions, I tried to find the which directories are taking high amount of space, logs are not taking hig amount of space still Non DFS usage is very high in Tb's Thanks

sysadmin · ‎07-17-2017

Hello, We have 14 node cluster for hdp. Few servers have 4Tb diskspace and few servers have 2Tb diskspace. Using 14 nodes we got 44.8 TB diskspace for HDFS, In this Disk Usage (Non DFS Used)12.1 TB / 44.8 TB (27.04%). By this we are losing more amount of space with out keeping the data. I came to know that we can increase the DFS space to higher level by changing "Reserved space for HDFS" in Ambari config. Right now the value is "270415951872 bytes", what will be the value to get good amount of space. Is it necessary to keep 30% of space under Non DFS. Thanks in advance.

sysadmin · ‎06-22-2017

Thanks @Dongjoon Hyun

sysadmin · ‎06-19-2017

Hello, May I know in which section I need to add this parameter. spark.sql.broadcastTimeout Thanks

Online	Offline
Last Visited	‎08-09-2017 09:29 AM

Member Since	‎06-02-2017 01:12 PM
Last Visited	‎08-09-2017 09:29 AM
Posts	27
Kudos received	1

Cloudera Community

why hdfs dfs -df -h / output and df -h output are ...

Re: How to increase DFS space on existing cluster

Re: How to increase DFS space on existing cluster

How to increase DFS space on existing cluster

Re: how to increase spark.sql.broadcastTimeout ?

how to increase spark.sql.broadcastTimeout ?