Member since: 06-02-2017
Posts: 27
Kudos Received: 1
Solutions: 0
07-19-2017
05:33 PM
I am trying to figure out the difference between df -h in Linux and hdfs dfs -df -h / output. The dfs output shows 44 TB across the whole cluster, but the root partition does not hold that data; it always shows very little usage on each data node.
hdfs@hdfs-xxxxx-xxxxxxx:~$ hdfs dfs -df -h /
Filesystem Size Used Available Use%
hdfs://hdfs-xxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:8020 44.8 T 20.6 T 6.8 T 46%
hdfs@hdfs-xxxxxx-xxxxxx:~$
whereas df -h shows:
hdfs@hdfs-XXXXX-XXXX:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 32G 12K 32G 1% /dev
tmpfs 6.3G 320K 6.3G 1% /run
/dev/sda1 2.0T 8.8G 1.9T 1% /
XXXXXXXXXXXXXXXXXXXXXXX 2.0T 1.8T 142G 93% /hadoop
hdfs@hdfs-XXXXX-XXXXX:~$
hdfs dfsadmin -report output (excerpt for one DataNode):
==================
Decommission Status : Normal
Configured Capacity: 3785823326208 (3.44 TB)
DFS Used: 1893513980859 (1.72 TB)
Non DFS Used: 1370250273861 (1.25 TB)
DFS Remaining: 302107447808 (281.36 GB)
DFS Used%: 50.02%
DFS Remaining%: 7.98%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
May I know where the utilization is going and why it is not showing on the / partition?
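For what it's worth, a minimal way to trace this on one data node, assuming the DataNode block directory lives under the /hadoop mount shown above (the exact path /hadoop/hdfs/data is an assumption):

hdfs dfsadmin -report | head -n 25       # cluster totals: DFS Used vs Non DFS Used
df -h / /hadoop                          # local view: / stays small, /hadoop fills up
du -sh /hadoop/hdfs/data 2>/dev/null     # actual HDFS block files on this node

HDFS block data lives under dfs.datanode.data.dir (here apparently the /hadoop mount, at 93%), not under /, so hdfs dfs -df -h / aggregates the data-dir mounts of every node while / on each node stays nearly empty.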
Labels:
- Apache Hadoop
- Apache Spark
07-19-2017
07:12 AM
@dsun I already went through the URL above and learned that we have configured 270 GB for DFS reserved space, but non-DFS usage is taking terabytes on some of the servers. After analyzing a server we realized that most of the space is going to MapReduce jobs. Is there a good tool or process to analyze this further?
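To drill into that further, a minimal sketch assuming typical HDP paths (/hadoop as the data mount, /hadoop/yarn/local and /hadoop/yarn/log for the NodeManager; all of these paths are assumptions):

du -h --max-depth=2 /hadoop 2>/dev/null | sort -hr | head -n 20    # largest directories on the data mount
du -sh /hadoop/yarn/local /hadoop/yarn/log 2>/dev/null             # YARN container data and MapReduce shuffle spill

Intermediate MapReduce output accumulates under the NodeManager's local dirs, so those are usually the first place to look when non-DFS usage climbs.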
07-18-2017
12:53 PM
@dsun Thanks for your suggestions. I tried to find which directories are taking large amounts of space; the logs are not taking much, yet non-DFS usage is still very high, in terabytes. Thanks
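As a cross-check, non-DFS usage on a mount is roughly its total used space minus what the HDFS block directory holds; assuming /hadoop/hdfs/data is the block directory (the path is an assumption):

df -h /hadoop                            # total used on the data mount
du -sh /hadoop/hdfs/data 2>/dev/null     # space held by HDFS block files alone
# the gap between the two used figures is the non-DFS usage on this mount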
07-17-2017
10:50 AM
Hello, we have a 14-node HDP cluster. A few servers have 4 TB of disk space and a few have 2 TB. Across the 14 nodes we get 44.8 TB of disk space for HDFS, of which Disk Usage (Non DFS Used) is 12.1 TB / 44.8 TB (27.04%). We are losing a large amount of space without storing any data in it. I learned that we can increase the DFS space by changing "Reserved space for HDFS" in the Ambari config. Right now the value is "270415951872 bytes"; what value would give us a good amount of usable space? Is it necessary to keep 30% of the space as non-DFS? Thanks in advance.
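For reference, the Ambari field "Reserved space for HDFS" maps to the dfs.datanode.du.reserved property, and it is reserved per disk volume, so a node with several data directories reserves a multiple of the configured 270 GB. A quick way to confirm the effective value on a node:

hdfs getconf -confKey dfs.datanode.du.reserved    # should print 270415951872 here
# lowering it (say, to ~50 GB per volume) hands more of each disk to DFS, at the
# cost of headroom for logs, shuffle spill, and other non-DFS files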
Labels:
- Apache Ambari
- Apache Hadoop
06-22-2017
06:18 AM
Thanks @Dongjoon Hyun
06-19-2017
02:27 PM
1 Kudo
Hello, may I know in which section I need to add this parameter: spark.sql.broadcastTimeout? Thanks
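Not authoritative, but one common way to supply it outside Ambari is at submit time (the job script name is illustrative); within Ambari it would typically go into the Spark service's custom spark-defaults configuration:

spark-submit --conf spark.sql.broadcastTimeout=600 my_job.py    # timeout in seconds for broadcast joins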
Labels:
- Apache Ambari
- Apache Spark