Created 06-25-2018 02:17 PM
as the hdfs user I run the following:
$ hdfs dfs -du -h /
512.5 M  /app-logs
48.8 M   /apps
4.2 M    /ats
695.6 M  /hdp
0        /mapred
56.0 G   /datag
0        /history
5.3 M    /spark2-history
0        /system
0        /tmp
465.6 M  /user
so the total size seems to be around 60 G,
but from the Ambari dashboard
we see that DFS Used is 188 G.
So how can it be that in the Ambari dashboard we have 188 G used?
Created 06-25-2018 09:35 PM
@Michael Bronson can you show me the output of the command "hdfs dfsadmin -report"? Also of "hdfs dfs -df -h /"?
2 things to consider:
First, the replication factor must be accounted for (the actual space usage should show up in the outputs above).
And second, the best way to know the space consumed is to calculate the number of blocks used * block size.
What can happen is that if you have a large number of files that are smaller than the configured block size, you can be wasting some space. Imagine having a 128 MB block size and writing a huge number of 20 MB files to the DFS. These files would each be using only a fraction of a full block. This is the problem with small files.
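A minimal sketch of how to reconcile the "du" number with the raw DFS usage, assuming a default replication factor of 3 (the factor and the paths below are assumptions, so check dfs.replication and adjust):

# Logical size as reported by -du (replication not included)
LOGICAL_BYTES=$(hdfs dfs -du -s / | awk '{print $1}')

# Assumed replication factor - adjust to your dfs.replication value
REPL=3

echo "Logical size       : ${LOGICAL_BYTES} bytes"
echo "Expected raw usage : $((LOGICAL_BYTES * REPL)) bytes (logical x replication)"

# Raw usage as the NameNode (and Ambari) report it
hdfs dfsadmin -report | grep 'DFS Used:'

# The fsck summary shows total blocks and the average block replication,
# which also helps spot small-file overhead (many blocks far below the block size)
hdfs fsck / | tail -n 20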
Created 06-26-2018 05:38 AM
this is the report
[hdfs@master02 root]$ hdfs dfsadmin -report
Configured Capacity: 205428162560 (191.32 GB)
Present Capacity: 204711643765 (190.65 GB)
DFS Remaining: 2135187357 (1.99 GB)
DFS Used: 202576456408 (188.66 GB)
DFS Used%: 98.96%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (5):

Name: 10.164.47.218:50010 (worker01.sys764.com)
Hostname: worker01.sys764.com
Decommission Status : Normal
Configured Capacity: 41085632512 (38.26 GB)
DFS Used: 40449814998 (37.67 GB)
Non DFS Used: 0 (0 B)
DFS Remaining: 476863314 (454.77 MB)
DFS Used%: 98.45%
DFS Remaining%: 1.16%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Tue Jun 26 05:32:13 UTC 2018

Name: 10.164.48.3:50010 (worker05.sys764.com)
Hostname: worker05.sys764.com
Decommission Status : Normal
Configured Capacity: 41085632512 (38.26 GB)
DFS Used: 40386752982 (37.61 GB)
Non DFS Used: 0 (0 B)
DFS Remaining: 551342871 (525.80 MB)
DFS Used%: 98.30%
DFS Remaining%: 1.34%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Tue Jun 26 05:32:13 UTC 2018

Name: 10.164.47.217:50010 (worker02.sys764.com)
Hostname: worker02.sys764.com
Decommission Status : Normal
Configured Capacity: 41085632512 (38.26 GB)
DFS Used: 40588859180 (37.80 GB)
Non DFS Used: 0 (0 B)
DFS Remaining: 347038972 (330.96 MB)
DFS Used%: 98.79%
DFS Remaining%: 0.84%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Tue Jun 26 05:32:13 UTC 2018

Name: 10.164.47.215:50010 (worker03.sys764.com)
Hostname: worker03.sys764.com
Decommission Status : Normal
Configured Capacity: 41085632512 (38.26 GB)
DFS Used: 40671485952 (37.88 GB)
Non DFS Used: 0 (0 B)
DFS Remaining: 346956888 (330.88 MB)
DFS Used%: 98.99%
DFS Remaining%: 0.84%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Tue Jun 26 05:32:15 UTC 2018

Name: 10.164.47.223:50010 (worker04.sys764.com)
Hostname: worker04.sys764.com
Decommission Status : Normal
Configured Capacity: 41085632512 (38.26 GB)
DFS Used: 40479543296 (37.70 GB)
Non DFS Used: 0 (0 B)
DFS Remaining: 412985312 (393.85 MB)
DFS Used%: 98.52%
DFS Remaining%: 1.01%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 4
Last contact: Tue Jun 26 05:32:13 UTC 2018
Created 06-26-2018 02:34 AM
Have you recently deleted large contents from your HDFS without using the "-skipTrash" option with the "hdfs dfs -rm" command?
Also what do you see for the DFS / Non DFS usage when you make the following JMX call in the browser?
http://$ACTIVE_NameNode_HOST:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
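The same JMX query can also be made from the command line if that is easier (just a sketch; "namenode.example.com" is a placeholder for your active NameNode host):

# placeholder hostname - replace with your active NameNode
curl -s "http://namenode.example.com:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo" | python -m json.tool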
Have you checked the "~/.Trash" directories of your HDFS users? For example:
# hdfs dfs -du -s -h /user/admin/.Trash
.
Have you tried doing expunge? Like
# su - hdfs
# hdfs dfs -rm /user/admin/.Trash/*
# hdfs dfs -expunge
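If you want to check the trash of every user home directory at once, here is a small sketch (it assumes the home directories live under /user; errors for users without a .Trash are suppressed):

# loop over every home dir under /user and report its .Trash size, if any
for d in $(hdfs dfs -ls /user | awk 'NR>1 {print $NF}'); do
  hdfs dfs -du -s -h "$d/.Trash" 2>/dev/null
done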
.
Created 06-26-2018 05:33 AM
hi Jay
I see that the .Trash folder does not exist, is that part of the problem?
How do I create it?
dfs -rm /user/admin/.Trash/*
rm: `/user/admin/.Trash/*': No such file or directory
[hdfs@master02 root]$ hdfs dfs -ls /user
Found 7 items
drwxr-xr-x   - root      hdfs          0 2018-01-04 15:16 /user/admin
drwxr-xr-x   - airflow   hdfs          0 2017-09-07 02:12 /user/airflow
drwxrwx---   - ambari-qa hdfs          0 2017-08-14 09:19 /user/ambari-qa
drwxr-xr-x   - hcat      hdfs          0 2017-08-14 09:19 /user/hcat
drwxr-xr-x   - hdfs      hdfs          0 2018-02-15 13:11 /user/hdfs
drwxr-xr-x   - hive      hdfs          0 2017-11-03 00:01 /user/hive
drwxrwxr-x   - spark     hdfs          0 2017-08-14 09:20 /user/spark
[hdfs@master02 root]$ hdfs dfs -ls /user/admin   (the admin folder exists)
Created 06-26-2018 06:04 AM
the output from http://$ACTIVE_NameNode_HOST:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
<br>{ "beans" : [ { "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem", "Threads" : 132, "Total" : 205428162560, "UpgradeFinalized" : true, "ClusterId" : "CID-bc106817-c4d9-4b79-a9cf-5b1e37fa38c6", "Version" : "2.7.3.2.6.0.3-8, rc6befa0f1e911140cc815e0bab744a6517abddae", "Used" : 202576838656, "Free" : 2134805109, "Safemode" : "", "NonDfsUsedSpace" : 0, "PercentUsed" : 98.61201, "BlockPoolUsedSpace" : 202576838656, "PercentBlockPoolUsed" : 98.61201, "PercentRemaining" : 1.0391978, "CacheCapacity" : 0, "CacheUsed" : 0, "TotalBlocks" : 756250, "TotalFiles" : 32767, "NumberOfMissingBlocks" : 0, "NumberOfMissingBlocksWithReplicationFactorOne" : 0, "LiveNodes" : "{\"worker01.sys764.com:50010\":{\"infoAddr\":\"43.56.2.218:50075\",\"infoSecureAddr\":\"43.56.2.218:0\",\"xferaddr\":\"43.56.2.218:50010\",\"lastContact\":1,\"usedSpace\":40449945600,\"adminState\":\"In Service\",\"nonDfsUsedSpace\":0,\"capacity\":41085632512,\"numBlocks\":417031,\"version\":\"2.7.3.2.6.0.3-8\",\"used\":40449945600,\"remaining\":476732712,\"blockScheduled\":0,\"blockPoolUsed\":40449945600,\"blockPoolUsedPercent\":98.452774,\"volfails\":0},\"worker04.sys764.com:50010\":{\"infoAddr\":\"43.56.2.223:50075\",\"infoSecureAddr\":\"43.56.2.223:0\",\"xferaddr\":\"43.56.2.223:50010\",\"lastContact\":2,\"usedSpace\":40479584256,\"adminState\":\"In Service\",\"nonDfsUsedSpace\":0,\"capacity\":41085632512,\"numBlocks\":467826,\"version\":\"2.7.3.2.6.0.3-8\",\"used\":40479584256,\"remaining\":412944352,\"blockScheduled\":0,\"blockPoolUsed\":40479584256,\"blockPoolUsedPercent\":98.52492,\"volfails\":0},\"worker02.sys764.com:50010\":{\"infoAddr\":\"43.56.2.217:50075\",\"infoSecureAddr\":\"43.56.2.217:0\",\"xferaddr\":\"43.56.2.217:50010\",\"lastContact\":2,\"usedSpace\":40588972032,\"adminState\":\"In Service\",\"nonDfsUsedSpace\":0,\"capacity\":41085632512,\"numBlocks\":457048,\"version\":\"2.7.3.2.6.0.3-8\",\"used\":40588972032,\"remaining\":346926120,\"blockScheduled\":0,\"blockPoolUsed\":40588972032,\"blockPoolUsedPercent\":98.79116,\"volfails\":0},\"worker03.sys764.com:50010\":{\"infoAddr\":\"43.56.2.215:50075\",\"infoSecureAddr\":\"43.56.2.215:0\",\"xferaddr\":\"43.56.2.215:50010\",\"lastContact\":0,\"usedSpace\":40671485952,\"adminState\":\"In Service\",\"nonDfsUsedSpace\":0,\"capacity\":41085632512,\"numBlocks\":473548,\"version\":\"2.7.3.2.6.0.3-8\",\"used\":40671485952,\"remaining\":346956888,\"blockScheduled\":0,\"blockPoolUsed\":40671485952,\"blockPoolUsedPercent\":98.992,\"volfails\":0},\"worker05.sys764.com:50010\":{\"infoAddr\":\"10.164.48.3:50075\",\"infoSecureAddr\":\"10.164.48.3:0\",\"xferaddr\":\"10.164.48.3:50010\",\"lastContact\":2,\"usedSpace\":40386850816,\"adminState\":\"In Service\",\"nonDfsUsedSpace\":0,\"capacity\":41085632512,\"numBlocks\":453297,\"version\":\"2.7.3.2.6.0.3-8\",\"used\":40386850816,\"remaining\":551245037,\"blockScheduled\":0,\"blockPoolUsed\":40386850816,\"blockPoolUsedPercent\":98.29921,\"volfails\":0}}", "DeadNodes" : "{}", "DecomNodes" : "{}", "BlockPoolId" : "BP-1686071471-43.56.2.214-1502702329154", "NameDirStatuses" : "{\"active\":{\"/data/var/hadoop/hdfs/namenode\":\"IMAGE_AND_EDITS\"},\"failed\":{}}", "NodeUsage" : "{\"nodeUsage\":{\"min\":\"98.30%\",\"median\":\"98.52%\",\"max\":\"98.99%\",\"stdDev\":\"0.25%\"}}", "NameJournalStatus" : "[{\"manager\":\"QJM to [43.56.2.214:8485, 10.164.52.237:8485, 43.56.2.216:8485]\",\"stream\":\"open for read\",\"disabled\":\"false\",\"required\":\"true\"}]", 
"JournalTransactionInfo" : "{\"MostRecentCheckpointTxId\":\"81912212\",\"LastAppliedOrWrittenTxId\":\"81943698\"}", "NNStarted" : "Tue Jun 12 12:27:18 UTC 2018", "CompileInfo" : "2017-04-01T21:32Z by jenkins from (HEAD detached at c6befa0)", "CorruptFiles" : "[]", "DistinctVersionCount" : 1, "DistinctVersions" : [ { "key" : "2.7.3.2.6.0.3-8", "value" : 5 } ], "SoftwareVersion" : "2.7.3.2.6.0.3-8", "RollingUpgradeStatus" : null } ] }
Created 06-26-2018 06:11 AM
I was going through your "capture.png" image and noticed that
Non DFS Used = 0% (which is strange) and looks similar to AMBARI-22625.
The Non DFS usage that you see as 0% actually looks like the bug reported here: https://issues.apache.org/jira/browse/AMBARI-22625
Apart from the Non DFS usage, the other data seems to match what the NameNode JMX and the screenshot return, for example:
Total: 205428162560 (205.42 GB)
Used: 202576838656 (202.57 GB)
PercentUsed: 98.61201 (98%)
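For reference, the "GB" figures differ only in the unit applied to the same byte count: the value above is quoted in decimal GB, while "hdfs dfsadmin -report" converts with 1024-based GiB (which it labels "GB"). A quick check with bc:

echo "202576838656 / 1000^3" | bc -l    # ~202.58  (decimal GB, as quoted above)
echo "202576838656 / 1024^3" | bc -l    # ~188.66  (1024-based GiB, what dfsadmin -report prints as "188.66 GB")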
Created 06-26-2018 06:13 AM
@Jay for now HDFS is at 99% used, what can we do to decrease it?
Created 06-26-2018 06:18 AM
If your HDFS is almost 99% full, then it is better to try this:
1. Find the top 10 HDFS directories using the script shown here, to see how much space each HDFS directory is consuming (a quick one-liner alternative is also sketched below, after the example):
https://github.com/crazyadmins/useful-scripts/tree/master/hdfs
https://github.com/crazyadmins/useful-scripts/blob/master/hdfs/top_10_dir.sh
2. Then delete some unwanted directory contents from the output listed by the above commands. While removing the HDFS contents, please use "-skipTrash" this time to make sure that the contents are not moved to the trash but deleted permanently.
Example:
# hdfs dfs -rmr -skipTrash /unwanted/files
.
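As a quick alternative to the script (a sketch only; the sizes shown are logical, before replication), you can list the largest first-level directories directly and rerun the same command on the biggest one to drill down:

# byte sizes of first-level directories, largest first
hdfs dfs -du / | sort -n -r | head -n 10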
Created 06-26-2018 06:14 AM
we created the .Trash folder as follows (as the hdfs user):
hdfs dfs -mkdir /user/admin/.Trash
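(As a side note, and an assumption on my part rather than something confirmed in this thread: the .Trash folder is normally created automatically once trash is enabled via fs.trash.interval and a user deletes something; if it is created by hand, the user who will be deleting files should own it, for example:)

# "admin" is just the example user from the listing above - adjust owner/group as needed
hdfs dfs -mkdir -p /user/admin/.Trash
hdfs dfs -chown admin:hdfs /user/admin/.Trash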
Created 06-26-2018 07:31 AM
this is all we see from the script:
Please wait while we calculate size to determine top 10 directories on HDFS
| Dir_on_HDFS                                                | Size_in_MB | User | Group  | Last_modified Time |
| ---------------------------------------------------------- | ---------- | ---- | ------ | ------------------ |
| /apps/hive/warehouse                                       | 0          | hive | hadoop | 2018-05-31 09:04   |
| /spark2-history/application_1527757840137_0005.inprogress | 0          | hive | hadoop | 2018-06-18 14:49   |
| /spark2-history/application_1527757840137_0006.inprogress | 0          | hive | hadoop | 2018-06-18 14:49   |
| /spark2-history/application_1527757840137_0003.inprogress | 0          | hive | hadoop | 2018-06-18 15:47   |
| /user/hdfs/.hiveJars                                       | 21         | hdfs | hdfs   | 2018-06-06 08:10   |
| /hdp/apps/2.6.4.0-91                                       | 710        | hdfs | hdfs   | 2018-05-31 09:04   |
Created 06-26-2018 07:32 AM
and I don't think we can delete /hdp/apps/2.6.4.0-91
Created 06-26-2018 08:30 AM
We do not need to delete the "/hdp/apps/2.6.4.0-91" directory, as based on the output shared it is only consuming around 710 MB, which is normal and OK.
If you see some directories growing unexpectedly and containing unwanted data, then we can clear/delete them (not the useful ones). Otherwise the other option will be to increase the HDFS disk space.
Created 06-26-2018 09:32 AM
@Jay, what is the procedure to increase the HDFS disk space?