Expert Contributor
Posts: 244
Registered: ‎01-25-2017

HDFS storage keeps growing

Hi All,

 

My HDFS usage just keeps growing. When I compared hdfs dfs -du with hdfs dfsadmin -report, the two showed different results. I suspected HDFS snapshots, but I checked every snapshottable directory and its size, and everything looks fine. I also made sure there are no old snapshots left that are no longer being updated.

 

Any idea what else could cause this?

 

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/server_psanalytics
7.0 T 21.0 T /liveperson/data/server_psanalytics
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/server_psanalytics/.snapshot/s0
7.0 T 21.0 T /liveperson/data/server_psanalytics/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/server_psanalytics
21.5 T 64.6 T /liveperson/data/remote/DC=VA/server_psanalytics
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/server_psanalytics/.snapshot/s0
21.5 T 64.6 T /liveperson/data/remote/DC=VA/server_psanalytics/.snapshot/s0


[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/server_dataaccess_le/finalDir
1.5 T 4.4 T /liveperson/data/remote/DC=VA/server_dataaccess_le/finalDir
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/server_dataaccess_le/finalDir/.snapshot/s0
1.5 T 4.4 T /liveperson/data/remote/DC=VA/server_dataaccess_le/finalDir/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/server_live-engage-mr/output
5.2 T 15.5 T /liveperson/data/remote/DC=VA/server_live-engage-mr/output
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/server_live-engage-mr/output/.snapshot/s0
5.2 T 15.5 T /liveperson/data/remote/DC=VA/server_live-engage-mr/output/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dwh_le
105.7 T 315.9 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dwh_le
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dwh_le/.snapshot/s0
105.7 T 315.9 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dwh_le/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/AgentSession
308.1 G 924.3 G /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/AgentSession
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/AgentSession/.snapshot/s0
308.1 G 924.3 G /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/AgentSession/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/DataViews
46.5 T 139.4 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/DataViews
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/DataViews/.snapshot/s0
46.5 T 139.4 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/DataViews/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/EngagementSession
10.5 T 31.4 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/EngagementSession
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/EngagementSession/.snapshot/s0
10.5 T 31.4 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/EngagementSession/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/SurveySession
497.3 G 1.5 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/SurveySession
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/SurveySession/.snapshot/s0
497.3 G 1.5 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/SurveySession/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/VisitorSession
41.8 T 125.4 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/VisitorSession
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/VisitorSession/.snapshot/s0
41.8 T 125.4 T /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dallas/output/VisitorSession/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/server_live-engage-mr/output
169.6 G 508.8 G /liveperson/data/server_live-engage-mr/output
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/server_live-engage-mr/output/.snapshot/s0
169.6 G 508.7 G /liveperson/data/server_live-engage-mr/output/.snapshot/s0

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/storage_Shared/data_Platform/dwh_le
426.4 G 1.2 T /liveperson/data/storage_Shared/data_Platform/dwh_le
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/storage_Shared/data_Platform/dwh_le/.snapshot/daily
426.4 G 1.2 T /liveperson/data/storage_Shared/data_Platform/dwh_le/.snapshot/daily
[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /liveperson/data/storage_Shared/data_Platform/dwh_le/.snapshot/hourly
426.4 G 1.2 T /liveperson/data/storage_Shared/data_Platform/dwh_le/.snapshot/hourly

=================================================

[cloudera-scm@roor-chc101 root]$ hdfs dfs -du -h -s /
254.1 T 760.8 T /

[cloudera-scm@roor-chc101 root]$ hdfs dfsadmin -report
Configured Capacity: 1593782659690496 (1.42 PB)
Present Capacity: 1592652339410792 (1.41 PB)
DFS Remaining: 590706354708072 (537.24 TB)
DFS Used: 1001945984702720 (911.26 TB)
DFS Used%: 62.91%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
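
To put a number on the mismatch, the two outputs above can be compared directly. This is only a rough sketch: the figures are copied from the du and dfsadmin output in this post, and it assumes the "T"/"TB" suffixes are binary units (1024^4 bytes), which agrees with the byte count and the 911.26 TB shown for DFS Used.

```python
# Figures copied from the command output above.
TIB = 2 ** 40  # assumption: "T"/"TB" means 1024^4 bytes

dfs_used_bytes = 1001945984702720  # "DFS Used" from hdfs dfsadmin -report
du_logical_tib = 254.1             # first column of hdfs dfs -du -h -s /
du_replicated_tib = 760.8          # second column (size including replicas)

dfs_used_tib = dfs_used_bytes / TIB
gap_tib = dfs_used_tib - du_replicated_tib
effective_replication = du_replicated_tib / du_logical_tib

print(f"DFS Used (dfsadmin):        {dfs_used_tib:.2f} T")
print(f"du / including replicas:    {du_replicated_tib:.2f} T")
print(f"Space not explained by du:  {gap_tib:.2f} T")
print(f"Effective replication (du): {effective_replication:.3f}")
```

So roughly 150 T of raw space is in use on the DataNodes without showing up in the du total for /.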

 

 

 

 


====================================================

 


Re: HDFS storage keeps growing

Has anyone faced the same issue, or does anyone have an idea what could cause this?


Re: HDFS storage keeps growing

I deleted all the snapshots and disallowed snapshots on the snapshottable directories, but that didn't resolve the issue.

 

Here is the hdfs fsck output. It doesn't show any over-replicated blocks either:

 

Total size: 326946225713928 B (Total open files size: 939569189 B)
Total dirs: 1926860
Total files: 12972778
Total symlinks: 0 (Files currently being written: 50)
Total blocks (validated): 13863562 (avg. block size 23583132 B) (Total open file blocks (not validated): 18)
Minimally replicated blocks: 13863562 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.9942603
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 45
Number of racks: 1
FSCK ended at Thu Aug 17 04:29:08 EDT 2017 in 370947 milliseconds
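
As a cross-check, the fsck total size multiplied by the average block replication gives an estimate of the raw space that the files fsck knows about should occupy. A rough sketch, with the numbers copied from the fsck and dfsadmin output in this thread; note that the two commands were run at different times on a cluster that is still growing, so the comparison is only indicative.

```python
# Figures copied from the fsck and dfsadmin output in this thread.
TIB = 2 ** 40  # assumption: binary units, as in the dfsadmin report

fsck_total_bytes = 326946225713928  # "Total size" from hdfs fsck
avg_replication = 2.9942603         # "Average block replication"
dfs_used_bytes = 1001945984702720   # "DFS Used" from hdfs dfsadmin -report

fsck_logical_tib = fsck_total_bytes / TIB
fsck_raw_tib = fsck_logical_tib * avg_replication
unexplained_tib = dfs_used_bytes / TIB - fsck_raw_tib

print(f"fsck logical size:         {fsck_logical_tib:.2f} T")
print(f"fsck size x replication:   {fsck_raw_tib:.2f} T")
print(f"DFS Used minus fsck total: {unexplained_tib:.2f} T")
```

The fsck logical size comes out around 297 T, which is noticeably more than the 254.1 T that hdfs dfs -du -h -s / reported, and the replicated estimate lands much closer to DFS Used than the du figure does.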

 

 


Re: HDFS storage keeps growing

I figured out the issue.

 

The difference comes from /tmp/logs.

 

It's strange that hdfs dfs -du -h -s / does not account for /tmp/logs.
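
For anyone chasing a similar gap, one way to spot the heavy directory is to run hdfs dfs -du on a parent directory (without -h and -s, so each child is printed in bytes) and sort the children by the replicated size. Below is a minimal sketch of that idea. The parse_du helper is hypothetical, the sample lines are made-up illustrative numbers and not output from this cluster, and it assumes each line has the form "size size-with-replicas path" as in the du output earlier in the thread.

```python
def parse_du(text):
    """Parse plain `hdfs dfs -du <dir>` output (no -h) into sorted rows.

    Assumes each line is '<size> <size_with_replicas> <path>'.
    Returns (size_with_replicas, size, path) tuples, largest first.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Skip anything that is not two byte counts followed by a path.
        if len(parts) != 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        rows.append((int(parts[1]), int(parts[0]), parts[2]))
    return sorted(rows, reverse=True)


# Made-up sample input, for illustration only.
sample = """\
1099511627776 3298534883328 /liveperson
54975581388800 164926744166400 /tmp
1048576 3145728 /user
"""

for raw, logical, path in parse_du(sample):
    print(f"{raw / 2**40:10.2f} T raw  {logical / 2**40:10.2f} T logical  {path}")
```

In practice the text would come from something like the output of hdfs dfs -du / saved to a file, repeated one level deeper on whichever directory turns out to be the largest.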
