
More files and dirs than blocks

Contributor

Hello everyone,

From what I understood, every file uses a minimum of 1 block in HDFS.

So how can it be possible to have more files than blocks?

Below is an example:

$ hdfs fsck /app-logs
 Total size:    14663835874 B
 Total dirs:    2730
 Total files:   8694
 Total symlinks:                0
 Total blocks (validated):      8690 (avg. block size 1687437 B)
 Minimally replicated blocks:   8690 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          4
 Number of racks:               1
FSCK ended at Mon Sep 04 19:54:45 CEST 2017 in 353 milliseconds

The filesystem under path '/app-logs' is HEALTHY


I would appreciate any help in understanding this output.

Regards

1 ACCEPTED SOLUTION

Expert Contributor

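"From what I understood every file uses a minimum of 1 block in HDFS."

No, that is not true. A file can have size 0: it exists as an entry on the NameNode (NN), but no block has been allocated or streamed to the DataNodes (DNs) yet. Check your files and you should find at least 4 files (8694 - 8690 = 4) of size 0. It is exactly 4 if every non-empty file fits in a single block, which looks likely here given the average block size of about 1.7 MB.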
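To make the arithmetic concrete, here is a rough Python sketch (not anything shipped with Hadoop). `blocks_for_file` shows why a zero-length file contributes no blocks, and `empty_files` picks zero-length files out of `hdfs dfs -ls -R` output. The function names, the 128 MiB block size, and the assumed column layout (permissions, replication, owner, group, size, date, time, path) are my own assumptions, so check them against your cluster.

```python
# Assumption: dfs.blocksize is 128 MiB; adjust to your cluster's setting.
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024


def blocks_for_file(size_bytes, block_size=DEFAULT_BLOCK_SIZE):
    """Number of blocks needed to hold a file of the given length."""
    if size_bytes < 0:
        raise ValueError("file size cannot be negative")
    # Integer ceiling division: a zero-length file needs 0 blocks.
    return (size_bytes + block_size - 1) // block_size


def empty_files(ls_lines):
    """Return paths of zero-length files from `hdfs dfs -ls -R` style lines.

    Assumed columns: permissions, replication, owner, group, size,
    date, time, path.
    """
    paths = []
    for line in ls_lines:
        fields = line.rstrip("\r\n").split(None, 7)
        # Skip short lines (e.g. "Found N items") and directories.
        if len(fields) < 8 or fields[0].startswith("d"):
            continue
        if fields[4] == "0":
            paths.append(fields[7])
    return paths


# Hypothetical sizes: two empty files and two small ones.
sizes = [0, 0, 10, 20]
print(len(sizes), sum(blocks_for_file(s) for s in sizes))  # prints: 4 2
```

To use `empty_files` on real data, you could save the output of `hdfs dfs -ls -R /app-logs` to a file and pass its lines to the function. The paths it returns should account for the gap between "Total files" and "Total blocks".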


2 REPLIES 2


Contributor

Thank you @Xiaoyu Yao.