More files and dirs than blocks

Contributor

Hello everyone,

From what I understood, every file uses a minimum of 1 block in HDFS.

So how can it be possible to have more files than blocks?

Below is an example:

$ hdfs fsck /app-logs
 Total size:    14663835874 B
 Total dirs:    2730
 Total files:   8694
 Total symlinks:                0
 Total blocks (validated):      8690 (avg. block size 1687437 B)
 Minimally replicated blocks:   8690 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          4
 Number of racks:               1
FSCK ended at Mon Sep 04 19:54:45 CEST 2017 in 353 milliseconds

The filesystem under path '/app-logs' is HEALTHY


I appreciate any help in understanding this output.

Regards

1 ACCEPTED SOLUTION

Expert Contributor

"From what I understood, every file uses a minimum of 1 block in HDFS."

No, that is not true. A file can have size 0: it is created on the NameNode (NN), but no block has been allocated or streamed to the DataNodes (DNs) yet. Check your files and you should find at least 4 files (8694 - 8690 = 4) of size 0, and exactly 4 if every non-empty file fits in a single block.
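To check this on your own cluster, one option is to filter a recursive listing for regular files whose size column is 0. This is only a rough sketch: the function name zero_length_files is made up for illustration, and it assumes the usual eight-column layout of hdfs dfs -ls -R (permissions, replication, owner, group, size, date, time, path) and paths without spaces.

```shell
# Hypothetical helper: reads `hdfs dfs -ls -R` output on stdin and prints
# the path of every regular file whose size is 0 bytes.
# Assumed columns: permissions replication owner group size date time path.
# Paths containing spaces are not handled (only the 8th field is printed).
zero_length_files() {
  # Skip directories (permission string starts with "d") and any line
  # that does not have at least 8 fields, such as "Found N items".
  awk '$1 !~ /^d/ && NF >= 8 && $5 == 0 { print $8 }'
}

# Usage against a live cluster (assumes the hdfs CLI is on PATH):
#   hdfs dfs -ls -R /app-logs | zero_length_files
#   hdfs dfs -ls -R /app-logs | zero_length_files | wc -l
```

If the count matches the difference between "Total files" and "Total blocks", the empty files explain the gap. Running hdfs fsck /app-logs -files -blocks should also report a per-file block count, which is another way to spot files with no blocks.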

2 REPLIES

Contributor

Thank you @Xiaoyu Yao.