Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

More files and dirs than blocks

New Member

Hello everyone,

From what I understand, every file uses at least one block in HDFS.

So how is it possible to have more files than blocks?

Below is an example:

$ hdfs fsck /app-logs
 Total size:    14663835874 B
 Total dirs:    2730
 Total files:   8694
 Total symlinks:                0
 Total blocks (validated):      8690 (avg. block size 1687437 B)
 Minimally replicated blocks:   8690 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          4
 Number of racks:               1
FSCK ended at Mon Sep 04 19:54:45 CEST 2017 in 353 milliseconds

The filesystem under path '/app-logs' is HEALTHY


I would appreciate any help in understanding this output.

Regards

1 ACCEPTED SOLUTION

Expert Contributor

"From what I understand, every file uses at least one block in HDFS."

No, that is not true. A file of size 0 has no blocks: it exists as an entry on the NameNode (NN), but no block has been allocated or streamed to the DataNodes (DNs) yet. Check your files and you should find at least 4 files (8694 - 8690 = 4) of size 0, and exactly 4 if no file spans more than one block.
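
One way to check this is to filter a recursive listing for regular files whose size column is 0. The sketch below assumes the usual `hdfs dfs -ls -R` column layout (permissions, replication, owner, group, size, date, time, path); `list_empty_files` is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical helper: reads `hdfs dfs -ls -R` output on stdin and prints
# the path of every regular file (permissions start with "-") of size 0.
# Assumes the size is in column 5 and the path starts at column 8.
list_empty_files() {
  awk '$1 ~ /^-/ && $5 == 0 {
         path = $8
         for (i = 9; i <= NF; i++) path = path " " $i   # keep paths containing spaces
         print path
       }'
}

# Usage (run against a live cluster):
#   hdfs dfs -ls -R /app-logs | list_empty_files
#   hdfs dfs -ls -R /app-logs | list_empty_files | wc -l
```

If the count printed by the second command matches the difference between "Total files" and "Total blocks", the empty files account for the gap. Running `hdfs fsck /app-logs -files -blocks` should also show the block count per file, which is another way to spot files with no blocks.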

New Member

Thank you @Xiaoyu Yao.