Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

After -setrep from 3 to 2, over-replicated blocks are not being freed even after 24 hours

avatar
Contributor

CDH 5.13.1
Redhat 6.9

We wish to change the replication factor from the default of 3 copies to 2 on one particular folder in HDFS.

After running this on one cluster:

$ hdfs dfs -setrep -R 2 /backups

and then doing a

$ hdfs dfs -du /

we saw that the excess replicas were freed very quickly, and the output of fsck shows no over-replicated blocks:

Status: HEALTHY
 Total size:    149514016589 B
 Total dirs:    27440
 Total files:    128746
 Total symlinks:        0
 Total blocks (validated):    126355 (avg. block size 1183285 B)
 Minimally replicated blocks:    126355 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:    0 (0.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    3
 Average block replication:    2.3367577
 Corrupt blocks:        0
 Missing replicas:        0 (0.0 %)
 Number of data-nodes:        3
 Number of racks:        1
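For reference, -setrep also accepts a -w flag that blocks until the replication change has completed, e.g.:

$ hdfs dfs -setrep -R -w 2 /backups

though on a large tree this can wait for a long time.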


However, on a bigger test system we ran the same command, and even a day later there was still no change.

$ hdfs fsck /

still shows over-replicated blocks:

Status: HEALTHY
 Total size:    56614841380 B
 Total dirs:    7222
 Total files:    113731
 Total symlinks:        0
 Total blocks (validated):    110143 (avg. block size 514012 B)
 Minimally replicated blocks:    110143 (100.0 %)
 Over-replicated blocks:    37439 (33.991264 %)
 Under-replicated blocks:    0 (0.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    3
 Average block replication:    2.9921465
 Corrupt blocks:        0
 Missing replicas:        0 (0.0 %)
 Number of data-nodes:        8
 Number of racks:        1

The number of over-replicated blocks decreased slightly but now seems stuck at 37439.

I've manually restarted each datanode service, and later restarted the entire cluster.

Still stuck at 37439.
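In case it helps, rather than re-reading the full fsck report each time, I've been watching just the summary line from the command line (a rough sketch; the grep pattern matches the fsck summary shown above):

$ hdfs fsck / 2>/dev/null | grep 'Over-replicated blocks'
$ # or poll every 5 minutes:
$ while true; do hdfs fsck / 2>/dev/null | grep 'Over-replicated blocks'; sleep 300; done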

I found this comment from Harsh J:

|Then monitor the over-replicated blocks in Cloudera Manager via the below chart tsquery:
|
|SELECT excess_blocks WHERE roleType = NAMENODE
|
|This should show a spike and then begin a slow but steady drop back to zero over time, which you can monitor.


but when I run this query, it reports that excess_blocks is 0.


$ hdfs dfs -du /
22987202359  69376013863  /backups


shows the raw space consumed (second column) is still roughly three times the logical size, i.e. 3 copies.
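(The second column of -du is the raw space consumed across all replicas, so dividing it by the first column, the logical size, gives the effective replication factor. A quick sanity check on the numbers above:)

```shell
# effective replication factor = raw space consumed / logical size
echo "22987202359 69376013863 /backups" |
  awk '{ printf "%.2f\n", $2 / $1 }'
# → 3.02, i.e. still roughly 3 replicas
```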

How can we get this space cleared?

Running the balancer did nothing, either.

thanks.

1 ACCEPTED SOLUTION

avatar
Mentor
Do you perchance have any snapshots held from before the 'hdfs dfs -setrep 2' command was executed, under the target path (/backups)?

If you do have a snapshot and the over-replicated count is still stuck, this behaviour is expected: the replication factor is a per-file attribute, and the older snapshot still references the higher replication factor, preventing deletion of the now-excess blocks.

You can run the below to discover existing snapshots, as the 'hdfs' user (or equivalent superuser):

~> hdfs lsSnapshottableDir
~> # For every directory printed above as $DIR:
~> hdfs dfs -ls $DIR/.snapshot/


5 REPLIES 5

avatar
Mentor
Do you perchance have any snapshots held from before the 'hdfs dfs -setrep 2' command was executed, under the target path (/backups)?

If you do have a snapshot and the over-replicated count is still stuck, this behaviour is expected: the replication factor is a per-file attribute, and the older snapshot still references the higher replication factor, preventing deletion of the now-excess blocks.

You can run the below to discover existing snapshots, as the 'hdfs' user (or equivalent superuser):

~> hdfs lsSnapshottableDir
~> # For every directory printed above as $DIR:
~> hdfs dfs -ls $DIR/.snapshot/
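If it helps, the two steps above can be combined into one loop (a sketch; it assumes the directory path is the last field of each lsSnapshottableDir output line):

~> for DIR in $(hdfs lsSnapshottableDir | awk '{print $NF}'); do
~>   hdfs dfs -ls "$DIR/.snapshot/"
~> done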

avatar
Contributor

Yes! There was a snapshot.

Thank you!

avatar
Explorer

@Harsh J

@ScottChris

Could you kindly explain this in a bit more detail?

 

You can run the below to discover existing snapshots, as the 'hdfs' user (or equivalent superuser):

~> hdfs lsSnapshottableDir
~> # For every directory printed above as $DIR:
~> hdfs dfs -ls $DIR/.snapshot/

 

After this step, what do we need to do? Do we need to delete the older snapshot (which was created when the replication factor was 3) and create a new snapshot now that it is 2?

Thanks in advance.

avatar
Mentor
> Do we need to delete older snapshot (which was created when rep is 3) and
> create a new snapshot at this time when rep is 2.

Yes, that is correct.
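The relevant commands are below (substitute your own snapshot name for 's0', which is just a placeholder here):

~> hdfs dfs -deleteSnapshot /backups s0
~> hdfs dfs -createSnapshot /backups

With no name argument, -createSnapshot generates one from the current timestamp. Once the old snapshot is gone, the NameNode should begin deleting the excess replicas.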

avatar
Explorer

@Harsh J

 

Thanks for the reply.