One datanode nearly full but not the others

Hi all,

I have 6 DataNodes on my Hortonworks cluster (HDP 2.6.3), and one of them is 91% full. The others are "only" 65% full.

I don't understand why the replication is not homogeneous across the nodes, or how I can fix it.


I checked the local file system, and the same difference shows up there:

On a safe node:

# pwd
# du -h . --summarize
1.9T    .

On the unsafe node:

# pwd
# du -h . --summarize
2.7T    .

The same is true for each of the DataNode directories.

Thanks for your help.




Not sure how you got into this shape, but the balancer can fix it.
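
For reference, a rough sketch of what that looks like. The balancer's -threshold is the allowed deviation, in percentage points, of each DataNode's utilization from the cluster's overall utilization, and it defaults to 10. The arithmetic below uses the figures from the question and assumes all six DataNodes have the same capacity (the thread does not say so); the helper name over_threshold is made up for illustration.

```shell
# Utilization figures from the question: one DataNode at 91%, five at 65%.
USED="91 65 65 65 65 65"
THRESHOLD=10   # balancer default, in percentage points

# Hypothetical helper: count the nodes whose usage deviates from the
# average by more than the threshold (assumes equal-capacity nodes).
over_threshold() {
  echo "$1" | tr ' ' '\n' | awk -v t="$2" '
    { u[NR] = $1; sum += $1 }
    END {
      avg = sum / NR
      for (i = 1; i <= NR; i++) {
        d = u[i] - avg
        if (d < 0) d = -d
        if (d > t) n++
      }
      print n + 0
    }'
}

over_threshold "$USED" "$THRESHOLD"   # prints 1: only the 91% node is out of range

# On the cluster itself, as the HDFS superuser (commonly the "hdfs" user):
#   hdfs dfsadmin -report          # per-DataNode "DFS Used%"
#   hdfs balancer -threshold 10    # move blocks until nodes are within 10 points
```

With these numbers the average is about 69%, so the full node is roughly 22 points above it and the balancer would move blocks off it, while the other five are already within range.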

Thanks @Lester Martin

I'll keep the balancer admin command in mind.

I solved the issue simply by removing a very large file created by a data scientist running a very large query on Hive. The temporary files located under /tmp/hive/[user] seem not to be replicated (I'm not sure about that).
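
One way to check that guess, as a sketch rather than a diagnosis: list what the Hive scratch directory holds and ask fsck where the block replicas live. The SCRATCH_DIR variable is only for illustration, and the commands need to run as a user allowed to read that directory. Separately, when the writing process runs on a DataNode, HDFS by default places the first replica of each block on that same node, so a single large write from one node could skew usage even with normal replication. Whether that happened here is an assumption.

```shell
SCRATCH_DIR=/tmp/hive   # parent of the per-user scratch directories mentioned above

# Only run the checks where the hdfs client is installed
if command -v hdfs >/dev/null 2>&1; then
  # Space used by each user's scratch directory
  hdfs dfs -du -h "$SCRATCH_DIR"
  # Files, their blocks, and the DataNodes holding each replica
  hdfs fsck "$SCRATCH_DIR" -files -blocks -locations
fi
```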

