Created on 04-12-2019 08:34 AM - edited 08-17-2019 04:03 PM
Hi all,
I have 6 DataNodes on my Hortonworks cluster (HDP 2.6.3), and one of them is 91% full. The others are "only" 65% full.
I don't understand why the replication is not uniform across the nodes, or how I can fix it.
I checked the file system and the same difference shows up there:
On a safe node:
# pwd
/grid1/hadoop/hdfs/data/current/BP-332877091-10.136.82.11-1500650625087/current/finalized
# du -h . --summarize
1.9T    .
On the unsafe node:
# pwd
/grid1/hadoop/hdfs/data/current/BP-332877091-10.136.82.11-1500650625087/current/finalized
# du -h . --summarize
2.7T    .
The same difference appears in each of the DataNode directories.
Thanks for your help.
Mathieu
Created 04-14-2019 12:06 AM
Not sure how you got into this shape, but the balancer can fix it. https://hadoop.apache.org/docs/r2.7.7/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer
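To see why the balancer would act here, the following is a rough sketch of its threshold rule applied to the figures from the question. It assumes all six DataNodes have the same capacity (not stated in the thread), so that cluster utilization is the plain average of the per-node percentages. The value 10 is the balancer's default threshold; the node names are placeholders.

```shell
# Assumption: equal capacity on every DataNode, so the cluster utilization
# is simply the average of the per-node "used" percentages.
threshold=10                 # balancer default, in percentage points
usage="91 65 65 65 65 65"    # figures from the question: one node at 91%, five at 65%

report=$(echo "$usage" | awk -v t="$threshold" '{
    for (i = 1; i <= NF; i++) sum += $i
    avg = sum / NF
    for (i = 1; i <= NF; i++) {
        d = $i - avg         # deviation from cluster utilization
        state = (d > t) ? "over-utilized" : (d < -t) ? "under-utilized" : "within threshold"
        printf "node%d %d%% (%+.1f) %s\n", i, $i, d, state
    }
}')
echo "$report"

# On the cluster itself, typically run as the HDFS superuser:
#   hdfs dfsadmin -report          # shows "DFS Used%" per DataNode
#   hdfs balancer -threshold 10    # moves blocks until each node is within 10 points
```

Under that assumption the full node sits about 21.7 points above the cluster average, well outside the default threshold, while the other five are within it. A smaller `-threshold` value gives a tighter balance at the cost of a longer run.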
Created 04-15-2019 08:32 AM
Thanks @Lester Martin
I will keep the balancer admin command in mind.
I solved the issue simply by removing a very large file created by a data scientist running a very large query on Hive. The temporary files located at /tmp/hive/[user] seem not to be replicated (I am not sure of that).
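Whether those temporary files are replicated can be checked rather than guessed. Below is a minimal sketch: `list_replication` is a hypothetical helper name, and it assumes the usual eight-column layout of `hdfs dfs -ls` output (permissions, replication, owner, group, size, date, time, path) and paths without spaces.

```shell
# Hypothetical helper: print "<replication> <size-in-bytes> <path>" for every
# file under an HDFS directory. Directories are skipped because they carry
# no replication factor.
list_replication() {
    hdfs dfs -ls -R "$1" | awk '$1 !~ /^d/ && NF >= 8 { print $2, $5, $8 }'
}

# Possible usage on the cluster:
#   list_replication /tmp/hive | sort -k2,2nr | head    # largest files first
#   hdfs dfs -du -h /tmp/hive                           # space used per user directory
#   hdfs fsck /tmp/hive -files -blocks -locations       # which DataNodes hold each block
```

A replication factor of 1 in the first column would confirm the suspicion. Separately, when the writing client runs on a DataNode, HDFS places the first replica of each block on that local node, which could also explain one node filling faster than the rest; nothing in this thread confirms which of the two happened.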