One datanode nearly full but not the others

Hi all,

I have 6 DataNodes on my Hortonworks cluster (HDP 2.6.3), and one of them is 91% full. The others are "only" 65% full.

I don't understand why the replicas are not evenly distributed, or how I can fix it.

[Image: 107748-datanodes.gif]

I checked the file system, and the same difference shows up there:

On a healthy node:

# pwd
/grid1/hadoop/hdfs/data/current/BP-332877091-10.136.82.11-1500650625087/current/finalized
# du -h . --summarize
1.9T    .

On the nearly full node:

# pwd
/grid1/hadoop/hdfs/data/current/BP-332877091-10.136.82.11-1500650625087/current/finalized
# du -h . --summarize
2.7T    .

It is the same for each of the DataNode directories.


Thanks for your help.

Mathieu


2 REPLIES

I'm not sure how the cluster got into this state, but the balancer can fix it: https://hadoop.apache.org/docs/r2.7.7/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer
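
As a rough sketch of why the balancer would act on this cluster: it compares each DataNode's utilization with the cluster average and treats a node as over-utilized when it exceeds the average by more than a threshold (10 percentage points by default). The snippet below only redoes that arithmetic with the figures from the question. It assumes all six nodes have the same capacity and that the quoted percentages are DFS usage; `balance_report` is a hypothetical helper, not part of Hadoop.

```shell
# Utilization (% used) per DataNode, taken from the question:
# one node at 91%, five at 65%. Assumes equal capacity on every node.
usage="91 65 65 65 65 65"
threshold=10   # balancer default, in percentage points

# Hypothetical helper: print each node's deviation from the cluster
# average and how it compares with the threshold.
balance_report() {
  echo "$1" | tr ' ' '\n' | awk -v t="$2" '
    { u[NR] = $1; sum += $1 }
    END {
      avg = sum / NR
      for (i = 1; i <= NR; i++) {
        d = u[i] - avg
        state = (d > t) ? "over-utilized" : (d < -t) ? "under-utilized" : "within threshold"
        printf "node%d %d%% %+.1f %s\n", i, u[i], d, state
      }
    }'
}

balance_report "$usage" "$threshold"
```

With these numbers the average is about 69.3%, so the 91% node sits roughly 21.7 points above it and the 65% nodes about 4.3 points below. On the cluster itself, the corresponding run would be `hdfs balancer -threshold 10`, executed as the HDFS superuser.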

Thanks @Lester Martin

I'll keep the balancer admin command in mind.

I solved the issue simply by removing a very large file created by a data scientist who ran a very large query on Hive. The temporary files located under /tmp/hive/[user] seem not to be replicated (I'm not sure about that).
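
One way to check that guess, sketched below, is to ask HDFS for the size, replication factor, and block locations of the scratch files. The directory name is a placeholder (the real user name is not given above), and `inspect_scratch` is a hypothetical wrapper around standard `hdfs` subcommands. Another possible cause, which the same output would show: with the default placement policy, a client running on a DataNode writes the first replica of every block to that node, so one large file written from a single node can fill it faster than the others.

```shell
# Placeholder scratch directory; replace some_user with the real account.
SCRATCH_DIR="${SCRATCH_DIR:-/tmp/hive/some_user}"

# Hypothetical wrapper: summarize what is stored under an HDFS directory.
inspect_scratch() {
  dir="$1"
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs CLI not found; run this on a cluster node" >&2
    return 1
  fi
  # Total size of the directory, human-readable.
  hdfs dfs -du -s -h "$dir"
  # Replication factor (%r) and name (%n) of each entry in it.
  hdfs dfs -stat "%r %n" "$dir/*"
  # Block-level view: which DataNodes hold each replica.
  hdfs fsck "$dir" -files -blocks -locations
}

inspect_scratch "$SCRATCH_DIR" || true
```

If the `fsck` output lists the nearly full DataNode for most blocks, that would point to the temporary files as the source of the imbalance.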