Servers are down or not responding when I run TestDFSIO(mapreduce,ambari)

My servers go down and stop responding when I run TestDFSIO with a total size over 300 GB. No messages are recorded in Ambari's logs, the component logs (MapReduce, YARN), or the Linux system logs. The environment is Red Hat 7.0 on 6 servers (two NameNodes, 4 DataNodes) with 12 SAS disks (7200 rpm), 96 GB of memory, a 10G network, and 2 x E5-2660 v3 CPUs (2.6 GHz, 10 cores, 9.6 GT/s, 25 MB L3). YARN's settings are OK.

I use this command:

hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.jar TestDFSIO -write -nrFiles 300 -size 1024

When the job is about 40~80% complete, one of the 4 DataNodes goes down and stops responding, then the other three DataNodes go down one after another within 10 minutes. It's very strange. I suspected the 10G network cards, but nothing changed when I replaced the network card driver and the network card itself.

Has anyone else run into this problem?

Re: Servers are down or not responding when I run TestDFSIO(mapreduce,ambari)

@Mon key, are you saying that the DataNodes are running when you start TestDFSIO, and then they appear to gradually shut down one by one while the job is running? If so, have you tried looking at the DataNode logs to try to determine why they shut down?
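
One way to check the DataNode logs is sketched below. It is only a rough sketch under assumptions: the log directory `/var/log/hadoop/hdfs` and the file name pattern `hadoop-hdfs-datanode-*.log` are common HDP defaults and may differ on your cluster, and `scan_datanode_logs` is a hypothetical helper name used for illustration. Run it on each DataNode host.

```shell
#!/usr/bin/env bash
# Assumption: HDP's usual DataNode log directory; override LOG_DIR if yours differs.
LOG_DIR="${LOG_DIR:-/var/log/hadoop/hdfs}"

# Hypothetical helper: for each DataNode log in a directory, print the most
# recent lines that mention ERROR, FATAL, or an exception.
scan_datanode_logs() {
  local dir="$1" count="${2:-20}"
  local f
  for f in "$dir"/hadoop-hdfs-datanode-*.log; do
    [ -e "$f" ] || continue          # skip when the pattern matches no file
    echo "== $f"
    grep -E 'ERROR|FATAL|Exception' "$f" | tail -n "$count"
  done
}

scan_datanode_logs "$LOG_DIR"
```

The last entries written before a node stopped responding are usually the most informative, which is why only the tail of the matches is printed.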