We have an Ambari cluster with:
872 DataNode machines,
11 Kafka machines,
3 master machines (the NameNodes are on master1 and master3).
The Ambari version is 2.6.x and the HDP version is 2.6.4.
We currently have some network problems.
After a long investigation, we found that the Ambari agent running on some machines does not communicate well with the Ambari server.
As a result we see strange behavior, such as 5 dead DataNodes on the Ambari dashboard (DataNodes Status), even though the DataNode machines are definitely healthy.
Is it possible to set a more tolerant value in the Ambari agent configuration, so that the acknowledgment between the Ambari agent and the Ambari server is allowed more time and the network problems are tolerated?
Something like a timeout or connection-time setting between the Ambari agent and the Ambari server?