Created on 11-13-2017 06:03 AM - edited 09-16-2022 05:30 AM
In the documentation, Network Errors is described as the rate of network errors on the JVM.
But I am a little confused about what "JVM" means here. Is the JVM the container? If so, does this value represent the rate of dropped packets on connections to the containers on the DataNodes?
Created 11-13-2017 06:22 AM
Yes, the DataNode runs on a JVM. So the details that you get from the "NETWORK ERRORS / GC COUNT" section are for the JVM on which the DataNode is running.
Basically, this reads the "dfs.datanode.DatanodeNetworkErrors" metric of the DataNode, which is the "Count of network errors on the datanode": https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache...
Similarly, for GC count it reads the "jvm.JvmMetrics.GcCount" metric of the DataNode.
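If you want to double-check what the chart is reading, you can pull both counters straight from the DataNode's /jmx servlet. Below is a minimal sketch, not anything from the monitoring tool itself; the host name and the 50075 port (the Hadoop 2 default DataNode HTTP port) are assumptions, so adjust them for your cluster.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal sketch: print the raw counters behind the "NETWORK ERRORS / GC COUNT"
// chart by reading the DataNode's /jmx servlet. The host and port below are
// assumptions; adjust for your cluster.
public class DataNodeMetricsProbe {
  public static void main(String[] args) throws Exception {
    URL jmx = new URL("http://my-datanode-host:50075/jmx");
    HttpURLConnection conn = (HttpURLConnection) jmx.openConnection();
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        // DatanodeNetworkErrors comes from the DataNodeActivity bean and
        // GcCount from the JvmMetrics bean; the servlet returns pretty-printed
        // JSON, so a simple substring match per line is enough for a quick look.
        if (line.contains("DatanodeNetworkErrors") || line.contains("GcCount")) {
          System.out.println(line.trim());
        }
      }
    } finally {
      conn.disconnect();
    }
  }
}

The values printed there are the same cumulative counters that the chart samples over time.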
Created 11-13-2017 07:29 AM
Thanks for your answer. One more question: how is this value calculated?
I have doubts about the values on the vertical axis.
Created 11-13-2017 07:35 AM
It basically checks for "DataXceiverServer" errors and increments the error counter in the following method when it encounters a read/write error from/to the DataXceiverServer.
/**
 * Read/write data from/to the DataXceiverServer.
 */
@Override
public void run() {
  int opsProcessed = 0;
  Op op = null;

  try {
    dataXceiverServer.addPeer(peer, Thread.currentThread(), this);
    peer.setWriteTimeout(datanode.getDnConf().socketWriteTimeout);
    InputStream input = socketIn;
    try {
      IOStreamPair saslStreams = datanode.saslServer.receive(peer, socketOut,
        socketIn, datanode.getXferAddress().getPort(),
        datanode.getDatanodeId());
      input = new BufferedInputStream(saslStreams.in,
        HdfsConstants.SMALL_BUFFER_SIZE);
      socketOut = saslStreams.out;
    } catch (InvalidMagicNumberException imne) {
      if (imne.isHandshake4Encryption()) {
        LOG.info("Failed to read expected encryption handshake from client " +
            "at " + peer.getRemoteAddressString() + ". Perhaps the client " +
            "is running an older version of Hadoop which does not support " +
            "encryption");
      } else {
        LOG.info("Failed to read expected SASL data transfer protection " +
            "handshake from client at " + peer.getRemoteAddressString() +
            ". Perhaps the client is running an older version of Hadoop " +
            "which does not support SASL data transfer protection");
      }
      return;
    }
    super.initialize(new DataInputStream(input));

    // We process requests in a loop, and stay around for a short timeout.
    // This optimistic behaviour allows the other end to reuse connections.
    // Setting keepalive timeout to 0 disable this behavior.
    do {
      updateCurrentThreadName("Waiting for operation #" + (opsProcessed + 1));

      try {
        if (opsProcessed != 0) {
          assert dnConf.socketKeepaliveTimeout > 0;
          peer.setReadTimeout(dnConf.socketKeepaliveTimeout);
        } else {
          peer.setReadTimeout(dnConf.socketTimeout);
        }
        op = readOp();
      } catch (InterruptedIOException ignored) {
        // Time out while we wait for client rpc
        break;
      } catch (IOException err) {
        // Since we optimistically expect the next op, it's quite normal to
        // get EOF here.
        if (opsProcessed > 0 &&
            (err instanceof EOFException || err instanceof ClosedChannelException)) {
          if (LOG.isDebugEnabled()) {
            LOG.debug("Cached " + peer + " closing after " + opsProcessed + " ops");
          }
        } else {
          incrDatanodeNetworkErrors();
          throw err;
        }
        break;
      }
      // ... (quote truncated here; the rest of run() is omitted)
Please Notice:
} else {
  incrDatanodeNetworkErrors();
  throw err;
}
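Regarding the vertical axis: DatanodeNetworkErrors is a cumulative counter, so a chart labelled as a "rate" is presumably plotting how fast that counter grows between two consecutive samples rather than the raw value. A rough sketch of that calculation follows; the 60-second interval and the sample values are made-up numbers for illustration, not anything specific to the monitoring tool.

// Rough sketch: derive a "rate of network errors" from two consecutive
// samples of the cumulative DatanodeNetworkErrors counter by dividing the
// delta by the sampling interval. All numbers are made up for illustration.
public class NetworkErrorRate {
  public static void main(String[] args) {
    long previousCount = 120;    // counter value at the previous sample
    long currentCount  = 126;    // counter value at the current sample
    long intervalSeconds = 60;   // time between the two samples

    double errorsPerSecond = (currentCount - previousCount) / (double) intervalSeconds;
    System.out.printf("network errors/sec = %.3f%n", errorsPerSecond);  // prints 0.100
  }
}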
Reference Code: