Created on 09-12-2018 11:24 AM - edited 09-12-2018 01:39 PM
No configuration had changed when I started getting:
This DataNode is not connected to one or more of its NameNode(s).
I also started getting a web server status error saying the Cloudera agent is not getting a response from its web server role.
This is what the log looks like:
```
dwh-worker-4.c.abc-1225.internal ERROR September 12, 2018 5:33 PM DataNode
dwh-worker-4.c.abc-1225.internal:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.31.10.74:44280 dst: /172.31.10.74:50010
java.io.IOException: Not ready to serve the block pool, BP-1423177047-172.31.4.192-1492091038346.
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:745)
```
Also, the DataNodes are randomly exiting.
Created on 12-17-2018 07:11 PM - edited 12-17-2018 07:12 PM
I'm hitting the same problem. Have you solved this?
```
Version: Cloudera Express 5.14.0 (#25 built by jenkins on 20180118-0523 git: 9c3cd91a00702a8eb6b26af897c01e9e795e4c2b)
Java VM Name: Java HotSpot(TM) 64-Bit Server VM
Java VM Vendor: Oracle Corporation
Java Version: 1.8.0_151
```
The first time we saw this error, the DataNode was triggering full GCs. We switched the DataNode GC policy to G1 and set the heap size to 4 GB, and after that we never saw a full GC again.
But the DataNode restarts very slowly: it takes about an hour before it connects to the NameNode.
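For reference, the G1 change described above corresponds to JVM flags along these lines in the DataNode's Java options (in Cloudera Manager, HDFS > Configuration > "Java Configuration Options for DataNode"; the exact field name may vary by CM version, so treat this as a sketch):

```
# DataNode JVM options sketch: G1 collector with a fixed 4 GB heap
# (setting -Xms equal to -Xmx avoids heap resizing pauses)
-XX:+UseG1GC -Xms4g -Xmx4g
```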
```
This DataNode is not connected to one or more of its NameNode(s).
```
and the DataNode keeps printing:
```
2018-12-18 09:49:20,864 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken to scan block pool BP-422077022-10.9.4.93-1530259053126 on /alidata2/dfs/dn/current: 341988ms
```
and the "not connected to NameNode" alert comes up frequently.
DataNode error:
```
2018-12-18 11:09:08,221 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: hdpdn03.ebanma.com:50010:DataXceiver error processing WRITE_BLOCK operation src: /10.9.4.72:43176 dst: /10.9.4.71:50010
java.io.IOException: Not ready to serve the block pool, BP-1768979294-10.9.4.68-1526626190229.
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAndWaitForBP(DataXceiver.java:1290)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.checkAccess(DataXceiver.java:1298)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:630)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:748)
```
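As a quick triage step, you can tally how many DataXceiver errors each block pool produces in the DataNode log; if one block pool dominates, that pool is likely still loading on that DataNode. This is a hypothetical helper script, not part of Hadoop:

```python
import re
from collections import Counter

# Block pool IDs look like BP-<id>-<namenode-ip>-<timestamp>, e.g.
# BP-1768979294-10.9.4.68-1526626190229
BP_RE = re.compile(r"BP-\d+-[\d.]+-\d+")

def count_not_ready(log_text: str) -> Counter:
    """Count 'Not ready to serve the block pool' errors per block pool ID."""
    counts = Counter()
    for line in log_text.splitlines():
        if "Not ready to serve the block pool" in line:
            match = BP_RE.search(line)
            if match:
                counts[match.group(0)] += 1
    return counts

# Sample lines modeled on the log output above (abbreviated with '...')
sample = """\
2018-12-18 11:09:08,221 ERROR ... Not ready to serve the block pool, BP-1768979294-10.9.4.68-1526626190229.
2018-12-18 11:09:09,300 ERROR ... Not ready to serve the block pool, BP-1768979294-10.9.4.68-1526626190229.
"""
print(count_not_ready(sample))
```

In practice you would read the actual DataNode log file instead of the inline sample.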
How can this be solved? Thanks.
Created 12-21-2022 01:37 PM
I am also getting the same issue: multiple DataNodes are showing NameNode connectivity problems with the same error logs.
Did you find a solution? Please share.
Created 12-23-2022 09:13 PM
@sim6 @Shahezad @jimmy-src Try restarting the OS.
If you face this issue frequently, apply the change below and restart as needed; it is better suited to hosts with large disks or many DataNode volumes:
CM > HDFS > Configuration > "Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml"
Name: fs.getspaceused.classname
Value: org.apache.hadoop.fs.DFCachingGetSpaceUsed
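For clusters managed outside Cloudera Manager, the equivalent change would go into core-site.xml directly. A sketch of the property (verify that the class is available in your Hadoop version): by default HDFS computes per-block-pool disk usage with a du-style walk over every block file, which can take minutes per volume on large disks, while DFCachingGetSpaceUsed uses df-style filesystem totals instead, which is much cheaper.

```xml
<!-- core-site.xml: compute disk usage from filesystem totals (df)
     instead of walking every block file (du) -->
<property>
  <name>fs.getspaceused.classname</name>
  <value>org.apache.hadoop.fs.DFCachingGetSpaceUsed</value>
</property>
```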