Data Nodes displaying incorrect block report

Contributor

I am seeing a strange issue with 3 of the 8 DataNodes in our HDP 2.6.0 cluster. These 3 DataNodes are not reporting the correct number of blocks, and they are not sending block reports to the NameNode at regular intervals.

Ambari reports:

[Alert][datanode_storage] Unable to extract JSON from JMX response

Any suggestions as to what is wrong with our cluster?

Thanks in advance for your assistance.


Attachments: namenode-ui.png, datanode-ui.png, data-node-jmx.png
10 REPLIES

Master Mentor

@Samant Thakur

When Hadoop creates a new block, it places the first replica on the local node (the node the writer runs on), the second on a node in a different rack, and the third on a different node in the same rack as the second. During re-replication, if only one replica exists, the second is placed on a different rack. If two replicas exist and both are in the same rack, the third is placed on a different rack.
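
To make the placement rule concrete, here is a minimal sketch in Python. It is not Hadoop's actual code: the function name `place_replicas`, the node names, and the rack paths are all made up for illustration. It assumes the third replica goes to a different node in the second replica's rack, and it picks the first eligible node in sorted order where HDFS would pick at random, so the result is reproducible.

```python
from typing import Dict, List, Optional


def place_replicas(writer: str, topology: Dict[str, str]) -> List[str]:
    """Choose up to three target nodes for a new block.

    `topology` maps a node name to its rack path (hypothetical values).
    """

    def pick(chosen: List[str],
             same_rack_as: Optional[str] = None,
             other_rack_than: Optional[str] = None) -> Optional[str]:
        # Return the first unused node that satisfies the rack constraint.
        for node in sorted(topology):
            if node in chosen:
                continue
            rack = topology[node]
            if same_rack_as is not None and rack != topology[same_rack_as]:
                continue
            if other_rack_than is not None and rack == topology[other_rack_than]:
                continue
            return node
        return None

    chosen: List[str] = []

    # 1st replica: the writer's own node if it is a DataNode, else any node.
    first = writer if writer in topology else pick(chosen)
    if first is None:
        return chosen
    chosen.append(first)

    # 2nd replica: a node on a different rack; fall back to any other node
    # when the cluster has only one rack.
    second = pick(chosen, other_rack_than=first) or pick(chosen)
    if second is None:
        return chosen
    chosen.append(second)

    # 3rd replica: a different node on the same rack as the 2nd replica;
    # fall back to any remaining node.
    third = pick(chosen, same_rack_as=second) or pick(chosen)
    if third is not None:
        chosen.append(third)
    return chosen


# Hypothetical 4-node, 2-rack cluster; the writer runs on dn1.
topology = {"dn1": "/rack1", "dn2": "/rack1", "dn3": "/rack2", "dn4": "/rack2"}
print(place_replicas("dn1", topology))  # ['dn1', 'dn3', 'dn4']
```

Without rack awareness every node reports the same rack, so the sketch (like HDFS) can only spread replicas across nodes, not across racks.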

The main purpose of Rack awareness is to:

  • Improve data reliability and availability.
  • Improve cluster performance.
  • Prevent data loss if an entire rack fails.
  • Make better use of network bandwidth.
  • Keep the bulk flow in-rack when possible.

If your production cluster and this problematic cluster run the same Ambari/HDP version, then you can't call it a bug; it is a client-specific problem.

I would still strongly recommend that you enable rack awareness and monitor the cluster for 24 hours to see whether the alerts change. Have you tried running the cluster balancing utility?

$ hdfs balancer
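
On enabling rack awareness: Ambari lets you assign a rack to each host, and as far as I know it generates the topology script for you. On a manually managed cluster, Hadoop resolves racks by calling the script named in `net.topology.script.file.name` (core-site.xml) with one or more host names or IPs as arguments, and it expects one rack path back per argument. Below is a minimal sketch of such a script. The host names, IP, and rack paths are placeholders, not values from your cluster.

```python
#!/usr/bin/env python
"""Hypothetical rack topology script: prints one rack path per argument."""
import sys

# Assumed host-to-rack mapping; replace with the real layout of your cluster.
RACKS = {
    "dn1.example.com": "/rack1",
    "dn2.example.com": "/rack1",
    "dn3.example.com": "/rack2",
    "10.0.2.14": "/rack2",
}

# Hadoop's own fallback rack name for hosts that cannot be resolved.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return the rack path for each host, in the order given."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts as command-line arguments and splits the
    # script's output on whitespace.
    print(" ".join(resolve(sys.argv[1:])))
```

After the NameNode is restarted with the new mapping, `hdfs dfsadmin -printTopology` should list each DataNode under its rack.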

HTH