Datanodes are reading all the time without corresponding HDFS I/O activity after 5.4.5 upgrade

New Contributor

Earlier this week I upgraded from CDH 5.4.4 to 5.4.5. Since then, the DataNode process on each host has been constantly reading about 2 MB/s per drive (with writes roughly an order of magnitude lower), but there is no corresponding HDFS I/O activity, just the usual ~40 KB/s.

It seems as if nobody is actually reading anything, but the HDFS process is doing something on its own.

 

Any ideas about what could be causing this? How could I find out / diagnose what's happening?

 

Thanks!

Re: Datanodes are reading all the time without corresponding HDFS I/O activity after 5.4.5 upgrade

New Contributor
Never mind, it was a zombie Spark job.

Re: Datanodes are reading all the time without corresponding HDFS I/O activity after 5.4.5 upgrade

Community Manager

I am happy to see you killed off the zombies. :)

 

zombies.jpg



Cy Jervis, Community Manager

Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Learn more about the Cloudera Community:
Community Guidelines
How to use the forum

Re: Datanodes are reading all the time without corresponding HDFS I/O activity after 5.4.5 upgrade

New Contributor

After some more research we found that the resource usage wasn't caused by a zombie Spark job after all. It was the HDFS block scanner. Apparently its default configuration changed with the upgrade, and it started running as soon as we restarted after upgrading. We had never seen it running before, hence the mystery.
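
For anyone who wants to sanity-check similar numbers, here is a rough sketch (not anything taken from the cluster itself) of how long one full scanner pass over a drive would take at a steady read rate. The 4 TB drive size is a hypothetical figure; the 2 MB/s rate is the per-drive read rate reported in the original question. In upstream Hadoop releases the scanner's per-volume rate and scan period are, as far as I know, governed by `dfs.block.scanner.volume.bytes.per.second` and `dfs.datanode.scan.period.hours`, but please verify the names and defaults against the hdfs-default.xml shipped with your CDH version.

```python
SECONDS_PER_DAY = 86_400


def full_scan_days(volume_bytes: int, scan_bytes_per_sec: int) -> float:
    """Days needed to read every byte on one volume once at a fixed rate."""
    if scan_bytes_per_sec <= 0:
        raise ValueError("scan rate must be positive")
    return volume_bytes / scan_bytes_per_sec / SECONDS_PER_DAY


# Hypothetical drive size (assumption): 4 TB of block data on one volume.
VOLUME_BYTES = 4 * 10**12
# Per-drive read rate observed in this thread: roughly 2 MB/s.
OBSERVED_RATE = 2 * 10**6

if __name__ == "__main__":
    days = full_scan_days(VOLUME_BYTES, OBSERVED_RATE)
    print(f"one full pass takes about {days:.1f} days")
```

With these assumed figures a single pass takes a little over three weeks, which would explain steady background reads that never show up as client HDFS I/O.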

Re: Datanodes are reading all the time without corresponding HDFS I/O activity after 5.4.5 upgrade

Community Manager

That makes sense but totally invalidates my Zombie sign. :)

 

Feel free to mark your last comment as the solution in case it can help others in the future. 



Cy Jervis, Community Manager
