On 2 nodes of a 13-node cluster, the Cloudera program /usr/lib64/cmf/agent/build/env/bin/flood is using quite a lot of CPU and doing more than 200000 context switches per second. Is this normal?
Resource usage is very low on all my other nodes.
I'm not sure how YARN could relate to Cloudera Manager Agents' flood application.
"flood" is a service run by each agent that will serve up or download parcels. It is not associated with YARN or even CDH itself for that matter.
If its CPU usage is high, it must be up to something, so we need to figure out what that is. Let's start with the logs:
The "cloudera-flood.log" is likely to have the most relevant information.
I would recommend comparing the logs of an agent where flood uses little CPU with those of the two affected agents, to see if there are any obvious differences.
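If eyeballing two long logs side by side is painful, a rough way to do the comparison is to reduce each line to its "shape" and see which shapes show up more often on the busy host. This is only a sketch: I am assuming the lines mostly differ in digits (timestamps, ports, sizes), which may not hold for the real cloudera-flood.log format, and the function names here are made up for illustration.

```python
import re
from collections import Counter

# Assumption: runs of digits (timestamps, PIDs, ports, sizes) are the main
# thing that makes otherwise identical log lines differ.
_DIGITS = re.compile(r"\d+")


def message_shapes(lines):
    """Count non-empty log lines after replacing every run of digits with '#'."""
    return Counter(
        _DIGITS.sub("#", line.strip()) for line in lines if line.strip()
    )


def compare_logs(quiet_lines, busy_lines, top=10):
    """Return the line shapes that occur more often on the busy host."""
    quiet = message_shapes(quiet_lines)
    busy = message_shapes(busy_lines)
    # Counter subtraction keeps only shapes with a positive difference.
    extra = busy - quiet
    return extra.most_common(top)
```

You would open the log copied from a quiet host and the one from a busy host and pass the two file objects to compare_logs(); whatever comes out on top is a reasonable place to start reading.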
If the logs don't seem to indicate anything special, we may need to turn to more tools like strace or pstack.
I might suggest getting a few consecutive pstacks to get some specifics about what flood is doing.
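To take the consecutive pstacks without babysitting a terminal, something like the sketch below could work. It assumes pstack is installed on the host and that you run it as root or as the user that owns the flood process; collect_stacks and its parameters are names I picked for the example, not anything shipped with Cloudera Manager.

```python
import subprocess
import time


def collect_stacks(pid, count=5, interval=1.0, command=("pstack",)):
    """Run `command <pid>` `count` times, `interval` seconds apart.

    Returns the captured standard output of each run as a list of strings.
    `command` defaults to pstack; it is a parameter so another stack dumper
    can be substituted if pstack is not available.
    """
    samples = []
    for i in range(count):
        result = subprocess.run(
            [*command, str(pid)], capture_output=True, text=True
        )
        samples.append(result.stdout)
        if i < count - 1:
            time.sleep(interval)  # no pause needed after the last sample
    return samples
```

If the same frames appear at the top of most samples, that is probably where flood is spending its time.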
The stats alone from the strace don't tell me much, other than that selects are happening more often on the crazy host. The user time spent in the "select" call suggests it could be interesting to look at which file descriptors those selects are on.
Perhaps you can check the strace output to find out what is being selected.
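As a starting point for that check, here is a small sketch that tallies which file descriptors appear in the read set of the select calls in a saved strace output, and then asks /proc what each descriptor points at. I am assuming the select lines look roughly like the example in the comment; the exact format varies between strace versions, so the regular expression may need adjusting, and the helper names are hypothetical.

```python
import os
import re
from collections import Counter

# Assumption: strace prints select calls roughly as
#   select(12, [4 7 11], NULL, NULL, {0, 500000}) = 1 (in [7], left {...})
# Only the first bracketed set (the read set) is captured.
_SELECT = re.compile(r"select\(\d+, \[([\d ]*)\]")


def count_selected_fds(strace_lines):
    """Count how often each fd appears in the read set of a select call."""
    counts = Counter()
    for line in strace_lines:
        match = _SELECT.search(line)
        if match:
            counts.update(int(fd) for fd in match.group(1).split())
    return counts


def describe_fd(pid, fd):
    """Return what /proc/<pid>/fd/<fd> points at, e.g. 'socket:[12345]'."""
    try:
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        # Process gone, fd closed, or not enough permissions.
        return "<unavailable>"
```

Feed count_selected_fds() the lines of the strace output file, then call describe_fd() with the flood PID for the most frequent descriptors while the process is still running. A socket or pipe that keeps turning up on the busy hosts but not on the quiet ones would be worth a closer look.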