The DataNode is not connected to one or more of its NameNode(s)


New Contributor

Hi,

 

I have 10 DataNodes in my cluster, and the health of one of them is bad. I see the error below for this node.

 

'This host is in contact with the Cloudera Manager Server. This host is not in contact with the Host Monitor.' 

 

I have restarted the SCM agent, but the error persists. Below are the error messages from the cloudera-scm-agent.log file. Can someone please help me fix this?

 

[04/Apr/2019 12:52:26 +0000] 122842 MainThread agent        ERROR    Failed to configure inotify. Parcel repository will not auto-refresh.

Traceback (most recent call last):

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 813, in _init_after_first_heartbeat_response

    self.inotify = self.repo.configure_inotify()

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/parcel.py", line 415, in configure_inotify

    wm = pyinotify.WatchManager()

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/pyinotify-0.9.3-py2.7.egg/pyinotify.py", line 1706, in __init__

    raise OSError(err % self._inotify_wrapper.str_errno())

OSError: Cannot initialize new instance of inotify, Errno=Too many open files (EMFILE)

[04/Apr/2019 12:52:26 +0000] 122842 MainThread parcel_cache INFO     Using /opt/cloudera/parcel-cache for parcel cache

[04/Apr/2019 12:52:26 +0000] 122842 MainThread agent        ERROR    Caught unexpected exception in main loop.

Traceback (most recent call last):

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 710, in start

    self._init_after_first_heartbeat_response(resp_data)

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 840, in _init_after_first_heartbeat_response

    self.client_configs.load()

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/client_configs.py", line 682, in load

    new_deployed.update(self._lookup_alternatives(fname))

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/client_configs.py", line 432, in _lookup_alternatives

    return self._parse_alternatives(alt_name, out)

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/client_configs.py", line 444, in _parse_alternatives

    path, _, _, priority_str = line.rstrip().split(" ")

ValueError: too many values to unpack

 

Thanks

5 REPLIES

Re: The DataNode is not connected to one or more of its NameNode(s)

Expert Contributor

Hi @Subba,

 

Have you tried restarting all components?

 

See this solution:

https://community.cloudera.com/t5/Cloudera-Manager-Installation/This-host-is-not-in-contact-with-the...

 

 

Regards,

Manu.

Re: The DataNode is not connected to one or more of its NameNode(s)

Expert Contributor

The relevant error is

OSError: Cannot initialize new instance of inotify, Errno=Too many open files (EMFILE)

Have you tried to hard restart the CM agent? Do you see those errors straight after startup?

Can you count the number of files in the /var/log/spark/lineage directory? There is a known issue when that directory contains a huge number of files.
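A quick way to get that count is sketched below. The helper name count_files is hypothetical, and the path is the one mentioned above; the function returns None if the directory does not exist on the host.

```python
import os


def count_files(directory):
    """Return the number of regular files directly inside `directory`,
    or None if the directory does not exist."""
    if not os.path.isdir(directory):
        return None
    return sum(
        1
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )


if __name__ == "__main__":
    # Path taken from the reply above; adjust if your layout differs.
    print(count_files("/var/log/spark/lineage"))
```

A result of None simply means the directory is absent, in which case this known issue is unlikely to be the cause.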

Re: The DataNode is not connected to one or more of its NameNode(s)

New Contributor

Hi,

 

Thanks for the reply. I don't find a spark folder in /var/log. I'm still getting this issue even after restarting the scm-agent.

 

 

 File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 710, in start

    self._init_after_first_heartbeat_response(resp_data)

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/agent.py", line 840, in _init_after_first_heartbeat_response

    self.client_configs.load()

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/client_configs.py", line 682, in load

    new_deployed.update(self._lookup_alternatives(fname))

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/client_configs.py", line 432, in _lookup_alternatives

    return self._parse_alternatives(alt_name, out)

  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.8.2-py2.7.egg/cmf/client_configs.py", line 444, in _parse_alternatives

    path, _, _, priority_str = line.rstrip().split(" ")

ValueError: too many values to unpack

 

 

I checked the JDK versions, but the same JDK is present on all the other servers. I'm not sure why I am having this issue only with this server.

 

[root@ip-10-0-1-32 ~]# rpm -qa "*jdk*"

java-1.8.0-openjdk-devel-1.8.0.144-0.b01.el7_4.x86_64

java-1.8.0-openjdk-headless-1.8.0.144-0.b01.el7_4.x86_64

jdk1.8.0_102-1.8.0_102-fcs.x86_64

copy-jdk-configs-2.2-3.el7.noarch

java-1.8.0-openjdk-1.8.0.144-0.b01.el7_4.x86_64

 

 

Thanks

Re: The DataNode is not connected to one or more of its NameNode(s)

Expert Contributor

The issue is caused by a new option introduced to alternatives in this version, which causes the alternatives parser to fail.

 

To resolve it, the options are:

  1. Remove all OpenJDK packages from this host; they are not needed for CM / CDH.
  2. Apply the proposed workaround from this thread post (which you already posted to).
  3. Upgrade Cloudera Manager to 5.12.1 or later, which has code changes to deal with this situation.
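To see why the parser fails, here is a minimal sketch that reproduces the ValueError from the traceback. The function name parse_strict and both input lines are made up for illustration; they are not real alternatives output. Only the unpacking statement is taken from the traceback.

```python
def parse_strict(line):
    # Same unpacking pattern as the line in the traceback: it only
    # works when the line splits into exactly four fields.
    path, _, _, priority_str = line.rstrip().split(" ")
    return path, priority_str


# Hypothetical input lines, invented for illustration only.
four_fields = "/usr/bin/example - priority 100"
five_fields = "/usr/bin/example - extra priority 100"

print(parse_strict(four_fields))
try:
    parse_strict(five_fields)
except ValueError as exc:
    # A line with more than four fields cannot be unpacked into four names.
    print("ValueError:", exc)
```

Any extra space-separated field on the line is enough to trigger the exception, which is consistent with a new option changing the output the agent parses.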

Re: The DataNode is not connected to one or more of its NameNode(s)

New Contributor

We encountered the same issue while adding one of the nodes, and there is a Cloudera KB article about it:

ClouderaKBArticle

 

To fix the issue, change the client_configs.py file in the path (adjust cmf-5.X.Y-py2.7.egg below to the installed agent version):

/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.X.Y-py2.7.egg/cmf/client_configs.py

1) Make a backup of the above file.

2) Using the Unix "vi" editor, open the above file and search for "line.rstrip" to reach the line starting with "path, _".

3) Comment out that line.

4) Insert the lines:
#proposed fix for OPSAPS-38086
   thisLine = line.rstrip().split(" ")
   path = thisLine[0]
   priority_str = thisLine[-1]

5) Verify that the indentation is exact (each new line aligned with the commented-out line), as Python requires it. The end result should look like this:

#path, _, _, priority_str = line.rstrip().split(" ")
#proposed fix for OPSAPS-38086
thisLine = line.rstrip().split(" ")
path = thisLine[0]
priority_str = thisLine[-1]


6) Save the file.

7) Restart the agent and verify that the host health now shows as good.
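For anyone who wants to check the logic before editing the agent file, the patched lines can be tried in isolation. This is only a sketch: the wrapper function parse_tolerant and the sample lines are hypothetical, and the real code lives inside _parse_alternatives in client_configs.py.

```python
def parse_tolerant(line):
    # Proposed fix for OPSAPS-38086: take the first and last fields
    # instead of unpacking exactly four values.
    thisLine = line.rstrip().split(" ")
    path = thisLine[0]
    priority_str = thisLine[-1]
    return path, priority_str


# Hypothetical input lines, invented for illustration only.
samples = [
    "/usr/bin/example - priority 100",
    "/usr/bin/example - extra priority 100",
]
for sample in samples:
    print(parse_tolerant(sample))
```

Because only the first and last fields are read, the number of fields in between no longer matters.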