04-20-2016 04:38 AM
I am testing CDH 5.7.0 on Red Hat Enterprise Linux (RHEL) 7.2, and I noticed that when the cluster starts up, the NFS Gateway(s) fail to start. The stdout log shows the following:
Wed Apr 13 08:56:43 EDT 2016
using /usr/java/jdk1.8.0_77 as JAVA_HOME
using 5 as CDH_VERSION
using /run/cloudera-scm-agent/process/295-hdfs-NFSGATEWAY as CONF_DIR
using as SECURE_USER
using as SECURE_GROUP
Cannot connect to port 111.
No portmap or rpcbind service is running on this host. Please start portmap or rpcbind service before attempting to start the NFS Gateway role on this host.
In the Role Log File, I see the following errors:
Apr 12, 2:16:13.088 PM ERROR org.apache.hadoop.oncrpc.RpcProgram Unregistration failure with localhost:4242, portmap entry: (PortmapMapping-100005:1:17:4242)
Apr 12, 2:16:13.103 PM ERROR org.apache.hadoop.oncrpc.RpcProgram Unregistration failure with localhost:2049, portmap entry: (PortmapMapping-100003:3:6:2049)
The NFS Gateway will refuse to start until I manually start rpcbind from the command line (hint taken from the message above). I have never had to do this with previous CDH distros or previous RHEL releases; however, with the CDH 5.7.0 / RHEL 7.2 combination, it consistently fails. Is anyone else experiencing this?
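For anyone who wants to confirm the "Cannot connect to port 111" condition before starting the role, here is a minimal sketch that tries a TCP connection to the rpcbind/portmap port. The function name `port_is_open` is my own, and using TCP is an assumption; the log does not show how the startup script actually probes the port.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 111 is the port named in the startup message above.
    if port_is_open("localhost", 111):
        print("Port 111 is reachable; rpcbind or portmap appears to be running.")
    else:
        print("Cannot connect to port 111; start rpcbind before the NFS Gateway.")
```

If this reports that the port is unreachable on the gateway host, the role will most likely fail with the same message shown in the stdout log.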
04-21-2016 10:38 AM
Yes, I realize that will work, and I mentioned that in my original post:
"The NFS Gateway will refuse to start until I manually start rpcbind from the commandline..."
However, I also mentioned that with previous CDH distros on previous versions of Red Hat, I never had to start rpcbind manually before the NFS Gateway would start. In fact, the CDH 5.7.0 NFS Gateway starts fine on Red Hat 6.5 *without* starting rpcbind manually first.
I was just wondering whether manually starting rpcbind at the command line is an intended step for CDH 5.7.0 on RHEL 7.2, or whether it's a workaround for an issue that needs attention.
04-21-2016 10:47 AM
07-27-2016 03:24 AM
I'm having the same issue. After checking the systems and the documentation, I've come to the conclusion that the CDH 5.7 RPM package for the NFS Gateway does not support systemd properly, so rpcbind is never started as a "dependent" service when the NFS Gateway starts. I think that dependency is the main reason it worked before.
Interestingly enough, even though rpcbind is enabled at startup, it actually runs only when it is started manually.
It seems that for parcel-based installs, Cloudera has implemented a solution for this problem (judging by the Cloudera documentation).
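To see the "enabled but not running" state for yourself, a rough sketch like the one below compares what `systemctl is-enabled` and `systemctl is-active` report. The helper names are my own, and the unit names `rpcbind.service` and `rpcbind.socket` are what I would expect on RHEL 7, so treat them as assumptions and adjust for your host.

```python
import subprocess


def systemctl_query(verb, unit, run=subprocess.run):
    """Return the state printed by `systemctl <verb> <unit>`, or "unknown"."""
    try:
        result = run(
            ["systemctl", verb, unit],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        # systemctl is not installed or could not be executed on this host.
        return "unknown"
    return result.stdout.strip() or "unknown"


def describe(unit, run=subprocess.run):
    """Summarize whether a unit is enabled at boot and whether it is running now."""
    enabled = systemctl_query("is-enabled", unit, run)
    active = systemctl_query("is-active", unit, run)
    return f"{unit}: enabled={enabled}, active={active}"


if __name__ == "__main__":
    # Assumed unit names; adjust if your system names them differently.
    for unit in ("rpcbind.service", "rpcbind.socket"):
        print(describe(unit))
```

A result such as `enabled=enabled, active=inactive` right after boot would match what I'm seeing: the service is configured to start but is not actually running until someone starts it by hand.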
I'll post more information here as I test out different solutions for this problem.