Created 09-19-2018 12:57 PM
I recently installed HDP 3.0 on my computer. Before that I was using HDP 2.6.x and it worked fine. After installing HDP 3.0, the newly introduced YARN Registry DNS server component does not start and fails with an "address already bound" exception.
I tried it on two separate computers, each with a fresh installation of Ubuntu 16.04 and HDP 3.0, and both machines show the same error.
I have struggled with this a lot without success, and I could not find a proper solution on the internet.
I would appreciate your help on this.
Thanks in advance.
Created 09-20-2018 06:35 AM
That's a port conflict issue that is new with HDP 3.0; the offending port is 53, which the DNS listener uses. The only solution I have found so far is to locate the process occupying the port and kill its PID. The error itself is logged in /var/log/hadoop-yarn/yarn/privileged-root-registrydns-FQDN.err
# sudo lsof -i -P -n
Then
# kill -9 <PID>
Now you can start your registry-dns from Ambari.
Here is documentation that explains this; look especially at the entry "Configure Registry DNS".
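Before killing anything, it can help to confirm that the .err log really reports a bind failure on port 53 and not some other startup error. Below is a minimal sketch of that check. The helper name has_port53_bind_error is made up for illustration, and the pattern assumes the log line has the form "java.net.BindException: Problem binding to [host:53]".

```shell
#!/usr/bin/env bash
# Sketch: report whether a Registry DNS .err log contains a bind failure on port 53.
# has_port53_bind_error is a hypothetical helper name, not part of HDP or Ambari.
has_port53_bind_error() {
  local log_file="$1"
  # -q: no output, exit status only; -E: extended regular expressions.
  # Assumed message format:
  #   java.net.BindException: Problem binding to [host:53] ...
  # The trailing "]" keeps ports such as 5353 from matching.
  grep -qE 'BindException: Problem binding to \[[^]]*:53]' "$log_file"
}

# Usage (FQDN stands for the host's fully qualified name, as in the path above):
#   has_port53_bind_error /var/log/hadoop-yarn/yarn/privileged-root-registrydns-FQDN.err \
#     && echo "port 53 is already bound"
```

If the function returns non-zero, the failure is probably something other than the port 53 conflict, and killing a process would not help.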
Created 07-23-2019 06:44 AM
For others, here is an example of my terminal input when solving this problem:
[root@HW01 ~]# cat /var/log/hadoop-yarn/yarn/privileged-root-registrydns-HW01.co.local.err
java.net.BindException: Problem binding to [HW01.co.local:53] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:736)
at org.apache.hadoop.registry.server.dns.RegistryDNS.openUDPChannel(RegistryDNS.java:1016)
at org.apache.hadoop.registry.server.dns.RegistryDNS.addNIOUDP(RegistryDNS.java:925)
at org.apache.hadoop.registry.server.dns.RegistryDNS.initializeChannels(RegistryDNS.java:196)
at org.apache.hadoop.registry.server.dns.PrivilegedRegistryDNSStarter.init(PrivilegedRegistryDNSStarter.java:59)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:691)
at sun.nio.ch.DatagramSocketAdaptor.bind(DatagramSocketAdaptor.java:91)
at org.apache.hadoop.registry.server.dns.RegistryDNS.openUDPChannel(RegistryDNS.java:1014)
... 8 more
Cannot load daemon
Service exit with a return value of 3
[root@HW01 ~]# lsof -i -P -n | grep ':53'
avahi-dae 2140 avahi 12u IPv4 25507 0t0 UDP *:5353
dnsmasq 3080 nobody 5u IPv4 35043 0t0 UDP 192.168.122.1:53
dnsmasq 3080 nobody 6u IPv4 35044 0t0 TCP 192.168.122.1:53 (LISTEN)
java 118018 infra-solr 229u IPv6 31471850 0t0 TCP 172.18.4.46:8886->172.18.4.46:53816 (ESTABLISHED)
java 118018 infra-solr 241u IPv6 31471130 0t0 TCP 172.18.4.46:53816->172.18.4.46:8886 (ESTABLISHED)
java 134005 accumulo 318u IPv4 31459695 0t0 TCP 172.18.4.46:12234->172.18.4.48:53562 (ESTABLISHED)
java 135984 druid 648u IPv4 31594196 0t0 UDP 172.18.4.46:38510->172.18.4.12:53
java 135984 druid 649u IPv4 31594289 0t0 UDP 172.18.4.46:57316->172.18.4.11:53
java 138702 oozie 799u IPv6 31451570 0t0 TCP 172.18.4.46:53096->172.18.4.48:3306 (ESTABLISHED)
java 138702 oozie 800u IPv6 31454483 0t0 TCP 172.18.4.46:53100->172.18.4.48:3306 (ESTABLISHED)
[root@HW01 ~]# kill -9 3080
[root@HW01 ~]# lsof -i -P -n | grep ':53'
avahi-dae 2140 avahi 12u IPv4 25507 0t0 UDP *:5353
java 118018 infra-solr 229u IPv6 31471850 0t0 TCP 172.18.4.46:8886->172.18.4.46:53816 (ESTABLISHED)
java 118018 infra-solr 241u IPv6 31471130 0t0 TCP 172.18.4.46:53816->172.18.4.46:8886 (ESTABLISHED)
java 134005 accumulo 318u IPv4 31459695 0t0 TCP 172.18.4.46:12234->172.18.4.48:53562 (ESTABLISHED)
java 138702 oozie 799u IPv6 31451570 0t0 TCP 172.18.4.46:53096->172.18.4.48:3306 (ESTABLISHED)
java 138702 oozie 800u IPv6 31454483 0t0 TCP 172.18.4.46:53100->172.18.4.48:3306 (ESTABLISHED)
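Note that grep ':53' also matches port 5353, ephemeral ports such as 53816, and client connections to a remote port 53 (the druid lines), so the only real holder of local port 53 in the output above is dnsmasq. A rough sketch of a stricter filter follows. The helper name port53_holders is made up for illustration, and it assumes the default lsof column layout shown above, where field 8 is the protocol and field 9 is the address.

```shell
#!/usr/bin/env bash
# Sketch: keep only the `lsof -i -P -n` rows whose LOCAL endpoint is port 53.
# port53_holders is a hypothetical helper name. It assumes the default lsof
# column layout, where field 8 is the protocol (TCP/UDP) and field 9 is the
# address in the form local or local->remote.
port53_holders() {
  awk '{
    name = $9
    sub(/->.*/, "", name)        # drop the remote side of a connection
    if (name ~ /:53$/) {         # exact port 53, so :5353 and :53816 do not match
      print $1, $2, $8, name     # command, PID, protocol, local address
    }
  }'
}

# Usage:
#   lsof -i -P -n | port53_holders
```

Fed the first listing above, this should leave only the two dnsmasq rows (UDP and TCP on 192.168.122.1:53), which makes it harder to kill the wrong PID.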
Created on 11-01-2019 05:47 PM - edited 11-01-2019 05:58 PM
I tried your solution and it works.
So, for the newbies:
1 - Go to your terminal and check the log file:
Once you are inside the yarn directory, display its contents using the ll command:
[root@osboxes osboxes]# cd /var/log/hadoop-yarn/
[root@osboxes hadoop-yarn]# ll
total 8
drwxr-xr-x. 2 yarn-ats hadoop 4096 1 nov. 19:38 embedded-yarn-ats-hbase
drwxr-xr-x. 3 root root 28 1 nov. 18:33 nodemanager
drwxr-xr-x. 2 yarn hadoop 4096 1 nov. 19:58 yarn
[root@osboxes hadoop-yarn]# cd yarn/
[root@osboxes yarn]# ll
There is nothing to change in the log file; just make sure the error is logged there (privileged-root-registrydns-osboxes.err).
2 - In your terminal, run:
sudo lsof -i -P -n
A list of processes will be shown.
Look for the process whose command name is dnsmasq; this service should be the one using port 53.
In my case:
dnsmasq 1851 nobody 3u IPv4 24611 0t0 UDP *:67
dnsmasq 1851 nobody 5u IPv4 24614 0t0 UDP 192.168.122.1:53
dnsmasq 1851 nobody 6u IPv4 24615 0t0 TCP 192.168.122.1:53 (LISTEN)
So the PID (process ID) of the command is 1851.
Just kill it by running: kill -9 XXXX
In my case: kill -9 1851
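kill -9 sends SIGKILL, which gives the process no chance to clean up. A gentler variant is to send SIGTERM first and fall back to SIGKILL only if the process is still running after a few seconds. Here is a minimal sketch of that idea. The helper name stop_pid and the 5-second default timeout are my own choices, not something from HDP or Ambari.

```shell
#!/usr/bin/env bash
# Sketch: ask a process to exit with SIGTERM and use SIGKILL only as a fallback.
# stop_pid is a hypothetical helper name; the default timeout of 5 seconds is
# an arbitrary choice.
stop_pid() {
  local pid="$1" timeout="${2:-5}" waited=0
  # Fails if there is no such process or we are not allowed to signal it.
  kill -TERM "$pid" 2>/dev/null || return 1
  # kill -0 sends no signal; it only tests whether the PID can still be signalled.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      kill -KILL "$pid" 2>/dev/null   # last resort, equivalent to kill -9
      break
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage (1851 is the dnsmasq PID from the listing above):
#   stop_pid 1851
```

Either way, once the port is free you can start Registry DNS from Ambari as described in the accepted answer.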
Created 09-21-2018 07:31 AM
Any update on this? If the answer helped, please accept it to close the thread.
Created 09-21-2018 10:56 AM
It is still failing, and I keep trying. I will let you know once it is resolved.
Created 11-02-2019 12:28 AM
Did your problem get resolved? I provided an answer and may have tagged it wrongly. Please let me know.