HBase Region Servers Startup Failed after Namenode HA

Contributor

After enabling Namenode HA, 2 out of my 3 HBase Region Servers are not coming up. I looked at the logs and found that they are throwing an UnknownHostException for the nameservice.

2018-05-24 08:48:29,551 INFO  [regionserver/atlhashdn02.hashmap.net/192.166.4.37:16020] regionserver.HRegionServer: STOPPED: Failed initialization
2018-05-24 08:48:29,552 ERROR [regionserver/atlhashdn02.hashmap.net/192.166.4.37:16020] regionserver.HRegionServer: Failed init
java.lang.IllegalArgumentException: java.net.UnknownHostException: clusterha
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411)
        at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
        at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:688)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:629)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:159)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2761)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2795)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2777)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:179)
        at org.apache.hadoop.hbase.wal.DefaultWALProvider.init(DefaultWALProvider.java:97)
        at org.apache.hadoop.hbase.wal.WALFactory.getProvider(WALFactory.java:148)
        at org.apache.hadoop.hbase.wal.WALFactory.<init>(WALFactory.java:180)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1648)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1381)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:917)
        at java.lang.Thread.run(Thread.java:745)
1 ACCEPTED SOLUTION

Expert Contributor

Looks like the Hadoop configuration is missing from the classpath, so HBase is not able to resolve the nameservice.

Can you check the HBase config directory on the working and non-working Region Servers to see whether core-site.xml or hdfs-site.xml is missing?
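
A quick way to compare is with a few shell commands on both a working and a failing node; /etc/hbase/conf is the usual HDP location, so treat the paths below as assumptions and adjust to your layout:

# Run on both a working and a failing Region Server (paths assume a standard HDP install)
ls -l /etc/hbase/conf/core-site.xml /etc/hbase/conf/hdfs-site.xml
# The nameservice from the error ("clusterha") should be defined in HBase's copy of hdfs-site.xml
grep -A1 'dfs.nameservices' /etc/hbase/conf/hdfs-site.xml
# A Hadoop conf directory should also show up on the Region Server classpath
hbase classpath | tr ':' '\n' | grep -E 'conf|etc/hadoop' | head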


3 REPLIES


Contributor

@schhabra

I checked hbase-site.xml, hdfs-site.xml, and core-site.xml. They are exactly the same on both nodes.
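
One way to do such a comparison across the two nodes (the hostnames below are placeholders and /etc/hbase/conf is an assumed path):

# Compare the HBase client configs between a working and a failing node
for f in core-site.xml hdfs-site.xml hbase-site.xml; do
  diff <(ssh node1 cat /etc/hbase/conf/$f) <(ssh node2 cat /etc/hbase/conf/$f) \
    && echo "$f matches" || echo "$f differs"
done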

Contributor

Well, the configuration files were correct, but the environment was not set properly. I checked hbase-env on both nodes and found a difference. I updated it with the following properties in Ambari and it worked:

export LD_LIBRARY_PATH=::/usr/hdp/2.6.3.0-235/hadoop/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/2.6.3.0-235/hadoop/lib/native
export HADOOP_HOME=/usr/hdp/2.6.3.0-235/hadoop
export HADOOP_CONF_DIR=/usr/hdp/2.6.3.0-235/hadoop/etc/hadoop
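
After pushing the change through Ambari and restarting the Region Servers, a sketch like the following can confirm the running process actually picked up the environment (reading /proc needs root; pgrep matches on the HRegionServer class in the JVM command line):

# On a previously failing node, after the restart from Ambari
sudo cat /proc/$(pgrep -f HRegionServer)/environ | tr '\0' '\n' | grep -E 'HADOOP_CONF_DIR|HADOOP_HOME'
# The Region Server log should no longer show the UnknownHostException for clusterha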