Member since
03-19-2014
01-15-2015 06:18 PM

Thanks for pointing me in the right direction on this. I had two issues:

1. My ZooKeeper configs were incorrect (they did not explicitly list the ZooKeeper hosts), so the HA configuration for the NameNodes wasn't working.
2. The file /etc/hadoop/conf/topology.py wasn't executable.

Once I fixed those two things it started working fine. I found the executable issue via the NameNode logs you pointed out:

... java.io.IOException: Cannot run program "/etc/hadoop/conf/topology.py" (in directory "/usr/lib/hadoop"): error=13, Permission denied ...

Out of curiosity: is rack awareness required for HBase and YARN?
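For context, the topology script the NameNode invokes (configured via net.topology.script.file.name) is just an executable that receives hostnames/IPs as arguments and prints one rack path per input. A minimal sketch of such a script, with a hypothetical host-to-rack table:

```python
#!/usr/bin/env python
# Minimal Hadoop topology script sketch. The NameNode passes one or more
# hostnames/IPs as arguments and expects one rack path per line on stdout.
# The mapping below is hypothetical; a real script would consult site data.
import sys

RACKS = {
    "dalmozhadoop1.dal.moz.com": "/dal/rack1",
    "dalmozhadoop2.dal.moz.com": "/dal/rack2",
}

def resolve(host):
    # Unknown hosts fall back to a default rack so the NameNode
    # always gets a mapping back rather than nothing.
    return RACKS.get(host, "/default-rack")

if __name__ == "__main__":
    for host in sys.argv[1:]:
        print(resolve(host))
```

Note that the script must be executable by the NameNode process (chmod +x), which is exactly what the error=13 Permission denied above was complaining about.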
01-09-2015 02:55 PM

We don't have Kerberos running, and we're running the HMaster on the same node as the primary HDFS NameNode (we have HA set up, so two NameNodes). There shouldn't be any firewalls in the mix, but we did find that the DataNodes were in a different VLAN than the NameNodes, so we're in the process of moving them to the same VLAN to see whether some traffic is being blocked. I'll check the NN logs; thanks for the tip. If things start working after the move, I'll let you know.
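A quick way to check whether cross-VLAN traffic is being filtered is to probe the relevant TCP ports (e.g. the NameNode RPC port) from a DataNode. A small sketch using only the standard library; the host and port in the example comment are placeholders:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hypothetical host; 8020 is a commonly used NameNode RPC port):
# port_open("namenode1.example.com", 8020)
```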
01-05-2015 10:53 AM

Just set up a new cluster (CDH4.7) and the HBase master won't start due to an unhandled NullPointerException. I've looked at various things (DNS, configs, etc.), but I'm unable to figure out what is wrong, and I'm hoping someone might have an idea. Here's the exception:

2015-01-03 00:01:47,754 INFO org.apache.hadoop.hbase.master.SplitLogManager: timeout = 300000
2015-01-03 00:01:47,754 INFO org.apache.hadoop.hbase.master.SplitLogManager: unassigned timeout = 180000
2015-01-03 00:01:47,754 INFO org.apache.hadoop.hbase.master.SplitLogManager: resubmit threshold = 3
2015-01-03 00:01:47,762 INFO org.apache.hadoop.hbase.master.SplitLogManager: found 0 orphan tasks and 0 rescan nodes
2015-01-03 00:01:47,902 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:329)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1409)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:413)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:172)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44938)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746)

    at org.apache.hadoop.ipc.Client.call(Client.java:1238)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:155)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy16.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:970)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:960)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:239)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:206)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:199)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1117)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:249)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:82)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:748)
    at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:286)
    at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:327)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:444)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:148)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:133)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:572)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:432)
    at java.lang.Thread.run(Thread.java:724)
2015-01-03 00:01:47,905 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2015-01-03 00:01:47,905 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
2015-01-03 00:01:47,905 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60000

Config is straightforward:

<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://moz-prod/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.client.write.buffer</name>
    <value>2097152</value>
  </property>
  <property>
    <name>hbase.client.pause</name>
    <value>100</value>
  </property>
  <property>
    <name>hbase.client.retries.number</name>
    <value>35</value>
  </property>
  <property>
    <name>hbase.client.scanner.caching</name>
    <value>100</value>
  </property>
  <property>
    <name>hbase.client.keyvalue.maxsize</name>
    <value>10485760</value>
  </property>
  <property>
    <name>hbase.rpc.timeout</name>
    <value>60000</value>
  </property>
  <property>
    <name>hbase.snapshot.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.security.authentication</name>
    <value>simple</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/hbase</value>
  </property>
  <property>
    <name>zookeeper.znode.rootserver</name>
    <value>root-region-server</value>
  </property>
  <!--Auto Failover Configuration (zookeeper)-->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>dalmozhadoop1.dal.moz.com:2181,dalmozhadoop2.dal.moz.com:2181,dalmozhadoop3.dal.moz.com:2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>

HDFS NameNodes and DataNodes work fine, as does MapReduce.
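When a config looks straightforward but the daemon misbehaves, it can help to dump the effective *-site.xml properties programmatically rather than eyeballing the raw XML. A small sketch using the standard library; the inline XML here is a trimmed stand-in for the real file:

```python
import xml.etree.ElementTree as ET

def load_props(xml_text):
    """Parse a Hadoop/HBase *-site.xml string into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }

# Trimmed stand-in for /etc/hbase/conf/hbase-site.xml:
sample = """<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://moz-prod/hbase</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
</configuration>"""

props = load_props(sample)
```

The same function works on the full file contents, making it easy to diff the properties two nodes actually see.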
Labels:
- Apache HBase