Member since: 04-13-2016
Posts: 36
Kudos Received: 4
Solutions: 0
01-22-2017
02:36 PM
I am launching HBase (1.1.2) on a Kerberized cluster (AD). The HBase region server fails to connect to the master with the following error:

2017-01-20 18:17:23,944 WARN [regionserver/a1.example.com/xxxxx] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.io.IOException: Couldn't setup connection for srvuser/a1.example.com@ADC.EXAMPLE.COM to srvuser/a2.example.com@ADC.EXAMPLE.COM
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2270)
...
Caused by: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): GSS initiate failed
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.readStatus(HBaseSaslRpcClient.java:153)
at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:189)

I turned on detailed debug logs for Kerberos as well as HBase. I can see that host a1 successfully obtains the service ticket for a2:

Found ticket for srvuser/a1.example.com@ADC.EXAMPLE.COM to go to krbtgt/ADC.EXAMPLE.COM@ADC.EXAMPLE.COM expiring on Sat Jan 21 04:17:10 PST 2017
Found ticket for srvuser/a1.example.com@ADC.EXAMPLE.COM to go to srvuser/a2.example.com@ADC.EXAMPLE.COM expiring on Sat Jan 21 04:17:10 PST 2017
Client Principal = srvuser/a1.example.com@ADC.EXAMPLE.COM
Server Principal = srvuser/a2.example.com@ADC.EXAMPLE.COM
Session Key = EncryptionKey: keyType=23 keyBytes

I do not see any errors after the above lines in the detailed Kerberos logs, so I assume that the "GSS initiate failed" problem is no longer related to Kerberos; otherwise I would have seen some error reported (for example, the ticket being corrupted?). I notice that a "GSS initiate failed" message reported without any details is described by experts as one of the most useless messages (see Steve's "error messages to fear"). I have already verified that the unlimited JCE policy files are present and that both hosts are using the same encryption algorithm.

Can anyone help here, even if only with next steps I can take to debug this? Thank you!
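For anyone who wants to repeat the check above on a longer log, here is a minimal sketch that pulls the client and server principals out of the "Found ticket for ..." debug lines quoted in this post. The function names are hypothetical, and the pattern is based only on the line format shown here, so other JDK versions may print something different.

```python
import re

# Pattern for the "Found ticket for ... to go to ..." debug lines quoted above.
# Assumption: the line format matches what this post shows.
TICKET_RE = re.compile(
    r"Found ticket for (?P<client>\S+) to go to (?P<server>\S+) expiring on"
)


def find_tickets(log_text):
    """Return (client, server) principal pairs found in the debug output."""
    return [(m.group("client"), m.group("server"))
            for m in TICKET_RE.finditer(log_text)]


def has_service_ticket(log_text, server_principal):
    """True if a ticket for the given server principal appears in the log."""
    return any(server == server_principal
               for _, server in find_tickets(log_text))


# Sample text copied from the debug output in this post.
sample = (
    "Found ticket for srvuser/a1.example.com@ADC.EXAMPLE.COM to go to "
    "krbtgt/ADC.EXAMPLE.COM@ADC.EXAMPLE.COM expiring on Sat Jan 21 04:17:10 PST 2017 "
    "Found ticket for srvuser/a1.example.com@ADC.EXAMPLE.COM to go to "
    "srvuser/a2.example.com@ADC.EXAMPLE.COM expiring on Sat Jan 21 04:17:10 PST 2017"
)

for client, server in find_tickets(sample):
    print(client, "->", server)
```

This only confirms that a ticket for the master's principal was obtained; it says nothing about why the SASL handshake fails afterwards.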
Labels: Apache Hadoop, Apache HBase
09-22-2016
04:21 AM
@Artem Ervits @Neeraj Sabharwal - I am trying to use size-based throttling but keep getting a ThrottlingException when I start HBase, even when there is hardly any data in HBase. I am sure this is a misconfiguration on my end, but I cannot find it. Any input would be appreciated. I should add that there seems to be a correlation between the number of pre-splits and the throttling size limit, because the error shows up only when the number of pre-splits is higher.

Details: HBase version: 1.1.2, number of region servers: 4, number of regions: 116, heap memory per region server: 2GB

Quotas set:
TABLE => ns1:table1 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10G/sec, SCOPE => MACHINE
TABLE => ns2:table2 TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, LIMIT => 10G/sec, SCOPE => MACHINE

Region server stack trace (note that the error says the read size limit was exceeded, yet the size logged for the scan call right after it is only 28 (bytes?)):

2016-09-17 22:35:40,674 DEBUG [B.defaultRpcServer.handler=55,queue=1,port=58526] quotas.RegionServerQuotaManager: Throttling exception for user=root table=ns1:table1 numWrites=0 numReads=0 numScans=1: read size limit exceeded - wait 0.00sec
2016-09-17 22:35:40,676 DEBUG [B.defaultRpcServer.handler=55,queue=1,port=58526] ipc.RpcServer: B.defaultRpcServer.handler=55,queue=1,port=58526: callId: 52 service: ClientService methodName: Scan size: 28 connection: 10.65.141.170:42806
org.apache.hadoop.hbase.quotas.ThrottlingException: read size limit exceeded - wait 0.00sec
at org.apache.hadoop.hbase.quotas.ThrottlingException.throwThrottlingException(ThrottlingException.java:107)
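To see which tables and operation types are being throttled across a whole region server log, something like the following sketch could be used. It parses only the RegionServerQuotaManager DEBUG line format quoted above; the function name and the returned field names are hypothetical.

```python
import re

# Pattern for the RegionServerQuotaManager DEBUG line quoted above.
# Assumption: the line format matches what this post shows.
THROTTLE_RE = re.compile(
    r"Throttling exception for user=(?P<user>\S+) table=(?P<table>\S+) "
    r"numWrites=(?P<writes>\d+) numReads=(?P<reads>\d+) numScans=(?P<scans>\d+): "
    r"(?P<reason>.+?) - wait (?P<wait>[\d.]+)sec"
)


def parse_throttle_line(line):
    """Return a dict describing one throttling event, or None if no match."""
    m = THROTTLE_RE.search(line)
    if m is None:
        return None
    return {
        "user": m.group("user"),
        "table": m.group("table"),
        "writes": int(m.group("writes")),
        "reads": int(m.group("reads")),
        "scans": int(m.group("scans")),
        "reason": m.group("reason"),
        "wait_sec": float(m.group("wait")),
    }


# Sample line copied from the region server log in this post.
sample = (
    "2016-09-17 22:35:40,674 DEBUG [B.defaultRpcServer.handler=55,queue=1,port=58526] "
    "quotas.RegionServerQuotaManager: Throttling exception for user=root "
    "table=ns1:table1 numWrites=0 numReads=0 numScans=1: "
    "read size limit exceeded - wait 0.00sec"
)

print(parse_throttle_line(sample))
```

Grouping the parsed events by table and reason would show whether only scans on the pre-split tables are affected.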
08-03-2016
03:38 AM
Hi @billie - Thanks. Actually, I was able to get that part working (and yes, the changes are needed in both appConfig and metainfo). However, when more than one region server is started on the same host (on different ports), Slider reports the wrong port for the first region server. The other region server ports are correct. I think this is a bug in Slider.
07-31-2016
06:59 AM
I am using HBase 0.98 and Slider 0.81.1. I want to use the Slider REST APIs to get the port numbers of the region server instances deployed through Slider. I assume the API to use is https://inldmqarh71n2:8090/proxy/application_1467115608017_0178/ws/v1/slider/publisher/exports/servers. However, I get a NullPointerException when I issue this API call. Do I need to specify anything in metaInfo.xml to make this work, or is it a Slider version issue?
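For reference, this is roughly how I am calling it, as a minimal sketch. The host, port, and application ID come from the URL above; the helper names are hypothetical, and I am assuming the endpoint returns JSON, which I have not confirmed since the call fails for me.

```python
import json
import urllib.request


def exports_url(proxy_host, port, app_id, export_name="servers"):
    """Build the Slider publisher exports URL in the form used in this post."""
    return (f"https://{proxy_host}:{port}/proxy/{app_id}"
            f"/ws/v1/slider/publisher/exports/{export_name}")


def fetch_exports(url, timeout=10):
    """GET the URL and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = exports_url("inldmqarh71n2", 8090, "application_1467115608017_0178")
print(url)
# fetch_exports(url) would issue the actual request; it is left out here
# because it needs network access to the cluster.
```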
Labels: Apache HBase, Apache YARN
06-08-2016
10:26 AM
2 Kudos
I have the following questions about SmartSense. Would anyone who has used it be able to help?

1. We host HBase as a YARN app and use Slider for that. I notice that SmartSense supports HBase monitoring and troubleshooting. Does that extend to HBase on YARN too?

2. Does SmartSense help with piecing together troubleshooting information from different logs? For example, a YARN container app may be down because the YARN node manager went down, which in turn may be down because the YARN RM terminated all apps on that node manager. Piecing this together today requires looking into the resource manager and node manager logs along with the HBase logs. Another case is an app going down because ZooKeeper has hit the maxClientCnxns limit and will not allow any more incoming connections from that host. These are just a representative set of problems. Does SmartSense help there?

3. Does SmartSense also help identify issues such as Kerberos ticket renewal problems, SSL problems, and open file handle problems?

Thanks, Sumit
Labels: Apache HBase, Hortonworks SmartSense
05-03-2016
04:18 AM
@billie - Thank you for the info. So it is exactly as I thought. In my opinion, relying on ps is completely wrong in the context of HBase, because even when ps comes back successfully, the region server can be dead for all practical purposes. Unfortunately, this also means that my idea of reducing heartbeat.monitor.interval will not make much difference, because ps will still report that everything is fine.
05-02-2016
02:52 PM
@Devaraj Das - Are you aware of any way I can find out which mechanism Slider uses to heartbeat the container? I am told that it can take up to 15-20 minutes to get the container back.
05-02-2016
02:49 PM
OK, I figured out that there is a setting that controls whether the block cache is invalidated when a major compaction happens. In my case, however, that setting is disabled.
05-02-2016
02:44 PM
Hey @rmaruthiyodan - Thanks. Yes, I had to use /proc to find the limits specific to the region server PID. Basically, Ambari restricts this number to 32K by default, and this can be overridden in the blueprint being submitted.
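In case it helps someone else, here is a minimal sketch of reading the open-file limit for a given PID from /proc/<pid>/limits on Linux. The function names are hypothetical, and the sample text below is illustrative (using the 32K value mentioned above), not copied from my cluster.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' values from /proc/<pid>/limits text.

    Values are returned as strings because a limit can be 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Columns after the three-word name: soft limit, hard limit, units
            return fields[3], fields[4]
    return None


def read_limits(pid):
    """Read the open-file limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())


# Illustrative sample in the /proc/<pid>/limits layout.
sample = (
    "Limit                     Soft Limit           Hard Limit           Units     \n"
    "Max processes             62793                62793                processes \n"
    "Max open files            32768                32768                files     \n"
)

print(parse_max_open_files(sample))
```

Calling read_limits with the region server PID gives the limits that actually apply to that process, which can differ from what ulimit shows in a login shell.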
04-30-2016
12:51 PM
@nmaillard - Thanks. Yes, I am aware of lsof and was planning to use it. Also, could there be a setting in HBase that restricts the number of open file handles within HBase itself and throws this error? And did you mean /proc/sys/fs/file-max? Thanks