Support Questions

jjo135 · 12-15-2016

I'm having an intermittent timeout error when running some example code against HBase. The basic Java application creates a Scanner and queries a particular HBase table. For small queries, my code works fine. But, when I increase the TimeRange of my query, I get intermittent timeout errors, as seen below. Googling and searching the forum has not yielded any plausible solutions. Does anyone have any idea what the source of this error might be, and how to mitigate it?


 Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
     [java] Thu Dec 15 10:33:43 EST 2016, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60304: row '' on table 'my-table-name' at region=my-table-name,,1481665174391.d068f4be09585cf831dbcd3a04664caf., hostname=hostname-007.localdomain.local,16020,1481747196976, seqNum=34514987
     [java]
     [java]     at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
     [java]     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:195)
     [java]     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
     [java]     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
     [java]     at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
     [java]     at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
     [java]     at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
     [java]     at test.MyTestClass.main(MyTestClass.java:70)
     [java] Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=60304: row '' on table 'my-table-name' at region=my-table-name,,1481665174391.d068f4be09585cf831dbcd3a04664caf., hostname=hostname-007.localdomain.local,16020,1481747196976, seqNum=34514987
     [java]     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
     [java]     at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
     [java]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     [java]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     [java]     at java.lang.Thread.run(Thread.java:745)
     [java] Caused by: java.io.IOException: Call to hostname-007.localdomain.local/10.0.0.106:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=2, waitTime=60001, operationTimeout=60000 expired.
     [java]     at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1262)
     [java]     at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1230)
     [java]     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
     [java]     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
     [java]     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
     [java]     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:213)
     [java]     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
     [java]     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
     [java]     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:346)
     [java]     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:320)
     [java]     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
     [java]     ... 4 more
     [java] Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=2, waitTime=60001, operationTimeout=60000 expired.
     [java]     at org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:70)
     [java]     at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1204)
     [java]     ... 13 more
     [java] Java Result: 1
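
The timeout that was actually in effect can be read straight out of the exception message. Below is a minimal sketch, using only the JDK, that pulls `callTimeout` and `callDuration` from a line like the one in the trace above. The class name `CallTimeoutParser` and the regular expression are illustrative assumptions, not part of the HBase client.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an HBase class): extracts the timeout figures
// from the SocketTimeoutException message printed by the client.
final class CallTimeoutParser {
    // Assumes the "callTimeout=<ms>, callDuration=<ms>" wording seen in the trace.
    private static final Pattern TIMEOUT =
            Pattern.compile("callTimeout=(\\d+), callDuration=(\\d+)");

    /** Returns {callTimeoutMs, callDurationMs}, or null if the fragment is absent. */
    static long[] parse(String line) {
        Matcher m = TIMEOUT.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        String line = "java.net.SocketTimeoutException: callTimeout=60000, "
                + "callDuration=60304: row '' on table 'my-table-name'";
        long[] t = parse(line);
        System.out.println("effective timeout (ms): " + t[0]);
        System.out.println("call duration (ms):     " + t[1]);
        System.out.println("over by (ms):           " + (t[1] - t[0]));
    }
}
```

For the trace above this gives an effective timeout of 60000 ms, so whichever setting governs this call was at 60 s when the scan ran. One candidate worth checking, though nothing in this thread confirms it, is `hbase.client.scanner.timeout.period`, which HBase 1.x clients apply to scanner calls and which defaults to 60000 ms.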

elserj · 12-15-2016

When you increase your time range, you have to read more data. HBase defines the maximum length of any RPC by the hbase.rpc.timeout property in hbase-site.xml. This defaults to 60s, and this limit is what you're hitting.

If you want to run a query that will scan over more data or generally take a long time (such as server-side filtering), you will have to increase hbase.rpc.timeout commensurately.

jjo135 · 12-15-2016

I've checked this, but I've already got these timeout values set to 180000 (3 mins), so I don't see why I'm getting a 60s timeout.

 cat /etc/hbase/conf/hbase-site.xml | grep -2 rpc.timeout

    <property>
      <name>hbase.rpc.timeout</name>
      <value>180000</value>
    </property>
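
Since `hbase.rpc.timeout` is given in milliseconds, it is easy to be off by a factor of ten when reading it by eye. Here is a small JDK-only sketch that reads the property from site-file XML and prints it in seconds. The class name `SiteXmlTimeout` is made up for illustration, the XML is inlined rather than read from `/etc/hbase/conf`, and the enclosing `<configuration>` element is assumed from the usual Hadoop-style layout.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not an HBase class): looks up one property in site-file XML.
final class SiteXmlTimeout {
    /** Returns the <value> of the named <property>, or null if it is absent. */
    static String lookup(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            // Assumes every <property> has both a <name> and a <value> child.
            String n = p.getElementsByTagName("name").item(0).getTextContent().trim();
            if (n.equals(name)) {
                return p.getElementsByTagName("value").item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<configuration><property>"
                + "<name>hbase.rpc.timeout</name><value>180000</value>"
                + "</property></configuration>";
        long ms = Long.parseLong(lookup(xml, "hbase.rpc.timeout"));
        System.out.println(ms + " ms = " + (ms / 1000) + " s"); // prints "180000 ms = 180 s"
    }
}
```

This only shows what is in the file. It says nothing about which file the running client or RegionServer actually loaded.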

elserj · 12-15-2016

Make sure that /etc/hbase/conf is included on your client's classpath.

jjo135 · 12-15-2016

I've confirmed that /etc/hbase/conf is on the classpath, and I've added the following code to my test script:

                Configuration conf = HBaseConfiguration.create();
                System.out.println("Timeout: " + conf.get("hbase.rpc.timeout"));

The above outputs 180000 as expected.

elserj · 12-15-2016

OK, the last check would be to verify that all of your RegionServers also have that configuration value. The easiest way is to open the HBase UI for each RegionServer (most easily reached via the Master UI), click "HBase Configuration" at the top of the page, and verify that the value is set there.

jjo135 · 12-15-2016

I've confirmed that this setting is the same across all machines in the cluster using the same command as above.

knarayanan · 12-15-2016

Is port 16020 open on the nodes, especially hostname-007.localdomain.local?

elserj · 12-15-2016

He would not be seeing a SocketTimeoutException if the socket was unable to make a connection on that host+port. The SocketTimeoutException implies that the socket is connected.
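
The distinction can be reproduced with plain `java.net` sockets on the loopback interface. This is a rough sketch and does not involve HBase at all: the class `SocketFailureModes` and its `probe` method are invented for the illustration. It connects once to a port that is listening but never sends data, and once to the same port after the listener has been closed.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class: contrasts "connected but no reply" with "cannot connect".
final class SocketFailureModes {
    /** Connects, then tries to read one byte; returns a short name for the outcome. */
    static String probe(InetAddress host, int port, int connectTimeoutMs, int readTimeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            s.setSoTimeout(readTimeoutMs);
            int b = s.getInputStream().read();
            return b < 0 ? "closed by peer" : "read ok";
        } catch (SocketTimeoutException e) {
            return "SocketTimeoutException";
        } catch (ConnectException e) {
            return "ConnectException";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress local = InetAddress.getLoopbackAddress();
        int port;
        try (ServerSocket silent = new ServerSocket(0, 1, local)) {
            port = silent.getLocalPort();
            // The listener is up, so the OS completes the TCP handshake,
            // but nothing is ever written: the read is what times out.
            System.out.println("open but silent: " + probe(local, port, 3000, 500));
        }
        // The listener is now closed, so the connect itself should be refused.
        System.out.println("closed port:     " + probe(local, port, 3000, 500));
    }
}
```

On a typical machine the first probe should report a `SocketTimeoutException` and the second a `ConnectException`. One caveat: if a firewall silently drops packets instead of refusing them, the connect can hang until its own timeout and also surface as a `SocketTimeoutException`, so the exception type alone is a strong hint rather than proof.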

jjo135 · 12-15-2016

Yes, I've confirmed the port is open

Cloudera Community

Intermittent Timeout Error When Querying HBase