
Even when a single node fails, my HBase client keeps requesting data from that node forever.

New Contributor

Hi, please find below the log messages and the hbase-site.xml file from my HDP cluster.

Environment: AWS

Log messages:

2022-03-05 10:09:58.242 [dw-8144 - POST /blitz-reader/metric-reader/v3.0] WARN c.a.b.p.c.m.r.MetricDataHBaseReadProcessor - org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=3, exceptions:
Sat Mar 05 10:09:50 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:53 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:58 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, java.net.ConnectException: Connection refused

java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=3, exceptions:
Sat Mar 05 10:09:50 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:53 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:58 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, java.net.ConnectException: Connection refused

at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.appdynamics.blitz.processor.core.metrics.reader.AbstractMetricDataReadProcessor.getMergedResult(AbstractMetricDataReadProcessor.java:63)
at com.appdynamics.blitz.processor.core.metrics.reader.MetricDataHBaseReadProcessor.getMetricData(MetricDataHBaseReadProcessor.java:355)
at com.appdynamics.blitz.service.MetricReporterService.readMetricData(MetricReporterService.java:344)
at com.appdynamics.blitz.service.MetricReporterService.readMetricDataBatchAvro(MetricReporterService.java:289)
at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
at org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:311)
at org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:265)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:239)
at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:753)
at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:67)
at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:505)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=3, exceptions:
Sat Mar 05 10:09:50 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:53 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:58 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, java.net.ConnectException: Connection refused

at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:158)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 common frames omitted
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupConnection(RpcClientImpl.java:410)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:716)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:889)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:856)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1201)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:218)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:292)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32831)
at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:383)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:208)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:63)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:211)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:396)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:370)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
... 4 common frames omitted
2022-03-05 10:09:58.242 [dw-8144 - POST /blitz-reader/metric-reader/v3.0] WARN c.a.b.p.c.m.r.MetricDataHBaseReadProcessor - HBaseReaderTask executor got interrupted during execution
2022-03-05 10:09:58.242 [dw-8144 - POST /blitz-reader/metric-reader/v3.0] ERROR c.a.b.service.MetricReporterService - Could not process the request: BlitzMetricBatchReadOperation{sourceKey='loss-detector-source-key', numEntities=1, numMetricsRequested=100, entityType=app, startTime=1646474640000, endTime=1646474700000, isAggregated=true, isBaseLine=false, seasonality=null, baseLineAskedStartTime=-1, baseLineAskedEndTime=-1, granularityMins=0, metricValueFilter=null, baseLineTimeZone=null, forceGranularity=false}
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=3, exceptions:
Sat Mar 05 10:09:50 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:53 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:58 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, java.net.ConnectException: Connection refused

at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.appdynamics.blitz.processor.core.metrics.reader.AbstractMetricDataReadProcessor.getMergedResult(AbstractMetricDataReadProcessor.java:63)
at com.appdynamics.blitz.processor.core.metrics.reader.MetricDataHBaseReadProcessor.getMetricData(MetricDataHBaseReadProcessor.java:355)
at com.appdynamics.blitz.service.MetricReporterService.readMetricData(MetricReporterService.java:344)
at com.appdynamics.blitz.service.MetricReporterService.readMetricDataBatchAvro(MetricReporterService.java:289)
at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
at org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:311)
at org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:265)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:239)
at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:753)
at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:67)
at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:505)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=3, exceptions:
Sat Mar 05 10:09:50 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:53 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
Sat Mar 05 10:09:58 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, java.net.ConnectException: Connection refused

hbase-site.xml file:

<configuration>

<property>
<name>dfs.client.socket-timeout</name>
<value>10000</value>
</property>

<property>
<name>dfs.domain.socket.path</name>
<value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>

<property>
<name>hbase.block.data.cachecompressed</name>
<value>true</value>
</property>

<property>
<name>hbase.bucketcache.ioengine</name>
<value>offheap</value>
</property>

<property>
<name>hbase.bucketcache.size</name>
<value>4096</value>
</property>

<property>
<name>hbase.bulkload.staging.dir</name>
<value>/apps/hbase/staging</value>
</property>

<property>
<name>hbase.client.ipc.pool.size</name>
<value>20</value>
</property>

<property>
<name>hbase.client.ipc.pool.type</name>
<value>RoundRobinPool</value>
</property>

<property>
<name>hbase.client.keyvalue.maxsize</name>
<value>10485760</value>
</property>

<property>
<name>hbase.client.pause</name>
<value>1000</value>
</property>

<property>
<name>hbase.client.retries.number</name>
<value>35</value>
</property>

<property>
<name>hbase.client.scanner.caching</name>
<value>100</value>
</property>

<property>
<name>hbase.client.scanner.max.result.size</name>
<value>2097152</value>
</property>

<property>
<name>hbase.client.scanner.timeout.period</name>
<value>60000</value>
</property>

<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>

<property>
<name>hbase.coprocessor.master.classes</name>
<value></value>
</property>

<property>
<name>hbase.coprocessor.region.classes</name>
<value>org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
</property>

<property>
<name>hbase.defaults.for.version.skip</name>
<value>true</value>
</property>

<property>
<name>hbase.hregion.majorcompaction</name>
<value>604800000</value>
</property>

<property>
<name>hbase.hregion.majorcompaction.jitter</name>
<value>0.50</value>
</property>

<property>
<name>hbase.hregion.max.filesize</name>
<value>107374182400</value>
</property>

<property>
<name>hbase.hregion.memstore.block.multiplier</name>
<value>4</value>
</property>

<property>
<name>hbase.hregion.memstore.flush.size</name>
<value>134217728</value>
</property>

<property>
<name>hbase.hregion.memstore.mslab.enabled</name>
<value>true</value>
</property>

<property>
<name>hbase.hstore.blockingStoreFiles</name>
<value>200</value>
</property>

<property>
<name>hbase.hstore.compaction.max</name>
<value>10</value>
</property>

<property>
<name>hbase.hstore.compactionThreshold</name>
<value>5</value>
</property>

<property>
<name>hbase.ipc.server.tcpnodelay</name>
<value>true</value>
</property>

<property>
<name>hbase.lease.recovery.dfs.timeout</name>
<value>23000</value>
</property>

<property>
<name>hbase.local.dir</name>
<value>${hbase.tmp.dir}/local</value>
</property>

<property>
<name>hbase.master.info.bindAddress</name>
<value>0.0.0.0</value>
</property>

<property>
<name>hbase.master.info.port</name>
<value>60010</value>
</property>

<property>
<name>hbase.master.loadbalance.bytable</name>
<value>true</value>
</property>

<property>
<name>hbase.master.namespace.init.timeout</name>
<value>2400000</value>
</property>

<property>
<name>hbase.master.port</name>
<value>60000</value>
</property>

<property>
<name>hbase.master.ui.readonly</name>
<value>false</value>
</property>

<property>
<name>hbase.master.wait.on.regionservers.timeout</name>
<value>60000</value>
</property>

<property>
<name>hbase.regionserver.checksum.verify</name>
<value>true</value>
</property>

<property>
<name>hbase.regionserver.executor.openregion.threads</name>
<value>20</value>
</property>

<property>
<name>hbase.regionserver.global.memstore.lowerLimit</name>
<value>0.40</value>
</property>

<property>
<name>hbase.regionserver.global.memstore.size</name>
<value>0.4</value>
</property>

<property>
<name>hbase.regionserver.global.memstore.upperLimit</name>
<value>0.55</value>
</property>

<property>
<name>hbase.regionserver.handler.count</name>
<value>60</value>
</property>

<property>
<name>hbase.regionserver.info.port</name>
<value>60030</value>
</property>

<property>
<name>hbase.regionserver.maxlogs</name>
<value>107</value>
</property>

<property>
<name>hbase.regionserver.optionalcacheflushinterval</name>
<value>0</value>
</property>

<property>
<name>hbase.regionserver.port</name>
<value>60020</value>
</property>

<property>
<name>hbase.regionserver.region.split.policy</name>
<value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
</property>

<property>
<name>hbase.regionserver.storefile.refresh.period</name>
<value>20</value>
</property>

<property>
<name>hbase.regionserver.thread.compaction.large</name>
<value>2</value>
</property>

<property>
<name>hbase.regionserver.thread.compaction.small</name>
<value>2</value>
</property>

<property>
<name>hbase.regionserver.wal.codec</name>
<value>org.apache.hadoop.hbase.regionserver.wal.WALCellCodec</value>
</property>

<property>
<name>hbase.regionserver.wal.enablecompression</name>
<value>true</value>
</property>

<property>
<name>hbase.rootdir</name>
<value>hdfs://devCluster1/hbase</value>
</property>

<property>
<name>hbase.rpc.protection</name>
<value>authentication</value>
</property>

<property>
<name>hbase.rpc.timeout</name>
<value>60000</value>
</property>

<property>
<name>hbase.rs.cacheblocksonwrite</name>
<value>true</value>
</property>

<property>
<name>hbase.security.authentication</name>
<value>simple</value>
</property>

<property>
<name>hbase.security.authorization</name>
<value>false</value>
</property>

<property>
<name>hbase.superuser</name>
<value>hbase</value>
</property>

<property>
<name>hbase.tmp.dir</name>
<value>/data/hadoop/hbase</value>
</property>

<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>

<property>
<name>hbase.zookeeper.quorum</name>
<value>a.compute.internal,b.compute.internal,c.compute.internal</value>
</property>

<property>
<name>hbase.zookeeper.useMulti</name>
<value>true</value>
</property>

<property>
<name>hfile.block.bloom.cacheonwrite</name>
<value>true</value>
</property>

<property>
<name>hfile.block.cache.size</name>
<value>0.2</value>
</property>

<property>
<name>hfile.block.index.cacheonwrite</name>
<value>true</value>
</property>

<property>
<name>phoenix.query.timeoutMs</name>
<value>60000</value>
</property>

<property>
<name>zookeeper.recovery.retry</name>
<value>1</value>
</property>

<property>
<name>zookeeper.session.timeout</name>
<value>30000</value>
</property>

<property>
<name>zookeeper.znode.parent</name>
<value>/hbase</value>
</property>

</configuration>

3 REPLIES

Super Collaborator

Hello @Suresh_lakavath 

 

Thanks for using Cloudera Community. Your concern is that HBase client requests fail even when only a single node is down. Please note that only the client requests that refer to "ip-10-145-250-154.us-west-2.compute.internal" (possibly along with other RegionServers) would fail. Any client request that refers only to RegionServers other than "ip-10-145-250-154.us-west-2.compute.internal" would succeed.
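To check which RegionServers the client is refusing to contact, the FailedServerException lines can be pulled out of the client log. The following is only a minimal Python sketch and not part of HBase: the function name `failed_servers` is made up for illustration, the sample text is copied from the log in the question, and the regular expression assumes the message wording stays exactly as shown there.

```python
import re
from collections import Counter

# Matches the message wording seen in the posted client log (an assumption:
# other HBase versions may phrase it differently).
FAILED_SERVER = re.compile(
    r"FailedServerException: This server is in the failed servers list: (\S+)"
)


def failed_servers(log_text):
    """Count how often each server is named in FailedServerException messages."""
    return Counter(FAILED_SERVER.findall(log_text))


# Three retry lines taken from the log in the question.
sample = "\n".join([
    "Sat Mar 05 10:09:50 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, "
    "org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: "
    "ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020",
    "Sat Mar 05 10:09:53 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, "
    "org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: "
    "ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020",
    "Sat Mar 05 10:09:58 UTC 2022, RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}, "
    "java.net.ConnectException: Connection refused",
])

counts = failed_servers(sample)
for server, hits in counts.items():
    print(server, hits)
```

If only one host shows up in the output, that supports the point above: the failures are limited to requests that touch that RegionServer.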

 

Your team's action plan should ideally be to identify why "ip-10-145-250-154.us-west-2.compute.internal" ended up in the failed servers list [1]. That requires the HMaster and RegionServer logs covering the timestamps of the FailedServerException messages.
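To line up the HMaster and RegionServer logs with the client error, it may help to turn the `globalStartTime` value from the client log (epoch milliseconds) into a UTC timestamp. This is a rough Python sketch under stated assumptions: the helper names `global_start_utc` and `search_window` are hypothetical, and the 5-minute window is an arbitrary starting point, not an HBase recommendation.

```python
import re
from datetime import datetime, timedelta, timezone


def global_start_utc(log_line):
    """Return the globalStartTime found in a client log line as a UTC datetime."""
    match = re.search(r"globalStartTime=(\d+)", log_line)
    if match is None:
        return None
    millis = int(match.group(1))
    # Integer arithmetic keeps the millisecond part exact (no float rounding).
    seconds, remainder = divmod(millis, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=remainder)


def search_window(moment, minutes=5):
    """Return a (start, end) range around the moment to search in server logs."""
    delta = timedelta(minutes=minutes)
    return moment - delta, moment + delta


# Fragment copied from the client log in the question.
line = "RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}"
start = global_start_utc(line)
window = search_window(start)
print("client call started:", start.isoformat())
print("search server logs from", window[0].isoformat(), "to", window[1].isoformat())
```

The converted value should agree with the "Sat Mar 05 10:09:50 UTC 2022" prefix on the first retry line, which is a quick sanity check that the right log entry was picked.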

 

Regards, Smarak

 

[1] org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: ip-10-145-250-154.us-west-2.compute.internal/10.145.250.154:60020
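One more detail worth checking: the client log reports `retries=3`, while the posted hbase-site.xml sets hbase.client.retries.number to 35. This may mean the client application is not reading this file, or overrides the value in its own code; that is an inference from the two excerpts, not something confirmed in this thread. The Python sketch below shows one way to compare the two. The function names `read_properties` and `logged_retry_settings` are hypothetical, and only the two relevant properties are copied from the posted file.

```python
import re
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> entries into a dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): (prop.findtext("value") or "")
        for prop in root.findall("property")
    }


def logged_retry_settings(log_line):
    """Extract the pause and retries values the client actually used, per its log."""
    match = re.search(r"pause=(\d+), retries=(\d+)", log_line)
    if match is None:
        return None
    return {"pause": int(match.group(1)), "retries": int(match.group(2))}


# Two properties copied from the hbase-site.xml in the question.
site_xml = """<configuration>
<property><name>hbase.client.pause</name><value>1000</value></property>
<property><name>hbase.client.retries.number</name><value>35</value></property>
</configuration>"""

# Fragment copied from the client log in the question.
log_line = "RpcRetryingCaller{globalStartTime=1646474990201, pause=1000, retries=3}"

props = read_properties(site_xml)
logged = logged_retry_settings(log_line)
mismatch = int(props["hbase.client.retries.number"]) != logged["retries"]
print("configured retries:", props["hbase.client.retries.number"])
print("logged retries:", logged["retries"])
print("mismatch:", mismatch)
```

If the values differ, a reasonable next step is to confirm which configuration the client process really loads before tuning anything on the cluster side.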

Super Collaborator

Hello @Suresh_lakavath 

 

Hope you are doing well. We wish to follow up with you on this Post. 

 

Regards, Smarak

Super Collaborator

Hello @Suresh_lakavath 

 

Since we haven't heard back from you concerning this post, we are marking it as closed for now. Feel free to update the post based on your team's observations from the action plan shared on 03/09.

 

Regards, Smarak
