
CopyTable operation times out


I am trying to copy a table from the source cluster to the destination cluster, restricted to a particular timestamp, but the MapReduce job fails with a timeout exception. I see this error:

```
Error: org.apache.hadoop.hbase.client.ScannerTimeoutException: 75871ms passed since the last invocation, timeout is currently set to 60000
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(
    at
    at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(
    at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(
    at$Context.nextKeyValue(
    at
    at org.apache.hadoop.mapred.MapTask.runNewMapper(
    at
    at org.apache.hadoop.mapred.YarnChild$
    at Method)
    at
    at
    at org.apache.hadoop.mapred.YarnChild.main(
Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: 12213230, already closed?
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(
    at
    at
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(
    at org.apache.hadoop.hbase.ipc.RpcExecutor$
    at
```


@prarthana basgod

As the official HBase book states:

You may need to find a sweet spot between a low number of RPCs and the memory used on the client and server. Setting the scanner caching higher will improve scanning performance most of the time, but setting it too high can have adverse effects as well: each call to next() will take longer as more data is fetched and needs to be transported to the client, and once you exceed the maximum heap the client process has available it may terminate with an OutOfMemoryException. When the time taken to transfer the rows to the client, or to process the data on the client, exceeds the configured scanner lease threshold, you will end up receiving a lease expired error, in the form of a ScannerTimeoutException being thrown.

So rather than working around the exception with the above configuration, it would be better to lower the scanner caching on the map side, so that your mappers can process each fetched batch of rows within the configured scanner lease period.
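As a rough illustration of how one might pick a lower caching value, here is a minimal sketch in plain Java (no HBase dependency). It assumes you have measured roughly how long a mapper spends per row; the class name `ScannerCachingEstimate`, the method `maxSafeCaching`, the 150 ms per-row figure, and the 0.5 safety factor are all hypothetical and only there for illustration. The 60000 ms timeout is the value reported in the error above.

```java
public final class ScannerCachingEstimate {

    /**
     * Estimates the largest number of rows per scanner fetch such that
     * processing one full batch stays within a fraction of the scanner timeout.
     *
     * @param timeoutMs    scanner timeout in milliseconds (60000 in the error above)
     * @param perRowMs     measured average processing time per row (an assumption you supply)
     * @param safetyFactor fraction of the timeout to use as a budget, in (0, 1]
     */
    public static int maxSafeCaching(long timeoutMs, double perRowMs, double safetyFactor) {
        if (timeoutMs <= 0 || perRowMs <= 0 || safetyFactor <= 0 || safetyFactor > 1) {
            throw new IllegalArgumentException("timeout and per-row time must be positive, safety factor in (0, 1]");
        }
        double budgetMs = timeoutMs * safetyFactor;
        long rows = (long) Math.floor(budgetMs / perRowMs);
        // Never go below one row per fetch, and stay within the int range.
        return (int) Math.max(1L, Math.min(rows, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        long timeoutMs = 60000;     // taken from the error message
        double perRowMs = 150.0;    // hypothetical measurement, replace with your own
        double safetyFactor = 0.5;  // hypothetical headroom: use half of the timeout
        System.out.println("scanner caching <= " + maxSafeCaching(timeoutMs, perRowMs, safetyFactor));
    }
}
```

The resulting number would then be applied as the scanner caching, for example through `Scan.setCaching(int)` in a custom job or the `hbase.client.scanner.caching` property, which CopyTable should accept as a `-D` option on the command line. Treat the estimate as a starting point and adjust it based on how the job actually behaves.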
