Support Questions

Find answers, ask questions, and share your expertise

hbase warning response too slow

avatar
Contributor

What does the following mean

 

WARN November 10, 2015 7:42 PM RpcServer
(responseTooSlow): {"processingtimems":11063,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"10.125.122.203:41488","starttimems":1447206125517,"queuetimems":1,"class":"HRegionServer","responsesize":199,"method":"Multi"}
1 ACCEPTED SOLUTION

avatar
Mentor
To evaluate a responseTooSlow, look at the reported parameters:

"processingtimems":11063,
"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$
MultiRequest)",
"client":"10.125.122.203:41488",
"starttimems":1447206125517,
"queuetimems":1,
"class":"HRegionServer",
"responsesize":199,
"method":"Multi"

To summarise: The "Multi" type request from a client on 10.125.122.203 to
the region server in question (where you picked this log from) waited 1ms
(queue-time-ms) in the queue to get picked up for processing (which isn't
bad at all) but took 11063ms (processing-time-ms) or ~11s to complete being
processed and transmitted back. The transmitted response was 199 bytes
(response-size).

Your RS found such an overall time to be too high (more than a couple
seconds for ex.) so it decided to log a warning.

To evaluate why it took so long though, depends on knowledge of what
multi(…) (or other types) request was the client on 10.125.122.203
attempting to send, and did it carry a lot of rows, thereby requiring a lot
more time to process (a simple question). If the client's not at suspect,
then the RS needs to be investigated on why it took so long to process this
request, i.e. Did it suffer GC pauses? Did some operations within multi(…)
require to wait on region locks? Was there ongoing blockage due to flushing
that may have held some form of locks?

View solution in original post

2 REPLIES 2

avatar
Mentor
To evaluate a responseTooSlow, look at the reported parameters:

"processingtimems":11063,
"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$
MultiRequest)",
"client":"10.125.122.203:41488",
"starttimems":1447206125517,
"queuetimems":1,
"class":"HRegionServer",
"responsesize":199,
"method":"Multi"

To summarise: The "Multi" type request from a client on 10.125.122.203 to
the region server in question (where you picked this log from) waited 1ms
(queue-time-ms) in the queue to get picked up for processing (which isn't
bad at all) but took 11063ms (processing-time-ms) or ~11s to complete being
processed and transmitted back. The transmitted response was 199 bytes
(response-size).

Your RS found such an overall time to be too high (more than a couple
seconds for ex.) so it decided to log a warning.

To evaluate why it took so long though, depends on knowledge of what
multi(…) (or other types) request was the client on 10.125.122.203
attempting to send, and did it carry a lot of rows, thereby requiring a lot
more time to process (a simple question). If the client's not at suspect,
then the RS needs to be investigated on why it took so long to process this
request, i.e. Did it suffer GC pauses? Did some operations within multi(…)
require to wait on region locks? Was there ongoing blockage due to flushing
that may have held some form of locks?

avatar
Contributor

Thanks for the input 

I am suspecting that it may be the region settings I am using that prematurely splits regions contributing to the latency

Originally I set the regions at default to 10 gb, shrank it to 4 gb did a compaction and flush, the first writes where trying to reorg the regions 

which is stemming from my first problem 

http://community.cloudera.com/t5/Storage-Random-Access-HDFS/hbase-closed-upon-write/m-p/34001#U34001

I then increased the regions to 4.9 gb seems to be fine now

The reason I am staying under 5 gb, is so that I can transfer to S3 which has a file size limit