Explorer
Posts: 8
Registered: 06-14-2017
Accepted Solution

Error when stressing the cluster

Hi,

We are stress-testing our Kudu cluster (inserting a large volume of data) and are getting timeout errors when writing to the tablets:

W0807 12:53:47.136150 31391 meta_cache.cc:207] Tablet d687c05ffe5e48d19fbfe2f71bd136f7: Replica 0cf3c1866a094ee0b2305bca770f5e70 (bigdata09dev:7050) has failed: Timed out: Write RPC to 192.168.10.124:7050 timed out after 9.989s (SENT)
W0807 12:53:47.136211 31391 batcher.cc:329] Timed out: Failed to write batch of 805 ops to tablet d687c05ffe5e48d19fbfe2f71bd136f7 after 1 attempt(s): Failed to write to server: 0cf3c1866a094ee0b2305bca770f5e70 (bigdata09dev:7050): Write RPC to 192.168.10.124:7050 timed out after 9.989s (SENT)

 

 

This is causing data loss. My question is: is the only way to avoid this data loss to handle the errors in the loader code and retry the insert ourselves? Or can the cluster be configured to retry the insert automatically until it succeeds?

 

Thank you very much and best regards

Cloudera Employee
Posts: 51
Registered: 09-28-2015

Re: Error when stressing the cluster

Hi,

The client itself has built-in retries: if you simply increase your
timeout, it will keep trying to complete the insert until the given time
has elapsed. In a scenario that is not latency-sensitive, I would
recommend increasing the timeout to a minute or two.

-Todd
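
To illustrate why a longer timeout gives the client more chances to succeed, here is a minimal sketch of a deadline-bounded retry loop. It is not Kudu's actual implementation: `RetryUntilDeadline`, its parameters, and the fixed backoff are hypothetical names and choices used only to show the idea.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper, NOT part of the Kudu client. It calls `attempt`
// repeatedly until it returns true or the time budget `timeout` is used up.
// Returns true on success, false if the deadline was reached first.
bool RetryUntilDeadline(
    const std::function<bool()>& attempt,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds backoff = std::chrono::milliseconds(10)) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    if (attempt()) {
      return true;  // the operation went through
    }
    // Give up if sleeping for another backoff would pass the deadline.
    if (std::chrono::steady_clock::now() + backoff >= deadline) {
      return false;
    }
    std::this_thread::sleep_for(backoff);
  }
}
```

With a budget of about 10 seconds, a briefly overloaded server may leave room for only one attempt, as in the log above ("after 1 attempt(s)"). A budget of a minute or two leaves room for several.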
Explorer
Posts: 8
Registered: 06-14-2017

Re: Error when stressing the cluster

Thanks a lot. And do you know how to change that timeout?

Cloudera Employee
Posts: 51
Registered: 09-28-2015

Re: Error when stressing the cluster

It looks like you're using the C++ client. Given that, you can set the
timeout with the KuduSession::SetTimeoutMillis() API:

https://kudu.apache.org/cpp-client-api/classkudu_1_1client_1_1KuduSession.html#a25b22362650d7120f59c...

-Todd
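
As a rough sketch of the suggestion above, the helper below sets a session's timeout from a duration. It assumes the session setter is spelled `SetTimeoutMillis(int)` and takes milliseconds; check the exact name against the linked API docs. `AllowRetriesFor` is a hypothetical helper name, written as a template so that it compiles without the Kudu headers.

```cpp
#include <chrono>

// Hypothetical helper: works with any session type that exposes
// SetTimeoutMillis(int). kudu::client::KuduSession is assumed to be one.
template <typename Session>
void AllowRetriesFor(Session& session, std::chrono::seconds budget) {
  // Convert the budget to milliseconds, the unit the setter is assumed to take.
  const auto ms =
      std::chrono::duration_cast<std::chrono::milliseconds>(budget);
  session.SetTimeoutMillis(static_cast<int>(ms.count()));
}
```

With a real session obtained from `KuduClient::NewSession()`, a call such as `AllowRetriesFor(*session, std::chrono::seconds(120));` before applying any writes would give the client roughly two minutes to retry each write.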