Created on 07-21-2017 05:36 AM - edited 09-16-2022 04:58 AM
Hi,
When I run a DELETE on a Kudu table, it deletes some rows, then returns an error; it continues deleting afterwards, but not all the rows are removed:
> select count(*) from test2p;
+----------+
| count(*) |
+----------+
| 50000000 |
+----------+
> delete from test2p;
WARNINGS: Kudu error(s) reported, first error: Timed out: Failed to write batch
of 3759 ops to tablet 871c8123905c4e529a233c18751f8154 after 1 attempt(s):
Failed to write to server: fb8ca29738c541hc80ed2da98a6e6499 (nod7.exp:7050):
Write RPC to X.X.X.X:7050 timed out after 179.998s (SENT)
Error in Kudu table 'impala::kudutest.test2p': Timed out: Failed to write
batch of 3759 ops to tablet 871c8123905c4e529a233c18751f8154 after 1 attempt(s)
: Failed to write to server: fb8ca29738c541hc80ed2da98a6e6499 (nod7.exp:7050):
Write RPC to X.X.X.X:7050 timed out after 179.998s (SENT) (1 of 3273
similar)
> select count(*) from test2p;
+----------+
| count(*) |
+----------+
| 38591543 |
+----------+
....
> select count(*) from test2p;
+----------+
| count(*) |
+----------+
| 35220774 |
+----------+
In the nod7.exp log file (/var/log/kudu/kudu-tserver.WARNING), I have this error message:
Metrics: {"negotiator.queue_time_us":211,"thread_start_us":193,"threads_started":1}
W0721 12:05:45.431758 28220 negotiation.cc:303] Failed RPC negotiation. Trace:
0721 12:05:45.427384 (+ 0us) reactor.cc:446] Submitting negotiation task for server connection from X.X.X.X:34727
0721 12:05:45.427608 (+ 224us) server_negotiation.cc:167] Beginning negotiation
0721 12:05:45.427609 (+ 1us) server_negotiation.cc:355] Waiting for connection header
0721 12:05:45.428640 (+ 1031us) server_negotiation.cc:363] Connection header received
0721 12:05:45.429483 (+ 843us) server_negotiation.cc:319] Received NEGOTIATE NegotiatePB request
0721 12:05:45.429484 (+ 1us) server_negotiation.cc:404] Received NEGOTIATE request from client
0721 12:05:45.429498 (+ 14us) server_negotiation.cc:331] Sending NEGOTIATE NegotiatePB response
0721 12:05:45.429539 (+ 41us) server_negotiation.cc:188] Negotiated authn=TOKEN
0721 12:05:45.431691 (+ 2152us) negotiation.cc:294] Negotiation complete: Network error: Server connection negotiation failed: server connection from X.X.X.X:34727: BlockingRecv error: Recv() got EOF from remote (error 108)
Thanks in advance.
Created 07-27-2017 11:33 AM
Can you try changing the encoding of your primary key int column to PLAIN_ENCODING instead of the default AUTO_ENCODING? I think that should resolve your problem (at the expense of some disk space).
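For illustration, a minimal sketch of setting the encoding in Impala's CREATE TABLE syntax for Kudu. The table and column names here are hypothetical, not taken from the original table definition (which isn't shown in the thread):

```sql
-- Hypothetical sketch: force PLAIN_ENCODING on the integer primary key
-- instead of the default AUTO_ENCODING.
CREATE TABLE test2p_plain (
  id INT ENCODING PLAIN_ENCODING,  -- uncompressed, more disk space
  val STRING,
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 8
STORED AS KUDU;
```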
Created 07-21-2017 08:26 AM
In Kudu, doing a delete like this is basically like inserting as many rows as you are deleting, so this might not be what you want to do.
In any case, we'd need a lot more logs from nod7.exp to understand what's going on. My guess is that Kudu doesn't have enough memory, sometimes CM can configure it to use 1GB which is below the safe minimum.
Created 07-21-2017 09:08 AM
First, thank you @J-D for the answer.
So what kind of delete can I use instead of:
delete from test2p;
What types of logs can I give you to get a clearer view of the problem's source?
As for memory, the nodes actually have:
memory_limit_hard_bytes = 5.5 GB
& thanks again.
Created 07-21-2017 09:16 AM
Dropping the table and recreating it would be a lot more efficient than deleting every row one by one after reading them (which is what "delete from" does).
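A sketch of that drop-and-recreate approach in Impala SQL. The schema below is hypothetical (the thread never shows the real test2p definition), but the pattern is the point: two metadata operations instead of tens of millions of per-row deletes:

```sql
-- Wipe all rows by replacing the table rather than deleting row by row.
DROP TABLE test2p;

-- Recreate with the same (here: assumed) schema.
CREATE TABLE test2p (
  id BIGINT,
  val STRING,
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 8
STORED AS KUDU;
```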
All the logs that pertain to tablet 871c8123905c4e529a233c18751f8154 for that tablet server could be useful. The full INFO file would be perfect if you can zip it and host it on a server somewhere.
Created 07-21-2017 09:30 AM
Yes, I understand that dropping the table would be perfect in this case, but the problem is that I'm also hitting this error message with a delete like:
delete from test2p2 where id < 300000000;
Okay, I'm collecting the logs for tablet 871c8123905c4e529a233c18751f8154 and will post them in a few minutes.
Created 07-21-2017 09:36 AM
Deletes like this should be fine. Even better, when you need to delete huge batches of rows, it's ideal to have range partitions set up so that you can just drop them (see this doc). But for millions of rows it shouldn't time out unless something is misconfigured.
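A hedged sketch of the range-partition approach in Impala SQL, matching the `id < 300000000` predicate above. The table name, column names, and partition boundaries are hypothetical:

```sql
-- Hypothetical table range-partitioned on id, 100M ids per partition.
CREATE TABLE test2p_ranged (
  id BIGINT,
  val STRING,
  PRIMARY KEY (id)
)
PARTITION BY RANGE (id) (
  PARTITION 0 <= VALUES < 100000000,
  PARTITION 100000000 <= VALUES < 200000000,
  PARTITION 200000000 <= VALUES < 300000000
)
STORED AS KUDU;

-- Dropping a partition discards its rows as a metadata operation,
-- with no per-row delete traffic to the tablet servers:
ALTER TABLE test2p_ranged DROP RANGE PARTITION 0 <= VALUES < 100000000;
```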
Created 07-21-2017 10:31 AM
Hi,
Here are the log files:
kudu-tserver.INFO:
link1: http://www41.zippyshare.com/v/xRs7t60g/file.html
link2: https://ufile.io/mwia8
kudu-tserver.WARNING:
link1: http://www41.zippyshare.com/v/iFSgcXO0/file.html
link2: https://ufile.io/z4bqf
& Thanks for your help J-D.
Created 07-25-2017 08:05 AM
After a discussion in the Kudu Slack channel, we found that this concerns a bug, and @JD Cryans filed it in Apache's JIRA to be fixed in the next version:
https://issues.apache.org/jira/browse/KUDU-2076
We hope a fix will land as soon as possible.