Created on 09-06-2018 09:23 AM - edited 09-16-2022 06:40 AM
I have a Kafka-->Kudu pipeline that had been running for about two weeks without issue. Yesterday the StreamSets pipeline started failing.
Here's a snippet of the pipeline log:
Error while running: com.streamsets.pipeline.api.StageException: KUDU_03 - Errors while interacting with Kudu: Row error for primary key=[-128, 0, 0, 0, 111, -92, -60, -24], tablet=null, server=null, status=Timed out: can not complete before timeout: Batch{operations=193, tablet="84c0c8c073e14c26b0a3da415a84dc53" [0x00000001, 0x00000002), ignoreAllDuplicateRows=false, rpc=KuduRpc(method=Write, tablet=84c0c8c073e14c26b0a3da415a84dc53, attempt=25, DeadlineTracker(timeout=10000, elapsed=9691), Traces: [0ms] sending RPC to server 053a1bbcc6b243b0a9c90f37b336fac1, [12ms] received from server 053a1bbcc6b243b0a9c90f37b336fac1 response Service unavailable: Service unavailable: Soft memory limit exceeded (at 99.05% of capacity). See https://kudu.apache.org/releases/1.6....
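Reading the trace: the client kept retrying the write against that tablet (attempt=25) until the 10-second deadline (timeout=10000, elapsed=9691) expired, because the tablet server kept answering "Service unavailable" while over its soft memory limit. A minimal sketch of that retry-until-deadline behavior (names, exception types, and timings here are mine for illustration, not the actual Kudu client code):

```python
import time

class ServiceUnavailable(Exception):
    """Raised while the server is shedding load (e.g. over its soft memory limit)."""

def write_with_deadline(op, timeout_s=10.0, backoff_s=0.01):
    """Retry op() with exponential backoff until it succeeds or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    delay = backoff_s
    attempt = 0
    while True:
        attempt += 1
        try:
            return op()
        except ServiceUnavailable:
            if time.monotonic() + delay > deadline:
                # Mirrors the "can not complete before timeout" error in the log above.
                raise TimeoutError(f"can not complete before timeout (attempt={attempt})")
            time.sleep(delay)
            delay = min(delay * 2, 1.0)  # back off between tries, capped at 1 s
```

The point is that raising the client timeout only buys time; if the tablet server stays over its memory limit, every retry gets the same "Service unavailable" answer until the deadline trips.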
I'm also seeing log entries about removing servers from a tablet's cache, plus "WebSocket queue is full, discarding 'status' message". I've done a preview on the data in StreamSets and the PK field is populated. StreamSets itself isn't giving me any information about which record is failing; it just fails the entire pipeline.
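For anyone hitting the same "Soft memory limit exceeded (at 99.05% of capacity)" message: it points at the Kudu tablet server's memory budget, which is controlled by tserver flags. A sketch of the relevant settings (values here are illustrative, not recommendations -- tune for your hardware):

```
# kudu-tserver flags (illustrative values)
--memory_limit_hard_bytes=4294967296     # total process memory budget; 0 lets Kudu auto-size
--memory_limit_soft_percentage=80        # writes start being rejected above this % of the hard limit
```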
At this point I'm not even sure where to look.
Created 09-13-2018 09:40 AM
Here is an example of that log entry; there are tons of them in my logs. I'm on StreamSets 3.4.2 and CDH 5.14. I would love to understand what the root cause of this is.
Removing server 053a1bbcc6b243b0a9c90f37b336fac1 from this tablet's cache 747423b5bf834fbb9a6508aae8eb1f63 | AsyncKuduClient | *admin | 0 | New I/O worker #965 |