New Contributor
Posts: 5
Registered: ‎07-27-2018

Getting "Row error for primary key" error on Kudu upsert via StreamSets

I have a Kafka-to-Kudu pipeline in StreamSets that had been running for about two weeks without issue. Yesterday the pipeline started failing.

Here's a snippet of the pipeline log:

Error while running: com.streamsets.pipeline.api.StageException: KUDU_03 - Errors while interacting with Kudu: Row error for primary key=[-128, 0, 0, 0, 111, -92, -60, -24], tablet=null, server=null, status=Timed out: can not complete before timeout: Batch{operations=193, tablet="84c0c8c073e14c26b0a3da415a84dc53" [0x00000001, 0x00000002), ignoreAllDuplicateRows=false, rpc=KuduRpc(method=Write, tablet=84c0c8c073e14c26b0a3da415a84dc53, attempt=25, DeadlineTracker(timeout=10000, elapsed=9691), Traces: [0ms] sending RPC to server 053a1bbcc6b243b0a9c90f37b336fac1, [12ms] received from server 053a1bbcc6b243b0a9c90f37b336fac1 response Service unavailable: Service unavailable: Soft memory limit exceeded (at 99.05% of capacity). See https://kudu.apache.org/releases/1.6.... 

I'm also seeing log messages about removing servers from a tablet's cache, and "WebSocket queue is full, discarding 'status' message". I previewed the data in StreamSets and the PK field is populated. StreamSets is not giving me any information about the record that is failing; it just fails the entire pipeline.

At this point I'm not even sure where to look.
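
One place to start is the primary-key bytes in the error, which can be decoded to find the row they refer to. The sketch below assumes the table's key is a single INT64 column and that Kudu stores signed integer keys big-endian with the sign bit flipped; the function name is made up for illustration, and the result will be wrong for a composite or non-integer key. Since the reported status is a timeout on a batch of 193 operations, the decoded row is probably just one of the writes that timed out rather than a malformed record.

```python
def decode_int64_key(signed_bytes):
    """Decode a Kudu-encoded INT64 primary key printed as signed Java bytes.

    Assumption: single INT64 key column, big-endian, sign bit flipped.
    """
    if len(signed_bytes) != 8:
        raise ValueError("expected 8 bytes for an INT64 key")
    # Java prints bytes as signed values; map them back to 0..255.
    raw = bytes(b & 0xFF for b in signed_bytes)
    # Undo the sign-bit flip applied by the key encoding.
    flipped = int.from_bytes(raw, "big") ^ (1 << 63)
    # Reinterpret the 64-bit pattern as a signed two's-complement value.
    return flipped - (1 << 64) if flipped >= (1 << 63) else flipped


# Bytes copied from the "Row error for primary key=[...]" message.
key = decode_int64_key([-128, 0, 0, 0, 111, -92, -60, -24])
print(key)
```

If the key column really is an INT64, the printed value can be looked up in the source Kafka data or in the StreamSets preview.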

Posts: 1,754
Kudos: 371
Solutions: 279
Registered: ‎07-31-2013

Re: Getting "Row error for primary key" error on Kudu upsert via StreamSets

> Soft memory limit exceeded

This is classically caused by exhaustion of the memory granted to Kudu via the Kudu - Configuration - Kudu Tablet Server Hard Memory Limit property. What is it set to, and does the error recur after you raise it?
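
To put the 99.05% figure from the error in context: the soft limit is a percentage of the hard limit (the `--memory_limit_soft_percentage` flag, 80 by default), and above it the tablet server begins rejecting some writes. The sketch below assumes the chance of rejection rises roughly linearly from 0 at the soft limit to 1 at the hard limit. That is an approximation of Kudu's behaviour, not a documented guarantee, and the function name is illustrative.

```python
def write_rejection_probability(capacity_pct, soft_pct=80.0):
    """Approximate chance that a write is rejected for memory pressure.

    capacity_pct: memory in use as a percentage of the hard limit.
    soft_pct: soft limit as a percentage of the hard limit (assumed default 80).
    """
    if capacity_pct <= soft_pct:
        return 0.0
    if capacity_pct >= 100.0:
        return 1.0
    return (capacity_pct - soft_pct) / (100.0 - soft_pct)


# 99.05 is the capacity figure reported in the error message.
p = write_rejection_probability(99.05)
print(f"about {p:.0%} of writes rejected")
```

Under these assumptions nearly all write attempts are turned away at that memory level, which would be consistent with the client using up 25 attempts inside its 10-second timeout.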

> I'm also seeing things in the logs about removing servers from a tablet's cache and WebSocket queue is full, discarding 'status' message.

Could you share some of these log snippets so we can analyse them more specifically? They don't sound directly related to your issue, so having the full log lines would help ascertain if they are the cause behind the server-side rejection of the inserts.
Master
Posts: 326
Registered: ‎07-01-2015

Re: Getting "Row error for primary key" error on Kudu upsert via StreamSets

Regarding the soft limit being exceeded: I had the same issue. I suppose you are not running the latest CDH 5 version (5.15). In my case, the memory on the Kudu tablet servers was not released even after ingestion stopped and no workload was running against the Kudu cluster at all. I was told that newer versions of Kudu handle memory allocation better. You can find a detailed breakdown of memory consumption in the tablet server's web UI. My solution was to decrease the number of tablets and the number of tables in Kudu.
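
The same memory figures can also be read programmatically from the tablet server's `/metrics` endpoint (web UI, port 8050 by default), which returns JSON. This is a minimal sketch, assuming the output is a list of entities with `type` and `metrics` fields and that the server entity exposes a `generic_current_allocated_bytes` metric; the helper names are placeholders, and the field names should be checked against your Kudu version.

```python
import json
from urllib.request import urlopen


def fetch_metrics(host, port=8050):
    """Download and parse the tablet server's /metrics JSON."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=10) as resp:
        return json.load(resp)


def summarize(entities):
    """Return (allocated_bytes, tablet_count) from parsed /metrics output."""
    allocated = None
    tablets = 0
    for entity in entities:
        if entity.get("type") == "tablet":
            # One entity per tablet replica hosted on this server.
            tablets += 1
        elif entity.get("type") == "server":
            for metric in entity.get("metrics", []):
                if metric.get("name") == "generic_current_allocated_bytes":
                    allocated = metric.get("value")
    return allocated, tablets
```

Calling `summarize(fetch_metrics("your-tserver-host"))` on each tablet server, with the host name replaced by a real one, gives a quick way to compare heap usage against the number of tablets hosted before and after reducing them.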
New Contributor
Posts: 5
Registered: ‎07-27-2018

Re: Getting "Row error for primary key" error on Kudu upsert via StreamSets

Here is an example of that log entry. There are tons of them in my logs. I'm on StreamSets 3.4.2 and CDH 5.14. I would love to understand the root cause of this.

Removing server 053a1bbcc6b243b0a9c90f37b336fac1 from this tablet's cache 747423b5bf834fbb9a6508aae8eb1f63 | AsyncKuduClient | *admin | 0 | New I/O worker #965