Hello,
After 13 January, all of my StreamSets pipelines that store data stopped working with an error like the one below.
We collect ETW and IIS log data: first collect, then transform, and finally store. This last step is failing.
Can someone help?
KUDU_15 - Row key '(string id="3d755d2d-949e-4fba-9925-63f658b44773", int32 year=2020, int32 month=2, int32 day=21, string host="vipe-exch03", string type="etw", string environment="prod", string groupid="iis")' not found
-----------------------------------
com.streamsets.pipeline.api.base.OnRecordErrorException: KUDU_15 - Row key '(string id="3d755d2d-949e-4fba-9925-63f658b44773", int32 year=2020, int32 month=2, int32 day=21, string host="vipe-exch03", string type="etw", string environment="prod", string groupid="iis")' not found
	at com.streamsets.pipeline.stage.destination.kudu.KuduTarget.writeBatch(KuduTarget.java:458)
	at com.streamsets.pipeline.stage.destination.kudu.KuduTarget.write(KuduTarget.java:327)
	at com.streamsets.pipeline.configurablestage.DTarget.write(DTarget.java:34)
	at com.streamsets.datacollector.runner.StageRuntime$2.call(StageRuntime.java:253)
	at com.streamsets.datacollector.runner.StageRuntime$2.call(StageRuntime.java:237)
	at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:195)
	at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:264)
	at com.streamsets.datacollector.runner.StagePipe.process(StagePipe.java:226)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.processPipe(ProductionPipelineRunner.java:751)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.lambda$runSourceLessBatch$3(ProductionPipelineRunner.java:779)
	at com.streamsets.datacollector.runner.PipeRunner.executeBatch(PipeRunner.java:132)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.runSourceLessBatch(ProductionPipelineRunner.java:778)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.runPollSource(ProductionPipelineRunner.java:550)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.run(ProductionPipelineRunner.java:375)
	at com.streamsets.datacollector.runner.Pipeline.run(Pipeline.java:504)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:112)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:74)
	at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:754)
	at com.streamsets.datacollector.execution.AbstractRunner.lambda$scheduleForRetries$0(AbstractRunner.java:173)
	at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:227)
	at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
	at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:223)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at com.streamsets.datacollector.metrics.MetricSafeScheduledExecutorServi
Created 03-01-2020 04:44 PM
Just found what the problem was. The table was range-partitioned by month, and incoming data ran past the last of the pre-created monthly partitions. The fix was to create new partitions, after which data started being stored again.
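For anyone hitting the same KUDU_15 error, the fix above can be sketched as Impala DDL against the Kudu table. This is only an illustration: the table name (`iis_etw_events`) and the choice of a timestamp-string range column are assumptions, not taken from the original pipeline, so adapt the bounds to however your table is actually partitioned.

```sql
-- Hypothetical table name; inspect which monthly partitions already exist.
SHOW RANGE PARTITIONS iis_etw_events;

-- Add the missing partition for the next month so new rows have a
-- partition to land in (bounds assume a monthly range column).
ALTER TABLE iis_etw_events
  ADD RANGE PARTITION '2020-03-01' <= VALUES < '2020-04-01';
```

Pre-creating a few months ahead (or scheduling this DDL periodically) avoids the pipeline stopping again when the current partition runs out.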
Created 02-24-2020 08:37 AM
Are you a StreamSets customer? If so, you should open a support ticket.
Created 02-24-2020 08:40 AM
Looking at the error, it seems like you're trying to process an insert as an update. Check the logic of your pipeline. If it was working before, something must have changed.