Hello,
After 13 January, all of my StreamSets pipelines that store data stopped working with an error like the one below.
We collect ETW and IIS log data: first collect, then transform, and finally store. This last step is failing.
Can someone help?
KUDU_15 - Row key '(string id="3d755d2d-949e-4fba-9925-63f658b44773", int32 year=2020, int32 month=2, int32 day=21, string host="vipe-exch03", string type="etw", string environment="prod", string groupid="iis")' not found
-----------------------------------
com.streamsets.pipeline.api.base.OnRecordErrorException: KUDU_15 - Row key '(string id="3d755d2d-949e-4fba-9925-63f658b44773", int32 year=2020, int32 month=2, int32 day=21, string host="vipe-exch03", string type="etw", string environment="prod", string groupid="iis")' not found
	at com.streamsets.pipeline.stage.destination.kudu.KuduTarget.writeBatch(KuduTarget.java:458)
	at com.streamsets.pipeline.stage.destination.kudu.KuduTarget.write(KuduTarget.java:327)
	at com.streamsets.pipeline.configurablestage.DTarget.write(DTarget.java:34)
	at com.streamsets.datacollector.runner.StageRuntime$2.call(StageRuntime.java:253)
	at com.streamsets.datacollector.runner.StageRuntime$2.call(StageRuntime.java:237)
	at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:195)
	at com.streamsets.datacollector.runner.StageRuntime.execute(StageRuntime.java:264)
	at com.streamsets.datacollector.runner.StagePipe.process(StagePipe.java:226)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.processPipe(ProductionPipelineRunner.java:751)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.lambda$runSourceLessBatch$3(ProductionPipelineRunner.java:779)
	at com.streamsets.datacollector.runner.PipeRunner.executeBatch(PipeRunner.java:132)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.runSourceLessBatch(ProductionPipelineRunner.java:778)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.runPollSource(ProductionPipelineRunner.java:550)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunner.run(ProductionPipelineRunner.java:375)
	at com.streamsets.datacollector.runner.Pipeline.run(Pipeline.java:504)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:112)
	at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:74)
	at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:754)
	at com.streamsets.datacollector.execution.AbstractRunner.lambda$scheduleForRetries$0(AbstractRunner.java:173)
	at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:227)
	at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
	at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:223)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at com.streamsets.datacollector.metrics.MetricSafeScheduledExecutorServi
Created 03-01-2020 04:44 PM
Just found what the problem was. The table was range-partitioned by month, and incoming data ran past the last of the pre-created monthly partitions. The fix was to create new partitions, after which data started being stored again.
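For anyone hitting the same KUDU_15 error, the fix above can be sketched as Impala DDL against the Kudu table. This is only an illustration: the table name (`iis_etw_events`) and the choice of a timestamp-string range column are assumptions, not taken from the original pipeline, so adapt the bounds to however your table is actually partitioned.

```sql
-- Hypothetical table name; inspect which monthly partitions already exist.
SHOW RANGE PARTITIONS iis_etw_events;

-- Add the missing partition for the next month so new rows have a
-- partition to land in (bounds assume a monthly range column).
ALTER TABLE iis_etw_events
  ADD RANGE PARTITION '2020-03-01' <= VALUES < '2020-04-01';
```

Pre-creating a few months ahead (or scheduling this DDL periodically) avoids the pipeline stopping again when the current partition runs out.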
Created 02-24-2020 08:37 AM
Are you a StreamSets customer? If so, you should open a support ticket.
Created 02-24-2020 08:40 AM
Looking at the error, it seems like you're trying to process an insert as an update. Check the logic of your pipeline. If it was working before, something must have changed.