Support Questions

Find answers, ask questions, and share your expertise

Select query on ExecuteSQL processor throws out ERROR duplicate filed on joining two tables

avatar
Rising Star

I have a situation, where I have to get the combined data from two tables  from Mysql database, ExecuteSql processor throwing out bellow error,  where as the same query is working fine from Mysql Workbench.

 

2020-04-22 14:55:06,597 ERROR [Timer-Driven Process Thread-8] o.a.nifi.processors.standard.ExecuteSQL ExecuteSQL[id=9d81c7ae-0171-1000-f638-2f1e60bfe48e] Unable to execute SQL select query select
t1.CODE_CLOCK_ID,
t1.COMPANY_CD,
t1.GROUP_SEGMENT_L1,
t1.GROUP_SEGMENT_CD,
t2.CODE_CLOCK_ID,
t2.COMPANY_CD,
t2.GROUP_SEGMENT_L1,
t2.GROUP_SEGMENT_CD
from
sheet26 t1
INNER JOIN sheet27 t2
on t1.CODE_CLOCK_ID = t2.CODE_CLOCK_ID and t1.COMPANY_CD = t2.COMPANY_CD LIMIT 0, 1000; due to org.apache.nifi.processor.exception.ProcessException: org.apache.avro.AvroRuntimeException: Duplicate field CODE_CLOCK_ID in record any.data.sheet26: CODE_CLOCK_ID type:UNION pos:4 and CODE_CLOCK_ID type:UNION pos:0.. No FlowFile to route to failure: org.apache.nifi.processor.exception.ProcessException: org.apache.avro.AvroRuntimeException: Duplicate field CODE_CLOCK_ID in record any.data.sheet26: CODE_CLOCK_ID type:UNION pos:4 and CODE_CLOCK_ID type:UNION pos:0.
org.apache.nifi.processor.exception.ProcessException: org.apache.avro.AvroRuntimeException: Duplicate field CODE_CLOCK_ID in record any.data.sheet26: CODE_CLOCK_ID type:UNION pos:4 and CODE_CLOCK_ID type:UNION pos:0.
at org.apache.nifi.processors.standard.AbstractExecuteSQL.lambda$onTrigger$1(AbstractExecuteSQL.java:301)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2746)
at org.apache.nifi.processors.standard.AbstractExecuteSQL.onTrigger(AbstractExecuteSQL.java:297)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1176)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:830)
Caused by: org.apache.avro.AvroRuntimeException: Duplicate field CODE_CLOCK_ID in record any.data.sheet26: CODE_CLOCK_ID type:UNION pos:4 and CODE_CLOCK_ID type:UNION pos:0.
at org.apache.avro.Schema$RecordSchema.setFields(Schema.java:651)
at org.apache.avro.SchemaBuilder$FieldAssembler.endRecord(SchemaBuilder.java:2013)
at org.apache.nifi.util.db.JdbcCommon.createSchema(JdbcCommon.java:636)
at org.apache.nifi.util.db.JdbcCommon.convertToAvroStream(JdbcCommon.java:239)
at org.apache.nifi.processors.standard.sql.DefaultAvroSqlWriter.writeResultSet(DefaultAvroSqlWriter.java:49)
at org.apache.nifi.processors.standard.AbstractExecuteSQL.lambda$onTrigger$1(AbstractExecuteSQL.java:299)
... 13 common frames omitted

 

what could be issue and what I have to change in query to get the record set from database?

I am using Nifi 1.11.4.

 

This issue happens even excluding INNER or even  replacing INNER JOIN with "where" in the select query.

2 ACCEPTED SOLUTIONS

avatar
Expert Contributor

maybe you should set for all or just for t1/t2.. fields an Alias like

t1.CODE_CLOCK_ID as CODE_CLOCK_ID_s26,
t1.COMPANY_CD as COMPANY_CD_s26,
t1.GROUP_SEGMENT_L1 as GROUP_SEGMENT_L1_s26,
t1.GROUP_SEGMENT_CD as GROUP_SEGMENT_CD_s26,
t2.CODE_CLOCK_ID asCODE_CLOCK_ID_s27,
t2.COMPANY_CD as COMPANY_CD_s27,
t2.GROUP_SEGMENT_L1 as GROUP_SEGMENT_L1_s27,
t2.GROUP_SEGMENT_CD as GROUP_SEGMENT_CD_s27

 

Greets

 

View solution in original post

avatar
Rising Star

Hi, Faerballert,

Really it did work, thank you very much.

View solution in original post

3 REPLIES 3

avatar
Expert Contributor

maybe you should set for all or just for t1/t2.. fields an Alias like

t1.CODE_CLOCK_ID as CODE_CLOCK_ID_s26,
t1.COMPANY_CD as COMPANY_CD_s26,
t1.GROUP_SEGMENT_L1 as GROUP_SEGMENT_L1_s26,
t1.GROUP_SEGMENT_CD as GROUP_SEGMENT_CD_s26,
t2.CODE_CLOCK_ID asCODE_CLOCK_ID_s27,
t2.COMPANY_CD as COMPANY_CD_s27,
t2.GROUP_SEGMENT_L1 as GROUP_SEGMENT_L1_s27,
t2.GROUP_SEGMENT_CD as GROUP_SEGMENT_CD_s27

 

Greets

 

avatar
Rising Star

Hi, Faerballert,

Really it did work, thank you very much.

avatar
Contributor

Hi

I want to extract a CSV file from SQL query

one of the columns should be present in the CSV twice, diffirent position.

I had the same error

is there a workaround ?

Thank you