Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Correlated subquery gives wrong result back

Correlated subquery gives wrong result back

Explorer

Hello,

 

I have got this query.

 

select myTable1.ILDOC, myTable1.PRDOC 
from myTable as myTable1
where myTable1.ildoc > isnull
(
   (
    SELECT 
     MAX(myTable2.PRDOC)
    FROM 
     myTable AS myTable2 
    WHERE 
     myTable2.PRMATC = myTable1.PRMATC 
     AND myTable2.PRDOCO = myTable1.PRDOCO 
     AND myTable2.PRDCTO = myTable1.PRDCTO 
     AND myTable2.PRKCOO = myTable1.PRKCOO 
     AND myTable2.PRSFXO = myTable1.PRSFXO 
     AND myTable2.PRLNID = myTable1.PRLNID 
     AND myTable2.PRNLIN = myTable1.PRNLIN 
     AND myTable2.PRDOC < myTable1.PRDOC 
    ),
   0.0
)
order by myTable1.PRDOC, myTable1.ILDOC

 

 

And I've got this data set for it

ildoc,prdoc,prmatc,prdoco,prdcto,prkcoo,prsfxo,prlnid,prnlin
"3,089","4,000",4,"4,501",OP,00001,000,"1,000",1
"3,089","4,001",1,"4,501",OP,00001,000,"1,000",1
"3,089","4,000",4,"4,501",OP,00001,000,"2,000",1
"3,089","4,001",1,"4,501",OP,00001,000,"2,000",1
"3,089","4,000",4,"4,501",OP,00001,000,"3,000",1
"3,089","4,001",1,"4,501",OP,00001,000,"3,000",1
"3,089","4,000",4,"4,501",OP,00001,000,"4,000",1
"3,089","4,001",1,"4,501",OP,00001,000,"4,000",1
"3,089","4,000",4,"4,501",OP,00001,000,"5,000",1
"3,089","4,001",1,"4,501",OP,00001,000,"5,000",1
"3,089","4,000",4,"4,501",OP,00001,000,"7,000",1
"3,089","4,001",1,"4,501",OP,00001,000,"7,000",1
"3,089","4,000",4,"4,501",OP,00001,000,"8,000",1
"3,089","4,001",1,"4,501",OP,00001,000,"8,000",1
"4,011","4,013",1,"4,410",OP,00001,000,"1,000",1
"4,033","7,601",2,"7,601",OP,07600,000,"3,000",1
"4,083","4,084",1,"4,735",OP,00001,000,"1,000",1
"4,103","4,272",4,"4,746",OP,00200,000,"1,000",1
"4,103","4,270",4,"4,746",OP,00200,000,"1,000",1
"4,103","4,184",4,"4,746",OP,00200,000,"1,000",1
"4,103","4,274",1,"4,746",OP,00200,000,"1,000",1
"4,105","4,221",4,"4,747",OP,00200,000,"1,000",1
"4,105","4,276",1,"4,747",OP,00200,000,"1,000",1
"4,108","4,259",4,"4,748",OP,00200,000,"1,000",1
"4,108","4,308",1,"4,748",OP,00200,000,"1,000",1
"4,107","4,259",4,"4,748",OP,00200,000,"1,000",1
"4,107","4,308",1,"4,748",OP,00200,000,"1,000",1
"4,110","4,259",4,"4,748",OP,00200,000,"1,000",1
"4,110","4,308",1,"4,748",OP,00200,000,"1,000",1
"4,112","4,234",4,"4,749",OP,00200,000,"1,000",1
"4,112","4,295",1,"4,749",OP,00200,000,"1,000",1
"4,113","4,234",4,"4,749",OP,00200,000,"2,000",1
"4,113","4,296",1,"4,749",OP,00200,000,"2,000",1
"4,115","4,236",4,"4,750",OP,00200,000,"1,000",1
"4,115","4,298",1,"4,750",OP,00200,000,"1,000",1
"4,117","4,238",4,"4,751",OP,00200,000,"1,000",1
"4,117","4,300",1,"4,751",OP,00200,000,"1,000",1
"4,119","4,224",4,"4,752",OP,00200,000,"1,000",1
"4,119","4,227",1,"4,752",OP,00200,000,"1,000",1
"4,120","4,224",4,"4,752",OP,00200,000,"1,000",1
"4,120","4,227",1,"4,752",OP,00200,000,"1,000",1
"4,127","4,229",4,"4,757",OP,00200,000,"1,000",1
"4,127","4,232",1,"4,757",OP,00200,000,"1,000",1
"4,128","4,229",4,"4,757",OP,00200,000,"1,000",1
"4,128","4,232",1,"4,757",OP,00200,000,"1,000",1
"4,130","4,242",4,"4,758",OP,00200,000,"1,000",1
"4,130","4,304",1,"4,758",OP,00200,000,"1,000",1
"4,137","4,244",4,"4,762",OP,00200,000,"1,000",1
"4,137","4,306",1,"4,762",OP,00200,000,"1,000",1
"4,145","4,199",4,"4,767",OP,00200,000,"1,000",1
"4,145","4,283",1,"4,767",OP,00200,000,"1,000",1
"4,147","4,248",4,"4,768",OP,00200,000,"1,000",1
"4,147","4,310",1,"4,768",OP,00200,000,"1,000",1
"4,149","4,250",4,"4,769",OP,00200,000,"1,000",1
"4,149","4,312",1,"4,769",OP,00200,000,"1,000",1
"4,152","4,252",4,"4,771",OP,00200,000,"1,000",1
"4,152","4,314",1,"4,771",OP,00200,000,"1,000",1
"4,157","4,254",4,"4,773",OP,00200,000,"1,000",1
"4,157","4,317",1,"4,773",OP,00200,000,"1,000",1
"4,158","4,254",4,"4,773",OP,00200,000,"2,000",1
"4,158","4,318",1,"4,773",OP,00200,000,"2,000",1
"4,160","4,256",4,"4,774",OP,00200,000,"1,000",1
"4,160","4,320",1,"4,774",OP,00200,000,"1,000",1
"4,162","4,211",4,"4,775",OP,00200,000,"1,000",1
"4,162","4,288",1,"4,775",OP,00200,000,"1,000",1
"4,163","4,211",4,"4,775",OP,00200,000,"1,000",1
"4,163","4,288",1,"4,775",OP,00200,000,"1,000",1
"4,165","4,267",4,"4,776",OP,00200,000,"1,000",1
"4,165","4,293",1,"4,776",OP,00200,000,"1,000",1
"4,169","4,265",4,"4,770",OP,00200,000,"1,000",1
"4,169","4,291",1,"4,770",OP,00200,000,"1,000",1
"4,269","4,272",4,"4,746",OP,00200,000,"1,000",1
"4,269","4,270",4,"4,746",OP,00200,000,"1,000",1
"4,269","4,274",1,"4,746",OP,00200,000,"1,000",1
"4,271","4,272",4,"4,746",OP,00200,000,"1,000",1
"4,271","4,274",1,"4,746",OP,00200,000,"1,000",1

 

When I run this query against this data on Impala, I get 3 records back. Running this on other databases I get 73 record back. I tried hard to pinpoint the problem to something smaller than this, but I couldnt. Appologies for the size.

 

I assume the query can be rewritten, but thats not an option for me as it is generated by a larger system, which I cannot change. Is this a known issue, is it even fixed?

 

I am using Impala Version: impalad version 2.5.0-cdh5.7.1 RELEASE (build 27a4325c18c2a01c7a8097681a0eccf6d4335ea1)

 

Any help much appreciated.

5 REPLIES 5

Re: Correlated subquery gives wrong result back

Explorer

Bumping up the issue in hope someone can help me on this one. Anyone any idea?

Re: Correlated subquery gives wrong result back

Explorer

I would like to qualify what is going wrong here.

 

If I run the query without the condition that contains the correlated subquery, it returns 76 records.

If I run the query with that condition I get 3 records.

If I negate that condition I also get 3 records.

 

This is inconsistent. In SQL terms the number of records without the condition should equal to the number of records with the condition plus the number of records with the negated condition. In this case, I get 76 != 3 + 3.

 

The wrong query is the one that is specified above, as it should return 73 records.

Highlighted

Re: Correlated subquery gives wrong result back

Contributor
This looks like a real bug to me. I'd suggest you file it at issues.cloudera.org.

Re: Correlated subquery gives wrong result back

Master Collaborator

Thanks for the report and my apologies for the late response. This is a new bug.

 

I've filed https://issues.cloudera.org/browse/IMPALA-4373 to track progress on it.

 

I can't think of a workaround at the moment, but will post one on the JIRA if I can come up with one (but it will certainly involve changing the SQL).

Re: Correlated subquery gives wrong result back

Explorer

Thank you, much appreciated.

 

I will check regulary to monitor the progress.