my cluster is cdh5.14 impala 2.11 kudu 1.6.0
impala on kudu,When union all is used to merge more than three select join queries, the third and later select must perform full table scanning. How can I solve the problem of full table scanning?
A full table scan is not performed when the select join statement is executed alone.
sql as follows:
SELECT
mb.unitno,
mb.materialcode,
mb.starttime,
mb.length,
ib.equipmentcode,
ib.defectclass,
ib.defectname,
ib.side
FROM
sc1.tba1 mb
LEFT OUTER JOIN
sc1.tbb1 ib
ON
mb.materialcode = ib.materialcode
where mb.starttime>'20240827'and mb.starttime<'20240828' )
UNION ALL
SELECT
md.unitno,
md.materialcode,
md.starttime,
md.length,
id.equipmentcode,
id.defectclass,
id.defectname,
id.side
FROM
sc1.tba2 md
LEFT OUTER JOIN
sc1.tbb2 id
ON
md.materialcode = id.materialcode
where md.starttime>'20240827'and md.starttime<'20240828' )
UNION ALL
SELECT
me.unitno,
me.materialcode,
me.starttime,
me.length,
ie.equipmentcode,
ie.defectclass,
ie.defectname,
ie.side
FROM
sc1.tba3 me
LEFT OUTER JOIN
sc1.tbb3 ie
ON
me.materialcode = ie.materialcode
where me.starttime>'20240827'and me.starttime<'20240828' );
The number of tbb1, tbb2, and tbb3 rows is 800 million,The number of tba1, tba2, and tbba3 rows is 10000. When executing the above sql, a full table scan is performed on sc1.tbb3.