question Re: kudu scan very slow in Support Questions

https://community.cloudera.com/t5/Support-Questions/kudu-scan-very-slow/m-p/85498#M11687

One thing that is clearly happening here is that Kudu is sending much more data back to Impala than is necessary. You specified LIMIT 7, but Kudu doesn't support server-side limits until CDH 6.1. For such a small query, a server-side limit might make things quite a bit faster.

Beyond that, I honestly don't see anything suspicious in the numbers from the Kudu side, other than that the scan took a long time given the amount of work involved. The trace shows nothing out of the ordinary, and the metrics are fine. There weren't even cache misses, so everything came out of the cache already decoded and decompressed.

From the Impala profile, the only suspicious-looking thing is that two round trips were required. That shouldn't have been the case with LIMIT 7, as the first batch should have had more than 7 records in it.

If you run the scan a couple of times in a row, does it get much faster? How does the time vary with the LIMIT amount?

wdberkeley, Fri, 25 Jan 2019 23:18:05 GMT
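
To answer the two questions at the end of the reply (does a repeated scan get faster, and how does the time vary with LIMIT), a small timing loop may help. This is only a sketch: `run_query` is a hypothetical callable that you supply, which should execute the SQL against Impala and fetch every row so the scan actually completes, and `my_table` is a placeholder table name.

```python
import time


def time_limits(run_query, table, limits=(1, 7, 100, 1000), repeats=3):
    """Time `SELECT * ... LIMIT n` for each n, several runs in a row.

    run_query is a caller-supplied (hypothetical) function that executes
    the SQL string and fetches all resulting rows.
    Returns {limit: [elapsed_seconds_run1, elapsed_seconds_run2, ...]}.
    """
    timings = {}
    for limit in limits:
        sql = f"SELECT * FROM {table} LIMIT {limit}"
        runs = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(sql)
            runs.append(time.perf_counter() - start)
        timings[limit] = runs
    return timings


if __name__ == "__main__":
    # Stand-in for a real client call; replace with your own function.
    def fake_run_query(sql):
        time.sleep(0.01)

    for limit, runs in time_limits(fake_run_query, "my_table").items():
        print(limit, [f"{t:.3f}s" for t in runs])
```

With a real connection, `run_query` could wrap something like a DB-API cursor's `execute` followed by `fetchall`. If the first run for each LIMIT is much slower than the later ones, that points toward warm-up or caching effects rather than the scan itself.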