Has anyone had a chance to look at this? We've really clamped down on our stats so we have them ready, but we still see poor performance. We have more queries we're testing where data volume is low and they still drag along, with our CPU and network nowhere near maxed out.
Are there settings we should be checking that control or limit concurrency (threads, processes, etc.)?
Yeah, it's not immediately obvious from the query profile where the time is going. As you pointed out, CPU time is not the issue here.
Have you tried watching query progress from the Impala web UI, or with "set live_summary=1" in impala-shell? That might give some more idea of where the query is spending all of the time.
So we've been looking at the visual profiles on the impala daemon, which we were loving even prior to this release... It's just been hard for us to sort out what's actionable. E.g., I've been babysitting this query, and I see a few minutes supposedly being spent in part 06. If I look at the profile in Cloudera Manager it shows part 10 as the thing taking the most time so far at 18 minutes, and that step is DataStreamSender.
We've done stress tests across our hosts; they can all push 10GigE fine. We've done parallel stress tests with iperf and they still hold up. Right now, as I debug this, we have around 200MB/sec in and 200MB/sec out across all hosts in our cluster, including the smaller hosts that don't run Impala... any two nodes with impala daemons can push much more than that between each other, so it really makes no sense that we're seeing such poor throughput at the DataStreamSender stage.
Load is a bit high on some of these workers (34 on a 24-core machine), but wio is around 0.1% to 0.5% and idle is > 80%. So perhaps there still is some waiting on network calls, but all I can think is that between 2.2 and 2.3 some aspect of how queries get processed made them so chatty that, despite crazy-low network utilization, they're still blocked on the network?
Are there other things we can do to debug what's happening in DataStreamSender? I'm going to try increasing log verbosity and maybe something will pop out, but further pointers would be great.
DataStreamSender (dst_id=10) (18.3m)
BytesSent: 49.9 MiB
NetworkThroughput(*): 10.6 KiB/s
OverallThroughput: 46.7 KiB/s
PeakMemoryUsage: 460.0 KiB
UncompressedRowBatchSize: 128.7 MiB
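As a sanity check, the profile numbers above are internally consistent: 49.9 MiB sent over 18.3 minutes works out to roughly the reported OverallThroughput, which supports the theory that the sender is blocked waiting rather than limited by the link. A quick back-of-the-envelope in Python, using the values from the profile snippet above:

```python
# Sanity check on the DataStreamSender (dst_id=10) profile counters.
bytes_sent_mib = 49.9   # BytesSent
wall_time_min = 18.3    # total time reported for the sender

kib_sent = bytes_sent_mib * 1024
seconds = wall_time_min * 60

overall_kib_per_s = kib_sent / seconds
print(f"OverallThroughput ~= {overall_kib_per_s:.1f} KiB/s")  # ~46.5 KiB/s

# A 10GigE link can move ~1.2 GiB/s, so 49.9 MiB should take well under a
# second on the wire. Nearly all of the 18.3 minutes is spent waiting
# (e.g. on the receiver or upstream operators), not actually sending.
```

This matches the reported 46.7 KiB/s OverallThroughput closely, so the question is really what the sender is blocked on, not how fast the network is.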
We have another query that does a non-equijoin of two small tables (8.8 MB and 6.8 MB) running for 1.5 hours.
Both tables have table stats computed. I think we see very poor performance whenever there's a broadcast in the query.
Here's the profile output: https://gist.github.com/birdychang/b619859c342e5ccc5f9c
For now we rolled back our cluster. We're going to try to set up a smaller dev cluster and repro this. If we can repro it, then we can do a before/after profile of the simple test case and give more details here.
Thanks, any info you can provide is good, we'd like to get to the bottom of this.
We took another look at the query profile and it looks like there is a system call being executed per row for this expression:
from_unixtime(unix_timestamp(now() - interval 365 days))
This would be happening in the scan, which slows the progress of the pipeline. In the profile it shows up under HDFS_SCAN_NODE (id=0) with ScannerThreadsTotalWallClockTime: 1376327192424 and ScannerThreadsVoluntaryContextSwitches: 429277.
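If the per-row-evaluation theory is right, one possible workaround (a sketch, not a confirmed fix) is to evaluate the constant cutoff once on the client and inline it as a literal, so the scanner compares against a plain value instead of re-evaluating the expression for every row. The table and column names below are made up for illustration:

```python
from datetime import datetime, timedelta

# Compute the equivalent of
#   from_unixtime(unix_timestamp(now() - interval 365 days))
# once, client-side, instead of once per scanned row.
cutoff = (datetime.now() - timedelta(days=365)).strftime("%Y-%m-%d %H:%M:%S")

# Hypothetical table/column names, just to show the shape of the rewrite.
query = f"SELECT * FROM events WHERE event_ts >= '{cutoff}'"
print(query)
```

Whether Impala constant-folds this expression itself may differ between releases, which could explain part of the 5.4 vs 5.5 gap; inlining the literal sidesteps the question entirely.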
If so, then why does it make a performance difference between 5.4.5 and 5.5.1?
It seems the user who reported this issue also ran the same query on CDH 5.4.5.
We did performance validation going from CDH 5.4 to CDH 5.5, so we'd expect that you will see similar or improved performance when you upgrade. Query performance has a huge number of variables, so it's impossible to guarantee that for every possible query on every possible data set, and we welcome any reports of your experiences.
We treat performance regressions as release blockers. E.g., I spent a bunch of time working on the Parquet scanner for CDH 5.5 so that it was just as fast (actually a little faster) with nested types support as before: https://issues.cloudera.org/browse/IMPALA-2444