04-12-2017 08:10 AM
Oh, and in case anyone else reads this and wants to know whether they're hitting a similar issue: you can tell that the sort will be slow because it's using a lot of memory (47.86GB). Sorting that much data takes a while.
04-12-2017 09:00 PM - edited 04-12-2017 09:10 PM
Column a.t_date is a string field, not a timestamp field. Both tables are stored in Parquet file format.
If we add more nodes to the cluster, so that more Impala daemons are running, can we expect the performance of this kind of query to improve?