I have a very simple query:
create table test2 as select * from mytable F where F.scandate_key between 20160316 and 20160318
The column scandate_key is the partition key.
The query above takes a very long time to run, even though the 3 partitions hold fewer than 4 million rows in total.
Here's what I noticed: 89% of the data was read remotely. Is this why it's slow?
Have you been able to solve your issue? Remote reads can indeed slow down query execution. What file format is your data in?
Yes, I resolved it. It turns out one of the columns had values of up to 60K characters. We truncated most of it, since it wasn't needed at all.
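For anyone who hits the same symptom, here is a rough sketch of how an oversized column could be spotted and trimmed. The thread does not say which column it was or what length it was cut to, so wide_col, test2_trimmed and the 255-character limit are all made-up placeholders. Only length(), substr(), max() and avg() are used.

```sql
-- wide_col is a hypothetical column name; substitute the string column you suspect.
select max(length(F.wide_col)) as max_len,
       avg(length(F.wide_col)) as avg_len
from mytable F
where F.scandate_key between 20160316 and 20160318;

-- Copy the same partitions but keep only the first 255 characters of the wide column.
-- 255 is an arbitrary limit chosen for illustration.
-- If the column isn't needed at all, leave it out of the select list instead.
create table test2_trimmed as
select F.scandate_key,
       substr(F.wide_col, 1, 255) as wide_col
from mytable F
where F.scandate_key between 20160316 and 20160318;
```

A max_len far above avg_len suggests a few very long values are inflating the amount of data the query has to read and write.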