03-11-2018 04:52 PM
To avoid network traffic, I'd like to perform local joins on each node instead of exchanging the data and performing a join over the complete data afterwards. My query is basically a join over three tables on an ID attribute. The blocks are perfectly distributed, so that e.g. Table A - Block 0 and Table B - Block 0 are on the same node. These blocks contain all data rows with an ID in the range [0,1]. Table A - Block 1 and Table B - Block 1, with an ID range of [2,3], are on another node, and so on. So I want to perform a local join per node, because any data exchange would be unnecessary (except for the last step, when the final node receives the results from all the other nodes).
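To make clear what I mean by a local join, here is a minimal Python sketch of the idea. It is not how the query engine actually executes anything; the node names, the sample rows, and the helpers `hash_join` and `local_three_way_join` are all made up for illustration. It only shows that, when the blocks are co-located by ID range, joining per node and gathering the results at the end gives the same rows as joining over the complete data.

```python
from collections import defaultdict

def hash_join(left, right):
    """Inner-join two lists of tuples on their first field (the ID)."""
    index = defaultdict(list)
    for row in right:
        index[row[0]].append(row)
    # Keep the ID once and append the remaining columns of the right row.
    return [l + r[1:] for l in left for r in index.get(l[0], [])]

def local_three_way_join(blocks):
    """Join the A, B and C blocks stored on a single node."""
    return hash_join(hash_join(blocks["A"], blocks["B"]), blocks["C"])

# Hypothetical layout: each node holds the blocks for one ID range.
node_blocks = {
    "node0": {  # ID range [0,1]
        "A": [(0, "a0"), (1, "a1")],
        "B": [(0, "b0"), (1, "b1")],
        "C": [(0, "c0"), (1, "c1")],
    },
    "node1": {  # ID range [2,3]
        "A": [(2, "a2"), (3, "a3")],
        "B": [(2, "b2"), (3, "b3")],
        "C": [(3, "c3")],
    },
}

# Local joins: no rows leave their node until the final gather step.
local_result = []
for blocks in node_blocks.values():
    local_result.extend(local_three_way_join(blocks))

# Reference: ship all rows to one place first, then join.
all_rows = {
    name: [row for blocks in node_blocks.values() for row in blocks[name]]
    for name in ("A", "B", "C")
}
global_result = hash_join(hash_join(all_rows["A"], all_rows["B"]), all_rows["C"])

assert sorted(local_result) == sorted(global_result)
```

The equality only holds because every ID lives on exactly one node for all three tables; if a single table were split differently, the per-node joins would miss matches.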
At the moment the query plan looks as follows, even though the blocks are already perfectly distributed.
I would like to skip the exchange steps and perform the joins locally. Is this possible?