Hi,
I have two tables with 4 million and 200 million records respectively, both in ORC format.
Assuming the smaller table does not fit in memory, is a sort-merge-bucket (SMB) join the best way to join and match these tables?
If new partitions are added to the bigger table every day, what optimizations can we apply?
Thanks in advance.