We have a large customer table with 7 million records, and we are processing transaction data (about 500K messages per batch) arriving from a Kafka stream.
During processing, we need to join each batch of transactions with the customer data. The join currently takes around 10 seconds per batch, and the requirement is to bring it down to 5 seconds. Since the customer table is too large to fit in executor memory, we cannot use a broadcast join. Are there any other optimizations we can make?