Created 08-28-2018 11:43 AM
Hi,
I am joining two tables, and one of them is skewed. How can I handle this in Spark SQL? I am using Spark 2.2.1 on AWS EMR.
Please advise.
Created 08-28-2018 12:10 PM
Perhaps you could partition your data by a different column, one whose values are (hopefully) distributed more evenly.
Otherwise, you could build an artificial numeric column by salting, that is, adding a random number to the join key so that the rows of a heavily repeated key are spread across several partitions, and then partition by this column.
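To illustrate the salting idea, here is a minimal sketch in plain Python, with no Spark involved. It is only an illustration under assumptions: the table contents, `NUM_SALTS`, and the helper names are all made up for the example. In Spark the same pattern would be a random salt column on the skewed table, the other table replicated once per salt value, and a join on both the key and the salt.

```python
import random
from collections import Counter

NUM_SALTS = 4  # assumed number of salt buckets; tune to the degree of skew


def salt_skewed_side(rows, num_salts, rng):
    """Attach a random salt in [0, num_salts) to every row of the skewed table."""
    return [(key, rng.randrange(num_salts), value) for key, value in rows]


def replicate_other_side(rows, num_salts):
    """Copy every row of the non-skewed table once per salt value."""
    return [(key, salt, value) for key, value in rows for salt in range(num_salts)]


def hash_join(left, right):
    """Join on the composite (key, salt) and drop the salt from the output."""
    index = {}
    for key, salt, value in right:
        index.setdefault((key, salt), []).append(value)
    return [
        (key, left_value, right_value)
        for key, salt, left_value in left
        for right_value in index.get((key, salt), [])
    ]


# Hypothetical skewed table: key "a" holds 1000 of the 1010 rows.
skewed = [("a", i) for i in range(1000)] + [("b", i) for i in range(10)]
other = [("a", "x"), ("b", "y")]

rng = random.Random(0)  # fixed seed so the run is repeatable
salted = salt_skewed_side(skewed, NUM_SALTS, rng)
replicated = replicate_other_side(other, NUM_SALTS)
joined = hash_join(salted, replicated)

# Largest number of rows sharing one join key, before and after salting.
load_before = Counter(key for key, _ in skewed)
load_after = Counter((key, salt) for key, salt, _ in salted)
print(max(load_before.values()), max(load_after.values()))
```

The join result is the same as an unsalted join, but the hot key is split across `NUM_SALTS` composite keys. More salts spread the hot key more thinly, at the cost of replicating the other table that many times.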
HTH
Created 08-28-2018 12:22 PM
@Felix Albani Thank you.