Skew issue in spark sql


Hi,

I am joining two tables in Spark SQL, and one of them is skewed. How can I handle the skew? I am using Spark 2.2.1 on AWS EMR.

Any help would be appreciated.

1 ACCEPTED SOLUTION


@elango vaithiyanathan

You could partition your data by a different column, one whose values are (hopefully) distributed evenly.

Otherwise, you could build an artificial numeric column by salting, and partition by that column.
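To make the salting idea concrete, here is a minimal sketch in plain Python rather than Spark, so it runs without a cluster. It only simulates the shape of a salted join: the data, the `NUM_SALTS` value, and all variable names are made up for illustration and are not taken from the original question.

```python
import random
from collections import Counter

NUM_SALTS = 8  # hypothetical salt range; choose it based on how skewed the hot key is

# Skewed (large) side: the key "hot" dominates. Small side: one row per key.
large = [("hot", i) for i in range(10_000)] + [(k, i) for k in "abcde" for i in range(100)]
small = [("hot", "H")] + [(k, k.upper()) for k in "abcde"]

rng = random.Random(0)  # seeded so the run is repeatable

# 1. Salt the skewed side: append a random salt to each join key.
salted_large = [((key, rng.randrange(NUM_SALTS)), value) for key, value in large]

# 2. Replicate the small side once per salt value so every salted key finds a match.
salted_small = {(key, salt): value for key, value in small for salt in range(NUM_SALTS)}

# 3. Join on (key, salt), then drop the salt from the result.
joined = [
    (key, left_value, salted_small[(key, salt)])
    for (key, salt), left_value in salted_large
    if (key, salt) in salted_small
]

# Reference: the same inner join without salting.
small_lookup = dict(small)
plain = [(key, value, small_lookup[key]) for key, value in large if key in small_lookup]

# Rows that share a join key are processed together, so the largest
# key group approximates the heaviest task in a shuffle join.
largest_plain_group = max(Counter(key for key, _ in large).values())
largest_salted_group = max(Counter(key for key, _ in salted_large).values())

print("same join result:", joined == plain)
print("largest group without salt:", largest_plain_group)
print("largest group with salt:", largest_salted_group)
```

In Spark SQL the same pattern would roughly be: add a salt column to the skewed table (for example with `floor(rand() * N)`), replicate the other table once per salt value (for example by exploding an array of the salt values), and join on both the original key and the salt. The cost is that the smaller table grows N times, so this is a trade-off rather than a free fix.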

HTH

*** If you found this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.


2 REPLIES



@Felix Albani Thank you.