Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.

Skew issue in Spark SQL

New Member

Hi,

I am joining two tables, and one of them is skewed. How can I handle this in Spark SQL? I am using Spark 2.2.1 on AWS EMR.

Please assist with this.

1 ACCEPTED SOLUTION

@elango vaithiyanathan

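Perhaps you could partition your data another way, by a different column whose values are (hopefully) distributed evenly.

Otherwise, you could build an artificial (numeric) column by salting and partition by that column. Salting means attaching a random number from a small range to each row of the skewed table, so rows that share a heavily repeated key value are spread across several partitions instead of landing in one.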
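To make the salting idea concrete, here is a minimal sketch in plain Python rather than Spark, so it can be run anywhere. It is only an illustration under assumptions: the names `NUM_SALTS`, `salt_large_side`, `replicate_small_side`, `salted_join` and `largest_group`, as well as the sample rows, are hypothetical and not from the original question. The skewed table gets a random salt per row, the other table is copied once per salt value, and the join is done on the pair (key, salt).

```python
import random
from collections import Counter

NUM_SALTS = 8  # hypothetical salt range; a larger value spreads a hot key further


def salt_large_side(rows, num_salts=NUM_SALTS, seed=0):
    """Attach a random salt in [0, num_salts) to every row of the skewed table."""
    rng = random.Random(seed)
    return [(key, rng.randrange(num_salts), value) for key, value in rows]


def replicate_small_side(rows, num_salts=NUM_SALTS):
    """Copy every row of the other table once per salt value,
    so each salted row on the skewed side still finds its match."""
    return [(key, salt, value) for key, value in rows for salt in range(num_salts)]


def salted_join(large, small, num_salts=NUM_SALTS):
    """Equi-join on (key, salt); returns the same rows as a plain join on key."""
    lookup = {}
    for key, salt, value in replicate_small_side(small, num_salts):
        lookup.setdefault((key, salt), []).append(value)
    return [
        (key, left, right)
        for key, salt, left in salt_large_side(large, num_salts)
        for right in lookup.get((key, salt), [])
    ]


def largest_group(keys):
    """Size of the biggest group of identical keys (a stand-in for the biggest partition)."""
    return max(Counter(keys).values())


# Made-up sample data: one "hot" key carries almost every row.
large = [("hot", i) for i in range(1000)] + [("cold", 1000)]
small = [("hot", "A"), ("cold", "B")]

joined = salted_join(large, small)
unsalted_max = largest_group(key for key, _ in large)
salted_max = largest_group((key, salt) for key, salt, _ in salt_large_side(large))
print(len(joined), unsalted_max, salted_max)
```

In Spark SQL the same pattern would typically be written by adding a salt column to the skewed table (for example with `rand()`), exploding the other table over all salt values (for example with `explode()`), and joining on both the original key and the salt. The trade-off is that the non-skewed table grows by a factor of the salt range, so the range is best kept small.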

HTH

*** If you found that this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.

2 REPLIES

New Member

@Felix Albani Thank you.