Is it possible to repartition an existing RDD that is already partitioned?



Re: Is it possible to repartition an existing RDD that is already partitioned?

@Sudharsan Ganeshkumar

Yes, you can repartition an RDD that is already partitioned. Just call .repartition on it with the number of partitions you want.

Re: Is it possible to repartition an existing RDD that is already partitioned?

@Sandeep Nemuri Yes, I'm able to use repartition to get a different RDD. Is it possible to repartition the same RDD?

Re: Is it possible to repartition an existing RDD that is already partitioned?

@Sudharsan Ganeshkumar

RDDs are immutable, so repartition and coalesce always create a new RDD rather than changing the existing one. When you are reducing the number of partitions, prefer coalesce, since it can avoid a shuffle (which is an expensive operation). You can read more here:

https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-rdd-partitions.html

HTH
