
Support for external shuffle services

New Contributor

Hi, does anyone know whether there are any restrictions on using external shuffle services on CDP?

For instance, are the following external shuffle services supported:

- Uber RSS

- Apache Uniffle

1 ACCEPTED SOLUTION

Rising Star

At the moment, neither Uber RSS nor Apache Uniffle is supported in CDP.

Dynamic resource allocation requires an external shuffle service that runs on each worker node as an auxiliary service of the NodeManager. This service is started automatically; no further steps are needed. Setting spark.shuffle.service.enabled=true enables the external shuffle service, which preserves shuffle files written by executors so that the executors can be deallocated without losing work. It must be enabled whenever dynamic allocation is enabled.
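To make the relationship between these settings concrete, here is a minimal sketch of the relevant properties as they might appear in spark-defaults.conf. The executor bounds are illustrative values, not recommendations:

```properties
# External shuffle service must be on for dynamic allocation to work
spark.shuffle.service.enabled          true

# Dynamic allocation itself
spark.dynamicAllocation.enabled        true

# Illustrative bounds; tune for your workload
spark.dynamicAllocation.minExecutors   0
spark.dynamicAllocation.maxExecutors   20
```

The same properties can equally be passed per job via `--conf` flags on spark-submit.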


3 REPLIES

Community Manager

@dbrys Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our Spark experts @Bharati and @jagadeesan, who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Community Moderator




New Contributor

Thanks for the information. We're running on YARN, so I guess the steps mentioned in https://spark.apache.org/docs/3.0.0-preview2/running-on-yarn.html#configuring-the-external-shuffle-s... must be executed to configure the external shuffle service on each worker node. Are there any plans to support other external shuffle services, like Uber RSS or Apache Uniffle, in the future?
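For reference, the steps in the linked Spark-on-YARN documentation boil down to registering Spark's shuffle service as a NodeManager auxiliary service. A sketch of the yarn-site.xml settings, assuming the standard YarnShuffleService class shipped with Spark:

```xml
<!-- Register spark_shuffle alongside any existing aux services -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<!-- Point the aux service at Spark's YARN shuffle service implementation -->
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

The spark-<version>-yarn-shuffle.jar must also be on the NodeManager classpath, and NodeManagers must be restarted for the change to take effect. On CDP, per the accepted answer, this is handled automatically.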