Support for external shuffle services
Labels: Apache Spark, Apache YARN
Created 05-24-2023 02:27 AM
Hi, does anyone know whether there are any restrictions on using external shuffle services on CDP?
For instance, are the following external shuffle services supported?
- Uber RSS
- Apache Uniffle
Created 05-24-2023 11:21 AM
At the moment, neither Uber RSS nor Apache Uniffle is supported in CDP.
Dynamic resource allocation requires an external shuffle service that runs on each worker node as an auxiliary service of the NodeManager. This service is started automatically; no further steps are needed. Setting spark.shuffle.service.enabled=true enables the external shuffle service, which preserves shuffle files written by executors so that the executors can be deallocated without losing work. It must be enabled if dynamic allocation is enabled.
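As a rough illustration of the rule above, here is a minimal Python sketch that checks a set of Spark properties before they are passed to spark-submit. The helper names (check_dynamic_allocation, build_submit_args) are hypothetical and not part of Spark or CDP; the only real identifiers are the two property names, spark.dynamicAllocation.enabled and spark.shuffle.service.enabled. The check only mirrors the requirement as stated in this reply.

```python
def is_true(conf, key):
    """Return True if the given Spark property is set to 'true' (case-insensitive)."""
    return str(conf.get(key, "false")).strip().lower() == "true"


def check_dynamic_allocation(conf):
    """Hypothetical helper: enforce the rule from the reply above.

    If dynamic allocation is enabled, the external shuffle service
    must be enabled as well.
    """
    dynamic = is_true(conf, "spark.dynamicAllocation.enabled")
    shuffle = is_true(conf, "spark.shuffle.service.enabled")
    if dynamic and not shuffle:
        raise ValueError(
            "spark.dynamicAllocation.enabled=true requires "
            "spark.shuffle.service.enabled=true"
        )


def build_submit_args(conf):
    """Hypothetical helper: render properties as spark-submit --conf arguments."""
    check_dynamic_allocation(conf)
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    conf = {
        "spark.dynamicAllocation.enabled": "true",
        "spark.shuffle.service.enabled": "true",
    }
    print(" ".join(["spark-submit"] + build_submit_args(conf)))
```

The resulting arguments can be placed in front of the application JAR or script on the spark-submit command line; the same two properties could equally be set in the cluster's Spark configuration.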
Created 05-24-2023 08:33 AM
@dbrys Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our Spark experts @Bharati and @jagadeesan who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres, Community Moderator
Created 05-29-2023 11:28 PM
Thanks for the information. We're running on YARN, so I guess the steps mentioned in https://spark.apache.org/docs/3.0.0-preview2/running-on-yarn.html#configuring-the-external-shuffle-s... must be executed to configure the external shuffle service on each worker node. Are there any plans to support other external shuffle services, such as Uber RSS or Apache Uniffle, in the future?
