Member since: 05-31-2017
Posts: 37
Kudos Received: 9
Solutions: 6
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 965 | 08-25-2023 10:48 AM |
| | 1461 | 05-24-2023 11:21 AM |
| | 3866 | 05-09-2022 02:54 PM |
| | 3059 | 04-28-2022 11:27 AM |
| | 1474 | 04-19-2022 11:03 AM |
10-26-2023
01:53 PM
CDP 7.1.7 supports Spark 3.2.3, not Spark 3.3. See https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/cds-3/topics/spark-3-requirements.html
08-25-2023
10:48 AM
Spark 3.3 can be installed on CDP 7.1.8 and higher. The documents below cover the prerequisites and the installation steps:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/cds-3/topics/spark-3-requirements.html
https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/cds-3/topics/spark-install-spark-3-parcel.html
05-24-2023
11:21 AM
1 Kudo
At the moment, neither Uber RSS (Remote Shuffle Service) nor Apache Uniffle is supported in CDP. Dynamic resource allocation requires an external shuffle service that runs on each worker node as an auxiliary service of the NodeManager. This service is started automatically; no further steps are needed. Setting spark.shuffle.service.enabled=true enables the external shuffle service, which preserves shuffle files written by executors so that the executors can be deallocated without losing work. It must be enabled if dynamic allocation is enabled.
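To make the dependency between the two settings concrete, here is a minimal Python sketch that checks a set of Spark properties and renders them as spark-submit arguments. The helper names (check_dynamic_allocation, to_spark_submit_args) are hypothetical and only for illustration; spark.dynamicAllocation.enabled and spark.shuffle.service.enabled are the standard Spark property names.

```python
def check_dynamic_allocation(conf):
    """Return a list of problems; the list is empty when the settings agree."""
    problems = []
    dynamic = conf.get("spark.dynamicAllocation.enabled", "false").lower() == "true"
    shuffle = conf.get("spark.shuffle.service.enabled", "false").lower() == "true"
    if dynamic and not shuffle:
        problems.append(
            "dynamic allocation is enabled but "
            "spark.shuffle.service.enabled is not true"
        )
    return problems


def to_spark_submit_args(conf):
    """Render a property dict as a flat list of --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical job settings: both switches on, as described above.
conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
}

if __name__ == "__main__":
    print(check_dynamic_allocation(conf))
    print(" ".join(to_spark_submit_args(conf)))
```

The check only covers the rule stated in this reply (dynamic allocation needs the external shuffle service); it does not inspect the NodeManager side.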
10-19-2022
10:48 AM
1 Kudo
As far as I understand, it should not matter. You can also choose the rolling restart option, so that CM decides the sequence of the broker restarts.
05-09-2022
02:54 PM
@bluespring You should not delete offline or online partitions, as that may cause data loss or under-replicated partitions. Instead, you can reassign the partitions to new hosts by following the document below: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/kafka-managing/topics/kafka-manage-cli-reassign-overview.html
04-28-2022
01:55 PM
@clouderaskme The latest CDP release, 7.1.7, ships with Spark 2.4 as its default Spark version: https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/runtime-release-notes/topics/rt-pvc-runtime-component-versions.html
Spark 2.4 supports Python 2.7 and 3.4-3.7: https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/release-guide/topics/cdpdc-os-requirements.html
04-28-2022
11:27 AM
@clouderaskme Please review the documents below, which provide details on the requirements for Spark 3.2, 3.1, and 3.0:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/cds-3/topics/spark-3-requirements.html
https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/cds-3/topics/spark-spark-3-requirements.html
https://docs.cloudera.com/cdp-private-cloud-base/7.1.4/cds-3/topics/spark-spark-3-requirements.html
Cloudera Distributed Spark 3.2 requires Python 3.6+ and CDP 7.1.7 or higher.
Cloudera Distributed Spark 3.1 requires Python 3.6+ and CDP 7.1.7 or higher.
Cloudera Distributed Spark 3.0 requires Python 3.4 or higher and CDP 7.1.3, 7.1.4, or 7.1.5.
04-19-2022
11:03 AM
1 Kudo
@Sayed016 Thank you for your question. From the error stack in the CM logs, it looks like the replication tries to copy the system database (sys) and information_schema. You need to exclude both of them. Add the following exclusion to the Hive replication:
Databases: (?!information_schema|sys\b).+
Tables: [\w].+
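To see which names the exclusion lets through, the two expressions can be tried out in Python. This is only a sketch: it assumes the replication matches the pattern against the whole database or table name, it uses Python's re module as a stand-in for whichever regex engine Cloudera Manager uses, and the sample names are made up.

```python
import re

# Patterns copied from the exclusion above.
DATABASE_PATTERN = re.compile(r"(?!information_schema|sys\b).+")
TABLE_PATTERN = re.compile(r"[\w].+")


def is_replicated(database):
    """True when the whole database name matches the include pattern."""
    return DATABASE_PATTERN.fullmatch(database) is not None


if __name__ == "__main__":
    # Hypothetical database names, used only for illustration.
    for name in ["default", "sales", "sys", "information_schema", "sys_archive"]:
        print(f"{name}: {'replicated' if is_replicated(name) else 'skipped'}")
```

Because of the \b after sys, only the database named exactly sys is skipped, while a name such as sys_archive still matches. The information_schema alternative has no word boundary, so any name that starts with information_schema is skipped as well.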
02-16-2022
03:18 PM
2 Kudos
Please follow the steps below:
1. SSH to the Cloudera Manager host where the Spark 3 CSDs are deployed.
2. Find the following files and use a file manager (for example mc) or an editor to open them as zip files and edit the contents of "descriptor/service.sdl". Probably the easiest way is to open the jar files with vim:
$ vim /opt/cloudera/csd/SPARK3_ON_YARN-3.2.0.3.2.7170.0-49.jar
$ vim /opt/cloudera/csd/LIVY_FOR_SPARK3-0.6.3000.3.2.7170.0-49.jar
3. In the descriptor/service.sdl files, prefix the version with something that is higher than the CM version number, so instead of:
"version" : "3.0.7110.0",
add
"version" : "7.5.4.3.0.7110.0",
4. Restart the CM server and wait until it comes back up:
$ service cloudera-scm-server restart
Spark 3 can now be installed. Once installed, deploy the client config and restart all services that have stale configs (most importantly YARN).
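If editing the jar by hand in vim is inconvenient, the version change can also be scripted. The following is a minimal Python sketch, not a Cloudera tool: prefix_csd_version is a hypothetical helper, and it assumes that descriptor/service.sdl is plain JSON with a top-level "version" key. It writes a new jar rather than modifying the original, so the original can be kept as a backup.

```python
import json
import zipfile

SDL_PATH = "descriptor/service.sdl"


def prefix_csd_version(jar_path, out_path, prefix):
    """Copy jar_path to out_path with the SDL version prefixed.

    Assumes service.sdl is JSON with a top-level "version" key.
    Returns the new version string.
    """
    new_version = None
    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == SDL_PATH:
                sdl = json.loads(data.decode("utf-8"))
                new_version = f"{prefix}.{sdl['version']}"
                sdl["version"] = new_version
                data = json.dumps(sdl, indent=2).encode("utf-8")
            # Every other entry is copied through unchanged.
            dst.writestr(item, data)
    if new_version is None:
        raise KeyError(f"{SDL_PATH} not found in {jar_path}")
    return new_version
```

With the example from step 3, calling prefix_csd_version(jar, new_jar, "7.5.4") on a descriptor whose version is 3.0.7110.0 yields 7.5.4.3.0.7110.0. The prefix still has to be chosen by hand so that it is higher than the CM version number, and the CM server restart in step 4 is still required.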
02-11-2022
01:21 PM
1 Kudo
1) Download SPARK3_ON_YARN-3.2.0.3.2.7170.0-49.jar from the link below:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/cds-3/topics/spark-spark-3-packaging.html
2) Copy the jar into /opt/cloudera/csd on the CM server host. Make sure there are no old CSDs in this directory; delete any old ones.
3) Set the file ownership of the service descriptor to cloudera-scm:cloudera-scm with permission 644.
4) Restart the Cloudera Manager Server.
5) Make sure the parcel from the link below is downloaded, distributed, and activated:
https://archive.cloudera.com/p/spark3/3.2.7170.0/parcels/
6) After all these steps, go to CM > Cluster > drop-down menu > Add Service > Spark 3.
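Before restarting Cloudera Manager, steps 2 and 3 can be double-checked with a short script. This is a sketch under assumptions: check_csd_dir is a hypothetical helper, and "old CSDs" is interpreted here as any other SPARK3_ON_YARN-*.jar in the directory. The ownership check is optional and relies on pathlib's owner() and group(), which are only available on POSIX systems.

```python
import stat
from pathlib import Path


def check_csd_dir(csd_dir, expected_jar, owner=None, group=None):
    """Return a list of human-readable problems found in the CSD directory."""
    problems = []
    csd_dir = Path(csd_dir)
    target = csd_dir / expected_jar
    if not target.is_file():
        problems.append(f"missing {expected_jar}")
    else:
        mode = stat.S_IMODE(target.stat().st_mode)
        if mode != 0o644:
            problems.append(f"{expected_jar} has mode {oct(mode)}, expected 0o644")
        # Ownership is only checked when the caller asks for it.
        if owner is not None and target.owner() != owner:
            problems.append(f"{expected_jar} is not owned by {owner}")
        if group is not None and target.group() != group:
            problems.append(f"{expected_jar} is not in group {group}")
    # Assumption: an "old CSD" is any other Spark 3 CSD jar in the directory.
    for jar in sorted(csd_dir.glob("SPARK3_ON_YARN-*.jar")):
        if jar.name != expected_jar:
            problems.append(f"old CSD still present: {jar.name}")
    return problems
```

On the CM host this could be called as check_csd_dir("/opt/cloudera/csd", "SPARK3_ON_YARN-3.2.0.3.2.7170.0-49.jar", owner="cloudera-scm", group="cloudera-scm"); an empty list means that none of the checked conditions failed.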