Member since: 05-31-2017
Posts: 37
Kudos Received: 9
Solutions: 6
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 969 | 08-25-2023 10:48 AM |
| | 1485 | 05-24-2023 11:21 AM |
| | 3882 | 05-09-2022 02:54 PM |
| | 3070 | 04-28-2022 11:27 AM |
| | 1474 | 04-19-2022 11:03 AM |
10-30-2023
10:40 PM
@AK- Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
08-28-2023
01:12 PM
@GowthamSenthil Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
05-29-2023
11:28 PM
Thanks for the information. We're running on YARN, so I guess the steps in https://spark.apache.org/docs/3.0.0-preview2/running-on-yarn.html#configuring-the-external-shuffle-service must be followed to configure the external shuffle service on each worker node. Are there any plans to support other external shuffle services, such as Uber RSS or Apache Uniffle, in the future?
10-19-2022
11:13 AM
@DataMike Yes, you can use the Cruise Control (CC) APIs as per your requirements [1] [2].

[1] https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/cctrl-managing/topics/cctrl-using-rest-api.html
[2] https://community.cloudera.com/t5/Customer/Frequently-Used-CRUISE-CONTROL-API-and-important-DOCs/ta-p/324729

Based on the above articles, you can use the following API to rebalance topics/partitions:

curl -k --negotiate -u: -X POST "https://<CC FQDN>:8899/kafkacruisecontrol/rebalance?dryrun=false&rebalance_disk=true"

To avoid high CPU, memory, and disk read/write load, you can run the rebalance in batches. Cruise Control will automatically create the batches and rebalance the topics:

curl -X POST "http://$HOSTNAME:8899/kafkacruisecontrol/rebalance?dryrun=true&concurrent_partition_movements_per_broker=10&concurrent_leader_movements=500"

Note that this second example uses dryrun=true, which only previews the proposal; set dryrun=false to execute it.

If this helps, please click "Accept as Solution" below this post. Thank you.
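If you need to script these calls, the query string can be assembled programmatically rather than typed by hand. This is only a minimal sketch: the `rebalance_url` helper and the `cc.example.com` host are hypothetical names used for illustration, while the endpoint path, port, and parameter names are the ones from the curl commands in this post. It only builds the URL and does not contact Cruise Control.

```python
from urllib.parse import urlencode


def rebalance_url(host, port=8899, scheme="https", **params):
    """Build a Cruise Control rebalance URL (hypothetical helper, not a Cloudera API)."""
    # The curl examples pass booleans as lowercase "true"/"false",
    # so Python booleans are converted to that form here.
    query = urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    )
    return f"{scheme}://{host}:{port}/kafkacruisecontrol/rebalance?{query}"


# cc.example.com is a placeholder for the Cruise Control host.
url = rebalance_url(
    "cc.example.com",
    dryrun=True,
    concurrent_partition_movements_per_broker=10,
    concurrent_leader_movements=500,
)
print(url)
```

The resulting URL can then be passed to curl -X POST as in the commands above, with the authentication options your cluster requires.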
04-28-2022
01:55 PM
@clouderaskme The latest CDP release, 7.1.7, ships with Spark 2.4 by default:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/runtime-release-notes/topics/rt-pvc-runtime-component-versions.html

Spark 2.4 supports Python 2.7 and 3.4-3.7:
https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/release-guide/topics/cdpdc-os-requirements.html
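To check whether a given interpreter falls inside that range, something like the following could be run on each node. It is a rough sketch that encodes only the versions stated in this post (2.7 and 3.4-3.7); the function name `supported_by_spark24` is made up for illustration.

```python
import sys


def supported_by_spark24(version=None):
    """Return True if the (major, minor) version is in the range quoted above."""
    # Default to the interpreter running this script.
    if version is None:
        version = sys.version_info
    major, minor = version[0], version[1]
    # Python 2.7, or Python 3.4 through 3.7, per the post.
    return (major, minor) == (2, 7) or (major == 3 and 4 <= minor <= 7)


print(supported_by_spark24())
```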
04-19-2022
04:06 PM
@Bharati Thank you! This worked. However, could you please share which logs had shown that it was trying to copy the system database and information_schema?
01-09-2020
11:20 AM
1. CDH 5 works with Spark 2 by installing a separate parcel.
2. The document suggests that CDH 5 + Spark 2 does not support Spark-on-HBase. Spark-on-HBase works with CDH 5 and Spark 1.6.
3. With CDH 6.x, the default Spark version is Spark 2.x, and as a new feature Spark-on-HBase now runs on top of Apache Spark 2.x: https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_600_new_features.html#spark_new_features
4. Refer to the document below to use Spark-on-HBase on CDH 6.x: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_hbase_import.html#concept_asc_ctz_wp