Member since: 09-15-2018
Posts: 61
Kudos Received: 6
Solutions: 7
My Accepted Solutions
Title | Views | Posted
---|---|---
| 3012 | 04-17-2020 08:40 AM
| 14980 | 04-14-2020 04:45 AM
| 2320 | 04-14-2020 03:12 AM
| 1668 | 10-17-2019 04:47 AM
| 2433 | 10-17-2019 04:33 AM
04-17-2020
08:40 AM
Hey @sharathkumar13, Thanks for reaching out to the Cloudera community.

You can refer to the Git repo [1] for information on the Kafka exporter for Prometheus in Kafka Manager.

[1] https://github.com/danielqsj/kafka_exporter

I would also like to share information on SMM [2]. Streams Messaging Manager is an operations monitoring and management tool from Cloudera that provides end-to-end visibility into an enterprise Apache Kafka environment. With SMM, you can gain clear insights into your Kafka clusters and understand the end-to-end flow of message streams from producers to topics to consumers.

[2] https://docs.cloudera.com/csp/2.0.1/smm-overview/topics/smm-overview.html

Let me know if this helps.
04-14-2020
08:44 AM
Hey @AndyTech, Thanks for reaching out to the Cloudera community. The commit-id mentioned here isn't related to any Kafka usage-related term such as 'commit offsets'. It refers to the Kafka source revision from which the client was built. It is not an error but just an informational message, and it doesn't impact the Kafka client's functionality in any way. Let me know if this helps. Cheers,
04-14-2020
05:45 AM
Hey @AndyTech, Thanks for reaching out to the Cloudera community. This issue is due to the missing "kafka-python" module in your Python installation. You have to install the "kafka-python" module manually, using the command below, on the edge node and on every host where the Spark job executes. $ pip install kafka-python
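If you want to confirm which hosts are missing the module before re-running the job, a minimal check like the sketch below (assuming only the Python standard library; the top-level package installed by kafka-python is named `kafka`) can be run on each node:

```python
import importlib.util

def kafka_python_status():
    """Report whether the kafka-python package is importable on this host."""
    # kafka-python installs a top-level package named "kafka"
    if importlib.util.find_spec("kafka") is None:
        return "missing"   # fix with: pip install kafka-python
    return "available"

print(kafka_python_status())
```

Running this via `python -c` on each worker quickly shows where the install is still needed.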
04-14-2020
04:45 AM
1 Kudo
Hey @GTA, Thanks for reaching out to the Cloudera community.

"Required executor memory (1024), overhead (384 MB), and PySpark memory (0 MB) is above the max threshold (1024 MB) of this cluster!"

This issue occurs when the total memory required to run a Spark executor in a container (executor memory, spark.executor.memory, plus executor memory overhead, spark.yarn.executor.memoryOverhead) exceeds the memory available for running containers on the NodeManager node (yarn.nodemanager.resource.memory-mb).

Based on the exception above, you have the default 1 GB configured for a Spark executor, and the overhead is 384 MB by default, so the total memory required to run the container is 1024 + 384 MB = 1408 MB. Since the NodeManager was configured with too little memory to run even a single container (only 1024 MB), this exception is expected. Increasing the NodeManager setting from 1251 to 2048 MB will allow a single container to run on the NodeManager node.

Use the following steps to increase the "yarn.nodemanager.resource.memory-mb" parameter: Cloudera Manager >> YARN >> Configuration >> Search "yarn.nodemanager.resource.memory-mb" >> Configure 2048 MB or higher >> Save & Restart.

Let me know if this helps.
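The arithmetic above can be sketched as a small helper (a minimal illustration; the max(384 MB, 10% of executor memory) rule is Spark's usual default for the overhead, not something printed in the error itself):

```python
def required_container_mb(executor_mb, overhead_mb=None):
    """Total memory YARN must grant for one Spark executor container."""
    if overhead_mb is None:
        # Spark's default memoryOverhead: max(384 MB, 10% of executor memory)
        overhead_mb = max(384, int(executor_mb * 0.10))
    return executor_mb + overhead_mb

# With the 1 GB default executor: 1024 + 384 = 1408 MB, above the 1024 MB threshold
print(required_container_mb(1024))  # 1408
```

Any value of yarn.nodemanager.resource.memory-mb at or above this result lets at least one executor container fit on the node.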
04-14-2020
03:43 AM
1 Kudo
Hey @Deep_live, Apologies, I'm unable to locate a cached/archived CDH 5.x quickstart VM image.
04-14-2020
03:12 AM
2 Kudos
Hey, The Cloudera QuickStart VM for CDH 5.x and 6.x has been discontinued by Cloudera. You can try the Cloudera Docker image available publicly at https://hub.docker.com/r/cloudera/quickstart, or simply run the command below to download it on a Docker-enabled system. $ docker pull cloudera/quickstart
10-18-2019
05:52 AM
Hey, Thank you for sharing the outcome and the steps. Much appreciated. Regards.
10-17-2019
06:52 AM
Hey, This exception can occur if the jline version isn't in sync with the Scala version. What is your current Scala version? Regards, Ankit.
10-17-2019
04:47 AM
Hey, Optimizing your Kafka cluster depends on your cluster usage and use case. Depending on your main concern, such as throughput, CPU utilization, or memory/disk usage, you need to tune different parameters, and some changes may have an impact on other aspects. For example, if acknowledgments (acks) is set to "all", every broker that replicates the partition must acknowledge that the data was written before the next message is confirmed. This ensures data consistency but increases CPU utilization and network latency. Refer to the "Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)" article [1] written by Jay Kreps (co-founder and CEO at Confluent). [1] https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines Please let me know if this helps. Regards, Ankit.
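As a hedged illustration of the trade-off, here are two example producer configuration sets using the standard Kafka producer property names (the specific values are illustrative starting points, not recommendations for any particular cluster):

```python
# Favor durability: every in-sync replica must acknowledge each write.
durability_config = {
    "acks": "all",              # stronger consistency, higher latency
    "retries": 2147483647,      # keep retrying transient send failures
}

# Favor throughput: acknowledge on the leader only and batch aggressively.
throughput_config = {
    "acks": "1",                # leader-only ack, lower latency
    "linger.ms": 50,            # wait briefly to batch more records per request
    "batch.size": 65536,        # larger batches, fewer requests
    "compression.type": "lz4",  # fewer bytes on the wire at some CPU cost
}

print(durability_config["acks"], throughput_config["acks"])
```

Benchmarking both profiles against your own workload, as the linked article does, is the only reliable way to pick between them.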
10-17-2019
04:33 AM
Hey, Can you please try setting the SPARK_HOME environment variable to the location indicated by the readlink command, then launch pyspark2 and check whether it reports Spark 2.x as the version? For example: export SPARK_HOME=/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2 With SPARK_HOME pointing to the Spark 2 lib folder, pyspark2 should launch and show Spark 2.3.0.cloudera3 as the Spark version. Please let me know if this helps. Regards, Ankit.