Member since: 09-15-2018
Posts: 61
Kudos Received: 6
Solutions: 7
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3203 | 04-17-2020 08:40 AM
 | 15648 | 04-14-2020 04:45 AM
 | 2538 | 04-14-2020 03:12 AM
 | 1810 | 10-17-2019 04:47 AM
 | 2640 | 10-17-2019 04:33 AM
07-31-2020
08:14 AM
I used to work at Cloudera/Hortonworks, and now I am a Hashmap Inc. consultant. This solution worked perfectly, thank you.
04-18-2020
07:43 AM
Thank you @TonyStank, this helped me.
04-14-2020
08:44 AM
Hey @AndyTech, Thanks for reaching out to the Cloudera community. The commit-id mentioned here isn't related to any Kafka usage term such as 'commit offsets'. It refers to the Kafka source commit from which the client was built. It is not an error, just an info message, and it doesn't impact the Kafka client's functionality in any way. Let me know if this helps. Cheers,
04-14-2020
05:45 AM
Hey @AndyTech, Thanks for reaching out to the Cloudera community. This issue is due to the missing "kafka-python" module in your Python installation. You have to install the "kafka-python" module manually, using the command below, on the edge node and on every host where the Spark job executes:

$ pip install kafka-python
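Before (or after) installing, a quick stdlib-only check can confirm whether the module is actually present on a given host — kafka-python installs under the import name `kafka`. This is just a hedged sketch for troubleshooting, not part of the fix itself:

```python
# Check whether a module is importable on this host (stdlib only).
# kafka-python registers itself under the import name "kafka".
import importlib.util

def has_module(name):
    """Return True if the named module can be found by the import system."""
    return importlib.util.find_spec(name) is not None

if not has_module("kafka"):
    print("kafka-python is missing: run `pip install kafka-python` on this host")
```

Running this on each worker node helps verify the installation reached every host the Spark job touches.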
04-14-2020
04:09 AM
@TonyStank appreciate your help. Stay safe.
10-18-2019
05:52 AM
Hey, Thank you for sharing the outcome and the steps. Much appreciated. Regards.
10-17-2019
05:47 AM
Thanks, that put me in the right direction. For completeness: just setting SPARK_HOME was not sufficient; py4j was still missing. Setting PYTHONPATH fixed that issue:

export SPARK_HOME=/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH

Now pyspark shows: version 2.3.0.cloudera3
10-17-2019
04:47 AM
Hey, Optimizing your Kafka cluster depends on your cluster usage and use case. Based on your main concern (throughput, CPU utilization, or memory/disk usage) you need to tune different parameters, and some changes may have an impact on other aspects. For example, if the producer's acknowledgments setting (acks) is "all", every broker that replicates the partition must acknowledge that the data was written before the next message is confirmed. This ensures data consistency but increases CPU utilization and network latency.

For reference, see "Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines)"[1], written by Jay Kreps (co-founder and CEO at Confluent).

[1] https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

Please let me know if this helps. Regards, Ankit.
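To make the trade-off concrete, here is a hedged sketch of two producer configuration profiles. The parameter names are standard Kafka producer configs, but the specific values are illustrative assumptions, not recommendations for any particular cluster:

```python
# Two illustrative Kafka producer tuning profiles (values are examples only).

# Throughput-leaning profile: leader-only acks, batching, and compression
# reduce latency and network round-trips at the cost of weaker durability.
throughput_profile = {
    "acks": "1",                 # only the partition leader must acknowledge
    "linger.ms": 20,             # wait up to 20 ms to fill a batch
    "batch.size": 65536,         # larger batches mean fewer requests
    "compression.type": "lz4",   # cheap compression to save bandwidth
}

# Durability-leaning profile: "acks=all" waits for all in-sync replicas,
# trading CPU and latency for stronger consistency, as described above.
durability_profile = {
    "acks": "all",               # all in-sync replicas must acknowledge
    "enable.idempotence": True,  # avoid duplicates on retry
    "retries": 2147483647,       # retry indefinitely on transient errors
}
```

Which profile fits depends entirely on whether the workload values write throughput over delivery guarantees, which is exactly the decision the post describes.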
09-12-2019
03:22 AM
I didn't use the FQDN; instead I just added the IP to the /etc/hosts file and used the same host IP in the Kafka config.
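For anyone following the same workaround, a minimal sketch of what this looks like — the IP address and hostname below are placeholders, not values from the original post:

```
# /etc/hosts entry (IP and hostname are placeholders)
192.168.1.10  kafka-broker-1

# Kafka server.properties: advertise the same address clients will resolve
listeners=PLAINTEXT://192.168.1.10:9092
advertised.listeners=PLAINTEXT://192.168.1.10:9092
```

The key point is that the address in advertised.listeners must match what clients can actually resolve and reach.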
02-13-2019
10:07 PM
Hi Tony, Thanks for your reply. I appreciate all the help provided by you and Gzigldrum. Regards, Wert