Cloudera kafka pyspark KafkaProducer (ImportError: No module named kafka)


I need help with Kafka on Cloudera.
I wrote a PySpark program in PyCharm, and it works fine there.


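from kafka import KafkaProducer
from kafka.errors import KafkaError

# Broker list is left empty here, as in the original post
producer = KafkaProducer(bootstrap_servers=[''])
# Values must be bytes unless a value_serializer is configured
tes = producer.send('my-first-topic', b"this message from pyspark")
producer.flush()  # block until the buffered message has been sent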


But when I run it on my Linux Cloudera machine, I get:

File "/home/cloudera/kafka/", line 1, in <module>
from kafka import KafkaProducer
ImportError: No module named kafka


I run it with the spark2-submit command.



Hey @AndyTech,


Thanks for reaching out to the Cloudera community.


This error occurs because the "kafka-python" module is missing from the Python installation that runs your job. Install it manually with the command below on the edge node and on every host where the Spark job executes.


$ pip install kafka-python
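
To confirm the install reached the interpreter that actually runs the job, a small check like the one below can be run on each host. This is only a sketch: the helper name module_available is hypothetical, and it assumes you run it with the same Python that spark2-submit picks up on that host.

```python
from __future__ import print_function

import importlib
import sys


def module_available(name):
    """Return True if `name` can be imported by the current interpreter."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    # Print the interpreter path so it can be compared across hosts.
    print("interpreter:", sys.executable)
    print("kafka importable:", module_available("kafka"))
```

If this reports that kafka is not importable on a host even after running pip there, the pip command is probably tied to a different Python installation than the one printed as the interpreter.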
