
Integration of Apache Spark and Apache Kafka using PySpark



Hello,

I want to send messages from Kafka to Spark and then use Spark SQL for data manipulation. Finally, I want to send the result to another Kafka topic.

When I use

from pyspark.streaming.kafka import KafkaUtils
kvs = KafkaUtils.createStream(ssc, "localhost:2181", "spark-streaming-consumer", {"SparkPublish": 1})

it creates a TransformedDStream, and I am not able to convert that to DataFrames so that I can use Spark SQL on it.
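From what I understand of this legacy API, the only route to DataFrames would be to convert each micro-batch myself with foreachRDD, roughly like the untested sketch below (the "messages" view name and the SELECT are just for illustration):

from pyspark.sql import SparkSession

def process(time, rdd):
    # Skip empty micro-batches.
    if rdd.isEmpty():
        return
    spark = SparkSession.builder.getOrCreate()
    # Each record from createStream is a (key, value) pair.
    df = spark.createDataFrame(rdd, ["key", "value"])
    df.createOrReplaceTempView("messages")
    spark.sql("SELECT value FROM messages").show()

kvs.foreachRDD(process)

This feels clumsy, which is why I moved on to Structured Streaming.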
Later, when I referred to https://spark.apache.org/docs/2.4.5/structured-streaming-kafka-integration.html, I tried the following to get a DataFrame:
df = spark.readStream.format("kafka") \
    .option("kafka.bootstrap.servers", "localhost:9092") \
    .option("subscribe", "SparkPublish") \
    .load()

Even with this approach, I am getting various errors.
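For context, here is a minimal sketch of the end-to-end pipeline I am trying to build from that guide. The output topic "SparkOutput", the checkpoint path, and the UPPER() transformation are placeholders I made up:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("KafkaSparkSQL").getOrCreate()

# Read from the source topic; Kafka delivers key and value as binary.
df = spark.readStream.format("kafka") \
    .option("kafka.bootstrap.servers", "localhost:9092") \
    .option("subscribe", "SparkPublish") \
    .load()

# Cast to strings so Spark SQL can work on the payload.
messages = df.selectExpr("CAST(key AS STRING) AS key",
                         "CAST(value AS STRING) AS value")
messages.createOrReplaceTempView("messages")

# Example transformation with Spark SQL (placeholder query).
result = spark.sql("SELECT key, UPPER(value) AS value FROM messages")

# Write the result to another Kafka topic.
query = result.writeStream.format("kafka") \
    .option("kafka.bootstrap.servers", "localhost:9092") \
    .option("topic", "SparkOutput") \
    .option("checkpointLocation", "/tmp/spark-kafka-checkpoint") \
    .start()

query.awaitTermination()

As far as I understand, this also needs the Kafka SQL package on the classpath, e.g. spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5, so a missing package could be part of my problem.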

Can anyone tell me how I can receive messages from Kafka with Spark Streaming as DataFrames and use Spark SQL on them?

Thank You.
