Created 03-22-2023 02:21 PM
I'm studying Airflow, Kafka, Spark and Docker.
I'm trying to run this project here https://github.com/KumarRoshandot/AirFlow_Kafka_Spark_Docker/tree/master/Project_Flight_Docker2
But my task is always returning an error:
Error: Cannot load main class from JAR org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1 with URI org.apache.spark. Please specify a class through --class.
This is my Docker Compose file.
And here is my Airflow Docker setup.
What am I doing wrong?
Created 03-22-2023 02:49 PM
@gizelly Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our Kafka experts @paras and @rki_ who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 03-30-2023 02:31 AM
Hi @gizelly
Based on the above exception, the spark-sql-kafka library has not been added to the classpath. You need to either add that JAR to the classpath or build a fat JAR that bundles it.
In your spark-submit command, add the package with the --packages option:
./bin/spark-submit \
--packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1 ...
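The wording of the error ("Cannot load main class from JAR org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1") suggests that spark-submit treated the Maven coordinate itself as the application file. spark-submit takes the first non-option argument as the application, so this can happen when the coordinate is passed without --packages or lands in the application position. Since the job here is launched from Airflow, a minimal Python sketch of assembling the command in the right order may help. The helper name build_spark_submit, the application path /opt/app/stream_job.py and the local[*] master are assumptions for illustration only, not taken from the project.

```python
import shlex


def build_spark_submit(app_path, packages, master="local[*]", app_args=()):
    """Build a spark-submit argument list (hypothetical helper).

    All options, including --packages, are placed BEFORE the application
    file, because spark-submit treats the first non-option argument as
    the application to run.
    """
    cmd = ["spark-submit", "--master", master]
    if packages:
        # Multiple Maven coordinates are passed as one comma-separated value.
        cmd += ["--packages", ",".join(packages)]
    cmd.append(app_path)    # the application itself comes after the options
    cmd.extend(app_args)    # anything after it is passed to the application
    return cmd


cmd = build_spark_submit(
    app_path="/opt/app/stream_job.py",  # assumed path, replace with your job
    packages=["org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1"],
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run, or the printed string can be used in a shell-based task. The _2.11 suffix and the 2.3.1 version in the coordinate should match the Scala and Spark versions of the installation that runs the job.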
Created 04-05-2023 11:48 PM
Hi @gizelly
Have you tried the solution above, and did it work? If so, please accept it as the solution. It will be useful to other members.