Created 11-08-2018 09:42 AM
Hello,
I have a problem getting Spark to consume a Kafka topic in a Kerberized environment.
I am using Spark2 on an HDP 2.6 cluster, and Kafka from HDF 3.1.
consumer.py :
import sys
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

# Print each batch timestamp and touch the RDD if it is not empty.
def process(time, rdd):
    print("========= %s =========" % str(time))
    if not rdd.isEmpty():
        rdd.count()
        rdd.first()

sc = SparkContext(appName="test")
ssc = StreamingContext(sc, 5)  # 5-second batch interval
sc.setLogLevel("WARN")
print("Connected to spark streaming")

# SASL/Kerberos setting for the receiver-based (0.8) Kafka consumer
kafkaParams = {"security.protocol": "PLAINTEXTSASL"}
kafkaStream = KafkaUtils.createStream(ssc, "zookeeperHDF:2181", "pysparkclient1",
                                      {"mytopic": 1}, kafkaParams)
kafkaStream.pprint()
kafkaStream.foreachRDD(process)

ssc.start()
ssc.awaitTermination()
Command line :
/usr/hdp/2.6.4.0-91/spark2/bin/spark-submit \
  --keytab /etc/security/keytabs/clement.service.keytab \
  --principal clement@MYDOMAIN.FR \
  --files /home/spark/test/clement_client_jaas.conf \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=clement_client_jaas.conf" \
  --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=clement_client_jaas.conf" \
  --jars spark-streaming-kafka-0-8_2.11-2.2.0.jar,spark-streaming_2.11-2.2.0.2.6.4.0-91.jar,spark-streaming-kafka-assembly_2.11-1.6.3.jar \
  consumer.py
clement_client_jaas.conf :
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true
  useKeyTab=true
  principal="clement@MYDOMAIN.FR"
  keyTab="clement.service.keytab"
  renewTicket=true
  storeKey=true
  serviceName="kafka";
};

Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true
  useKeyTab=true
  principal="clement@MYDOMAIN.FR"
  keyTab="clement.service.keytab"
  renewTicket=true
  storeKey=true
  serviceName="zookeeper";
};
My error :
18/11/08 10:26:57 WARN VerifiableProperties: Property security.protocol is not valid
18/11/08 10:26:57 WARN ClientCnxn: SASL configuration failed: javax.security.auth.login.LoginException: No key to store Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
18/11/08 10:26:57 WARN AppInfo$: Can't read Kafka version from MANIFEST.MF. Possible cause: java.lang.NullPointerException
Do you have any ideas?
Created 11-08-2018 09:47 AM
@Clément Dumont You also need to pass the keytab in the '--files' option so that the JAAS conf can pick up that keytab and connect to Kafka.
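Roughly something like this (a sketch only, reusing the paths from your original command; keep the other options as they are):

/usr/hdp/2.6.4.0-91/spark2/bin/spark-submit \
  --keytab /etc/security/keytabs/clement.service.keytab \
  --principal clement@MYDOMAIN.FR \
  --files /home/spark/test/clement_client_jaas.conf,/etc/security/keytabs/clement.service.keytab \
  ... \
  consumer.py

With the keytab shipped via '--files', it lands in the container working directory, which is where the relative keyTab="clement.service.keytab" in your JAAS conf is resolved.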
Created 11-08-2018 12:27 PM
Thanks,
I tried this:
/usr/hdp/2.6.4.0-91/spark2/bin/spark-submit \
  --keytab clement.service.keytab \
  --principal clement@MYDOMAIN.FR \
  --files clement_client_jaas.conf,clement.service.keytab \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=clement_client_jaas.conf" \
  --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=clement_client_jaas.conf" \
  --jars spark-streaming-kafka-0-8_2.11-2.2.0.jar,spark-streaming_2.11-2.2.0.2.6.4.0-91.jar,spark-streaming-kafka-assembly_2.11-1.6.3.jar \
  consumer.py
But I still get the same error:
18/11/08 13:26:14 WARN VerifiableProperties: Property security.protocol is not valid
18/11/08 13:26:14 WARN ClientCnxn: SASL configuration failed: javax.security.auth.login.LoginException: No key to store Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
18/11/08 13:26:14 WARN AppInfo$: Can't read Kafka version from MANIFEST.MF. Possible cause: java.lang.NullPointerException
Created 11-09-2018 12:44 PM
@Clément Dumont I just tried it; it looks like you have one jar from Apache that does not recognise the security.protocol property. Below is what I used with the consumer.py you provided. You can download the dependent jars from: http://repo.hortonworks.com/content/repositories/releases/org/apache/spark
/usr/hdp/2.6.4.0-91/spark2/bin/spark-submit \
  --files spark_jaas.conf,kafka.service.keytab \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=spark_jaas.conf" \
  --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=spark_jaas.conf" \
  --jars spark-streaming_2.11-2.2.0.2.6.4.0-91.jar,spark-streaming-kafka-0-8-assembly_2.11-2.2.0.2.6.4.0-91.jar,spark-streaming-kafka-0-10_2.11-2.2.0.2.6.4.0-91.jar \
  consumer.py
Hope this helps.
Created 11-13-2018 02:44 PM
It works! Thanks.