
Trouble with spark / kafka

Hello,

I have a problem with Spark when trying to consume a Kafka topic in a Kerberized environment.

I use Spark 2 on an HDP 2.6 cluster, and Kafka on HDF 3.1.

consumer.py :

import sys
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils


def process(time, rdd):
    # Print the batch timestamp and touch the RDD for non-empty batches.
    print("========= %s =========" % str(time))
    if not rdd.isEmpty():
        rdd.count()
        rdd.first()


sc = SparkContext(appName="test")
ssc = StreamingContext(sc, 5)  # 5-second batch interval
sc.setLogLevel("WARN")
print("Connected to spark streaming")
# PLAINTEXTSASL is the HDP name for SASL over a plaintext channel.
kafkaParams = {"security.protocol": "PLAINTEXTSASL"}
kafkaStream = KafkaUtils.createStream(ssc, "zookeeperHDF:2181", "pysparkclient1", {"mytopic": 1}, kafkaParams)
kafkaStream.pprint()
kafkaStream.foreachRDD(process)
ssc.start()
ssc.awaitTermination()
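For reference, createStream delivers each Kafka record to Python as a (key, value) tuple, so the rdd handed to process above is an RDD of such pairs. A minimal pure-Python sketch of that per-batch shape (the sample records below are hypothetical, not from my topic):

```python
def summarize_batch(records):
    """Summarize one micro-batch of Kafka records.

    Each record from KafkaUtils.createStream is a (key, value)
    tuple, so this mirrors what rdd.collect() would return.
    """
    values = [value for _, value in records]
    return {"count": len(values), "first": values[0] if values else None}


# Hypothetical sample batch, as (key, value) tuples.
batch = [(None, "event-1"), (None, "event-2")]
print(summarize_batch(batch))  # {'count': 2, 'first': 'event-1'}
```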

Command line :

/usr/hdp/2.6.4.0-91/spark2/bin/spark-submit --keytab /etc/security/keytabs/clement.service.keytab --principal clement@MYDOMAIN.FR --files /home/spark/test/clement_client_jaas.conf --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=clement_client_jaas.conf" --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=clement_client_jaas.conf" --jars spark-streaming-kafka-0-8_2.11-2.2.0.jar,spark-streaming_2.11-2.2.0.2.6.4.0-91.jar,spark-streaming-kafka-assembly_2.11-1.6.3.jar consumer.py

clement_client_jaas.conf :

KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true
  useKeyTab=true
  principal="clement@MYDOMAIN.FR"
  keyTab="clement.service.keytab"
  renewTicket=true
  storeKey=true
  serviceName="kafka";
};
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true
  useKeyTab=true
  principal="clement@MYDOMAIN.FR"
  keyTab="clement.service.keytab"
  renewTicket=true
  storeKey=true
  serviceName="zookeeper";
};

My error :

18/11/08 10:26:57 WARN VerifiableProperties: Property security.protocol is not valid
18/11/08 10:26:57 WARN ClientCnxn: SASL configuration failed: javax.security.auth.login.LoginException: No key to store Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
18/11/08 10:26:57 WARN AppInfo$: Can't read Kafka version from MANIFEST.MF. Possible cause: java.lang.NullPointerException

Do you have any ideas?

Re: Trouble with spark / kafka

@Clément Dumont You also need to pass the keytab in the '--files' option so that the JAAS conf can use that keytab to connect to Kafka.

Re: Trouble with spark / kafka

Thanks,

I tried this :

/usr/hdp/2.6.4.0-91/spark2/bin/spark-submit --keytab clement.service.keytab --principal clement@MYDOMAIN.FR --files clement_client_jaas.conf,clement.service.keytab --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=clement_client_jaas.conf" --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=clement_client_jaas.conf" --jars spark-streaming-kafka-0-8_2.11-2.2.0.jar,spark-streaming_2.11-2.2.0.2.6.4.0-91.jar,spark-streaming-kafka-assembly_2.11-1.6.3.jar consumer.py

But I get the same error:

18/11/08 13:26:14 WARN VerifiableProperties: Property security.protocol is not valid
18/11/08 13:26:14 WARN ClientCnxn: SASL configuration failed: javax.security.auth.login.LoginException: No key to store Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.
18/11/08 13:26:14 WARN AppInfo$: Can't read Kafka version from MANIFEST.MF. Possible cause: java.lang.NullPointerException

Re: Trouble with spark / kafka

@Clément Dumont I just tried it; it looks like one of your jars is from Apache and does not recognize the security.protocol property. Below is the command I used with the consumer.py you provided. You can download the dependent jars from: http://repo.hortonworks.com/content/repositories/releases/org/apache/spark

/usr/hdp/2.6.4.0-91/spark2/bin/spark-submit --files spark_jaas.conf,kafka.service.keytab --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=spark_jaas.conf" --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=spark_jaas.conf" --jars spark-streaming_2.11-2.2.0.2.6.4.0-91.jar,spark-streaming-kafka-0-8-assembly_2.11-2.2.0.2.6.4.0-91.jar,spark-streaming-kafka-0-10_2.11-2.2.0.2.6.4.0-91.jar consumer.py

Hope this helps.

Re: Trouble with spark / kafka

It works! Thanks.
