Re: Unable to connect to kerberized Kafka 2.1 (0.10) from Spark 2.1


Hi,

Thanks for your hint, mbigelow.

As far as I know, a Unix path is just a path, and starting it with ./ or without it means the same thing: here, the current working directory, which is the application's HDFS staging directory.
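To illustrate the claim above (a quick demonstration in a throwaway directory, nothing Spark-specific): "jaas.conf" and "./jaas.conf" name the same file in the current working directory.

```shell
# Create a scratch directory and an empty jaas.conf inside it.
tmp=$(mktemp -d)
cd "$tmp"
touch jaas.conf
# Resolve both spellings to absolute paths; they are identical.
a=$(readlink -f jaas.conf)
b=$(readlink -f ./jaas.conf)
echo "$a"
echo "$b"
cd / && rm -r "$tmp"
```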

But I tried your suggestion, and it inspired further investigation:

isegrim@sparkgw:~/kafka-spark-krb-test$ SPARK_KAFKA_VERSION=0.10 spark2-submit \
--master yarn \
--deploy-mode client \
--queue myqueue \
--keytab /home/isegrim/isegrim.keytab \
--principal isegrim@TEST.COM \
--files /home/isegrim/jaas.conf#jaas.conf \
--driver-java-options "-Djava.security.auth.login.config=jaas.conf" \
--conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=jaas.conf" \
--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=jaas.conf" \
target/scala-2.11/kafka-spark-krb-test.jar

But it gave me the same log and error (without './' in the error message) :/

Caused by: java.io.IOException: jaas.conf (No such file or directory)

 

I also tried an HDFS path to jaas.conf, but that failed too:

hdfs dfs -put jaas.conf

SPARK_KAFKA_VERSION=0.10 spark2-submit \
--master yarn \
--deploy-mode client \
--queue myqueue \
--keytab /home/isegrim/isegrim.keytab \
--principal isegrim@TEST.COM \
--files /home/isegrim/jaas.conf#jaas.conf \
--driver-java-options "-Djava.security.auth.login.config=hdfs:///user/isegrim/jaas.conf" \
--conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=hdfs:///user/isegrim/jaas.conf" \
--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=hdfs:///user/isegrim/jaas.conf" \
target/scala-2.11/kafka-spark-krb-test.jar

Error:

Caused by: java.io.IOException: hdfs:///user/isegrim/jaas.conf (No such file or directory)
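This failure is consistent with the JAAS config loader not understanding the hdfs:// scheme, so the whole URI ends up being treated as a literal local filename. The same thing happens with any tool that expects a local path (a minimal illustration, not Spark-specific):

```shell
# Hand an hdfs:// URI to a tool that only understands local paths:
# it is just a filename that does not exist on the local filesystem.
if cat "hdfs:///user/isegrim/jaas.conf" 2>/dev/null; then
    echo "unexpectedly readable"
else
    echo "not a local file"
fi
```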

But when I put the file into each worker's /tmp, it got further: it looks like the executors read it from the local operating-system filesystem rather than from the application's HDFS working directory (the --files flag):

for i in {1..4}; do
    scp jaas.conf worker${i}:/tmp/jaas.conf
done

SPARK_KAFKA_VERSION=0.10 spark2-submit \
--master yarn \
--deploy-mode client \
--queue myqueue \
--keytab /home/isegrim/isegrim.keytab \
--principal isegrim@TEST.COM \
--files /home/isegrim/jaas.conf#jaas.conf \
--driver-java-options "-Djava.security.auth.login.config=/tmp/jaas.conf" \
--conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/tmp/jaas.conf" \
--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/tmp/jaas.conf" \
target/scala-2.11/kafka-spark-krb-test.jar

This is the new error:

...
17/06/17 00:06:28 ERROR streaming.StreamingContext: Error starting the context, marking it as stopped
org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
        at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:702)
...
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.kafka.common.KafkaException: javax.security.auth.login.LoginException: Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner  authentication information from the user
        at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:86)
...


So I distributed the keytab to every worker node's local filesystem, changed the keyTab path in jaas.conf from the application's HDFS working directory to the local filesystem, and it works!

$ cat jaas.conf
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/home/isegrim/isegrim.keytab"
  principal="isegrim@TEST.COM";
};
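A small sketch that could complement the fix above: parse the keyTab path out of a JAAS file with sed, so a script can verify the keytab actually exists on each node before launching. The heredoc reproduces the config above; the sed pattern is a simple assumption that keyTab appears on one line in double quotes.

```shell
# Write the JAAS config to a temp file, then extract the keyTab path.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/home/isegrim/isegrim.keytab"
  principal="isegrim@TEST.COM";
};
EOF
keytab=$(sed -n 's/.*keyTab="\([^"]*\)".*/\1/p' "$tmp")
echo "$keytab"
rm "$tmp"
```

With the path in hand, a pre-flight loop like `for i in {1..4}; do ssh worker${i} test -r "$keytab"; done` (assuming passwordless SSH, as in the scp loops above) would catch a missing keytab before the job fails.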

Distribute the new jaas.conf and the keytab:

$ for i in {1..4}; do
    scp isegrim.keytab worker${i}:/home/isegrim/isegrim.keytab
done
$ for i in {1..4}; do
    scp jaas.conf worker${i}:/tmp/jaas.conf
done

Run the app and check logs:

SPARK_KAFKA_VERSION=0.10 spark2-submit \
--master yarn \
--deploy-mode client \
--queue myqueue \
--keytab /home/isegrim/isegrim.keytab \
--principal isegrim@TEST.COM \
--files /home/isegrim/jaas.conf#jaas.conf \
--driver-java-options "-Djava.security.auth.login.config=/tmp/jaas.conf" \
--conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/tmp/jaas.conf" \
--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/tmp/jaas.conf" \
target/scala-2.11/kafka-spark-krb-test.jar

...
17/06/17 00:23:17 INFO streaming.StreamingContext: StreamingContext started
...
-------------------------------------------
Time: 1497651840000 ms
-------------------------------------------
58

Thank you, mbigelow, for the inspiration!

