Support Questions


Spark standalone job submit raises Kerberos error when enabling event log to S3.

Explorer

Hi everyone,

I installed a standalone Spark cluster and I want to configure it to send job event logs to MinIO. I configured the spark-defaults.conf file as below.

 
    spark.eventLog.enabled=true
    spark.eventLog.dir=s3a://spark-logs/
    spark.history.fs.logDirectory=s3a://spark-logs/
    spark.hadoop.fs.s3a.endpoint=http://192.168.182.131:9000
    spark.hadoop.fs.s3a.access.key=admin
    spark.hadoop.fs.s3a.secret.key=admin12345
    spark.hadoop.fs.s3a.path.style.access=true
 
When I submit the job with spark-submit --master spark://spark-master-svc:7077 --conf spark.jars.ivy=/tmp/.ivy2 pi.py 1, I get the following error:
5/10/26 12:13:29 INFO SparkContext: Successfully stopped SparkContext
Traceback (most recent call last):
  File "/opt/bitnami/spark/examples/src/main/python/pi.py", line 32, in <module>
    .getOrCreate()
     ^^^^^^^^^^^^^
  File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 497, in getOrCreate
  File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/context.py", line 515, in getOrCreate
  File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/context.py", line 203, in __init__
  File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/context.py", line 296, in _do_init
  File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/context.py", line 421, in _initialize_context
  File "/opt/bitnami/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1587, in __call__
  File "/opt/bitnami/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
        at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(Unknown Source)
        at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(Unknown Source)
        at java.base/javax.security.auth.login.LoginContext.invoke(Unknown Source)
        at java.base/javax.security.auth.login.LoginContext$4.run(Unknown Source)
        at java.base/javax.security.auth.login.LoginContext$4.run(Unknown Source)
        at java.base/java.security.AccessController.doPrivileged(Unknown Source)
        at java.base/javax.security.auth.login.LoginContext.invokePriv(Unknown Source)
        at java.base/javax.security.auth.login.LoginContext.login(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
        at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
        at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:3746)
        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:3736)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3520)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
        at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1831)
        at org.apache.spark.deploy.history.EventLogFileWriter.<init>(EventLogFileWriters.scala:60)
        at org.apache.spark.deploy.history.SingleEventLogFileWriter.<init>(EventLogFileWriters.scala:213)
        at org.apache.spark.deploy.history.EventLogFileWriter$.apply(EventLogFileWriters.scala:181)
        at org.apache.spark.scheduler.EventLoggingListener.<init>(EventLoggingListener.scala:64)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:631)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Unknown Source)
        at java.base/java.lang.reflect.Constructor.newInstance(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
        at py4j.Gateway.invoke(Gateway.java:238)
        at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
        at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
        at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
        at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
        at java.base/java.lang.Thread.run(Unknown Source)
1 Reply

Expert Contributor

Hello @yoonli

Thanks for contacting our Cloudera Community and sharing your question. 

One thing I have to mention is that Cloudera does not support standalone Spark clusters; we only support Spark on YARN.

Anyway, taking a quick look at this issue, I see that you're not specifying any principal or keytab. When Kerberos is used, Hadoop will always try to log in (the equivalent of running kinit), and if there is no principal and keytab, that login fails.
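For reference, if you did want to keep Kerberos, the login needs a principal and keytab. The sketch below shows how those are normally passed to spark-submit; the principal and keytab path here are placeholders, not values from your setup, and note that --principal/--keytab are honored on YARN (and Kubernetes), not in standalone mode:

    # Sketch only: placeholder principal and keytab path.
    # --principal/--keytab apply to YARN (and Kubernetes) deployments.
    spark-submit \
      --master yarn \
      --principal spark/worker01@EXAMPLE.COM \
      --keytab /etc/security/keytabs/spark.service.keytab \
      pi.py 1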

Something you can try is the Simple authentication method; you can add these settings to your spark-defaults.conf:

    spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
    spark.hadoop.hadoop.security.authentication=simple

Spark strips the spark.hadoop. prefix and passes the remainder into the Hadoop configuration, so the second line sets hadoop.security.authentication=simple for the job. If you manage core-site.xml directly, you can set hadoop.security.authentication=simple there instead.
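If you want to test this on a single job before changing spark-defaults.conf, the same settings can also be passed with --conf; here is a sketch reusing the submit command from your question:

    # Same simple-auth settings, applied per job instead of cluster-wide.
    spark-submit \
      --master spark://spark-master-svc:7077 \
      --conf spark.jars.ivy=/tmp/.ivy2 \
      --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider \
      --conf spark.hadoop.hadoop.security.authentication=simple \
      pi.py 1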

https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html 


Regards,
Andrés Fallas
--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs-up button.