Created 10-26-2025 05:21 AM
Hi everyone,
I installed a standalone Spark cluster and I want to send job event logs to MinIO. I configured the spark-defaults.conf file as below.
spark.eventLog.enabled=true
spark.eventLog.dir=s3a://spark-logs/
spark.history.fs.logDirectory=s3a://spark-logs/
spark.hadoop.fs.s3a.endpoint=http://192.168.182.131:9000
spark.hadoop.fs.s3a.access.key=admin
spark.hadoop.fs.s3a.secret.key=admin12345
spark.hadoop.fs.s3a.path.style.access=true

When I run the pi.py example, the job fails with the following error:

5/10/26 12:13:29 INFO SparkContext: Successfully stopped SparkContext
Traceback (most recent call last):
File "/opt/bitnami/spark/examples/src/main/python/pi.py", line 32, in <module>
.getOrCreate()
^^^^^^^^^^^^^
File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 497, in getOrCreate
File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/context.py", line 515, in getOrCreate
File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/context.py", line 203, in __init__
File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/context.py", line 296, in _do_init
File "/opt/bitnami/spark/python/lib/pyspark.zip/pyspark/context.py", line 421, in _initialize_context
File "/opt/bitnami/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1587, in __call__
File "/opt/bitnami/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(Unknown Source)
at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(Unknown Source)
at java.base/javax.security.auth.login.LoginContext.invoke(Unknown Source)
at java.base/javax.security.auth.login.LoginContext$4.run(Unknown Source)
at java.base/javax.security.auth.login.LoginContext$4.run(Unknown Source)
at java.base/java.security.AccessController.doPrivileged(Unknown Source)
at java.base/javax.security.auth.login.LoginContext.invokePriv(Unknown Source)
at java.base/javax.security.auth.login.LoginContext.login(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:2065)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1975)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:719)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:669)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:579)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:3746)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:3736)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3520)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1831)
at org.apache.spark.deploy.history.EventLogFileWriter.<init>(EventLogFileWriters.scala:60)
at org.apache.spark.deploy.history.SingleEventLogFileWriter.<init>(EventLogFileWriters.scala:213)
at org.apache.spark.deploy.history.EventLogFileWriter$.apply(EventLogFileWriters.scala:181)
at org.apache.spark.scheduler.EventLoggingListener.<init>(EventLoggingListener.scala:64)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:631)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Unknown Source)
at java.base/java.lang.reflect.Constructor.newInstance(Unknown Source)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.base/java.lang.Thread.run(Unknown Source)
Created 11-03-2025 11:07 AM
Hello @yoonli,
Thanks for contacting our Cloudera Community and sharing your question.
One thing I have to mention is that Cloudera does not support standalone Spark clusters; we only support Spark on YARN.
Anyway, taking a quick look at this issue, I see that you're not specifying any principal or keytab. When Kerberos is in use, Hadoop will always try to log in (kinit), and without a principal and keytab that login fails.
Something you can try is the simple authentication method; you can add these two settings to your spark-defaults.conf:
spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
spark.hadoop.hadoop.security.authentication=simple

(Spark strips the spark.hadoop. prefix and passes the rest to the Hadoop configuration, so this sets Hadoop's hadoop.security.authentication=simple.)

For reference: https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html
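Putting it all together, a minimal spark-defaults.conf sketch might look like the following. This combines the settings from the original post with the two suggested above; the endpoint, bucket, and credentials are the values from the question and should be replaced with your own:

spark.eventLog.enabled=true
spark.eventLog.dir=s3a://spark-logs/
spark.history.fs.logDirectory=s3a://spark-logs/
spark.hadoop.fs.s3a.endpoint=http://192.168.182.131:9000
spark.hadoop.fs.s3a.access.key=admin
spark.hadoop.fs.s3a.secret.key=admin12345
spark.hadoop.fs.s3a.path.style.access=true
spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
spark.hadoop.hadoop.security.authentication=simple

Note that the SimpleAWSCredentialsProvider setting tells the S3A connector to use the static access/secret key pair directly, rather than probing other credential sources, which is the usual arrangement for a MinIO endpoint.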