Created 03-11-2024 06:08 AM
Hello everyone. I need some help.
I am currently working on a Proof of Concept (POC) for building a platform for a client, so I am unable to open a support case and am asking my question here in the community instead.
We are using Spark 3 and Kudu, and we have run into a problem. MIT Kerberos is set up, and the environment is CM 7.10.1 with CDP 7.1.7 SP2.
It's a Spark On YARN environment, and I'm executing commands like this:
spark-submit --master yarn --deploy-mode cluster \
  --keytab /etc/security/keytabs/user1.keytab \
  --principal user1@example.com \
  --jars /opt/cloudera/parcels/CDH/lib/kudu/kudu-spark3_2.12.jar \
  ....
The Spark application runs normally on YARN, but an error occurs when the Kerberos ticket is renewed after 7 days. Since the Spark application restarts once by default, I expect it will be killed for good after 14 days; for now it is still running after the restart.
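To make the 7-day/14-day timeline concrete, here is a small Python sketch. It assumes the 7 days come from Kudu's default authn token validity (authn_token_validity_seconds=604800) and the single restart from YARN's default of two application attempts; both defaults are stated here as assumptions, not confirmed from this cluster's configuration.

```python
from datetime import datetime, timedelta

# Assumed Kudu default: authn tokens are valid for 7 days (604800 seconds).
token_validity = timedelta(seconds=604800)

# Assumed YARN default: 2 application attempts, so the job survives one restart.
max_attempts = 2

start = datetime(2024, 3, 11, 6, 8)  # hypothetical launch time
final_failure = start + max_attempts * token_validity

# First failure ~7 days in, restart, then final failure ~14 days in.
print(final_failure - start)
```

This matches the behavior described above: the first token expiry kills the first attempt, the restarted attempt gets a fresh token, and the second expiry ends the application.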
org.apache.kudu.client.NonRecoverableException: cannot re-acquire authentication token after 5 attempts (couldn't find a valid master in (m01.example.com:7051, m02.example.com:7051, m03.example.com:7051) exception
Unauthorized connection attempt: Server Connection negotiation failed: Server connection from 123.123.123.123:42026: token expired.
Failed RPC negotiation trace: {
.....
Negotiation complete: network error: Server connection negotiation failed: Server connection from 123.123.123.123:42026: BlockingRecv error: recv of EOF from 123.123.123.123:42026 (error 108)
I don't see any warnings or issues in krb5kdc.log. It seems a related error was fixed in Apache Kudu 1.5, but I am on CDP 7.1.7 SP2, which ships Apache Kudu 1.15. Please help me. Thank you in advance.
Created 03-11-2024 01:04 PM
@yjbyun Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our Spark experts @Bharati @jagadeesan who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 04-02-2025 03:54 AM
Hello,
We have exactly the same issue with our Structured Streaming jobs. We are consuming data from Kafka and writing it to Kudu, but once a job runs longer than 7 days we get the same errors. Has anyone found a solution for this?
Created 04-07-2025 01:32 PM
@Bharati @jagadeesan @smdas @Gopinath Hi! Do you have any insights here? Thanks!
Regards,
Diana Torres
Created 04-08-2025 10:25 AM
This looks to be due to a known, unresolved issue in Kudu, KUDU-2679. The best option available is to increase the length of time during which the auth tokens are valid. There is a property for this; I recommend increasing its value from the default of 7 days to 90 days (you can later reduce it to match the expected runtime of these jobs if needed). You will also need to unlock experimental flags if you have not already. In CM > Kudu > Configuration > Kudu Master Advanced Configuration Snippet (Safety Valve) for gflagfile, add:
--unlock_experimental_flags
--authn_token_validity_seconds=7776000
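As a quick sanity check on the value above, 7776000 seconds works out to exactly 90 days:

```python
seconds_per_day = 24 * 60 * 60  # 86400 seconds in a day
validity_days = 90

# 90 days expressed in seconds, as expected by authn_token_validity_seconds
print(validity_days * seconds_per_day)  # 7776000
```

If you later want a different window, the same arithmetic gives the flag value (e.g. 30 days would be 2592000).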
Created 04-16-2025 03:49 AM
Thanks for sharing this workaround, @rsanchez. The docs say that using experimental flags is not recommended, but at the moment no other solution seems to be available.
I am also wondering why there is no way to use a JAAS configuration to authenticate against Kudu from Spark, as there is with the Java client, for example.