
Actions of long running Spark streaming job fails with "HDFS_DELEGATION_TOKEN token can't be found in cache"

We are running a production Spark Streaming job on a Kerberized CDH cluster, and some of its actions start failing after less than 4 days with the following error:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token <token_id> for <user>) can't be found in cache
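
To see which tokens are affected and how often, one option is to scan the job's aggregated logs (for example, the output of `yarn logs -applicationId <application_id>` saved to a file) for the error above. The following is only a minimal sketch: the function name `list_missing_tokens` is hypothetical, and it assumes the token id is numeric and the message has exactly the shape shown above.

```shell
# Hypothetical helper: list each distinct HDFS delegation token id that
# appears in a "can't be found in cache" error, with its occurrence count.
# Assumption: the token id is a plain number, as in typical NameNode messages.
# Usage: list_missing_tokens <log_file>
list_missing_tokens() {
    # Keep only the matching part of each line, then reduce it to the token id.
    grep -o "HDFS_DELEGATION_TOKEN token [0-9][0-9]* for [^)]*) can't be found in cache" "$1" \
        | sed 's/^HDFS_DELEGATION_TOKEN token \([0-9][0-9]*\) for .*/\1/' \
        | sort \
        | uniq -c
}
```

If the same token id keeps appearing once the failures begin, that would suggest the job is still presenting an old token rather than a freshly obtained one.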

The Spark submission script already contains the following settings:

--conf spark.yarn.principal=<principal>
--conf spark.yarn.keytab=<keytab_location_on_disk>
--conf spark.hadoop.fs.hdfs.impl.disable.cache=true

Spark version is 1.6.0-CDH5.11.1

OS is RHEL 6.9

Does anybody have further suggestions about configuration, or about where to look for further information on this issue?


Re: Actions of long running Spark streaming job fails with "HDFS_DELEGATION_TOKEN token can't be found in cache"

@IsmailKeskin,

--conf spark.hadoop.fs.hdfs.impl.disable.cache=true should help. If it does not, please also try adding:
--conf mapreduce.job.complete.cancel.delegation.tokens=false

See if that helps.

Cheers
Eric
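
For reference, the settings from the question and the reply can be collected in one place, as in the sketch below. It is an illustration under assumptions, not a confirmed fix: the function name `token_conf_flags` and the `PRINCIPAL` and `KEYTAB` variables are hypothetical, and the `spark.hadoop.` prefix on the last property is an addition here, made because spark-submit normally ignores `--conf` keys that do not start with `spark.` and copies `spark.hadoop.*` values into the Hadoop configuration.

```shell
# Placeholders taken from the question; replace them with real values.
PRINCIPAL="${PRINCIPAL:-<principal>}"
KEYTAB="${KEYTAB:-<keytab_location_on_disk>}"

# Hypothetical helper: print the token-related --conf flags, one per line.
token_conf_flags() {
    for kv in \
        "spark.yarn.principal=${PRINCIPAL}" \
        "spark.yarn.keytab=${KEYTAB}" \
        "spark.hadoop.fs.hdfs.impl.disable.cache=true" \
        "spark.hadoop.mapreduce.job.complete.cancel.delegation.tokens=false"
    do
        # Assumption: the spark.hadoop. prefix is what lets the last
        # property reach the Hadoop configuration used by the job.
        printf '%s %s\n' '--conf' "$kv"
    done
}

# Dry run: show the flags so they can be copied into the submission script.
token_conf_flags
```

The script only prints the flags; it does not call spark-submit, so it is safe to run anywhere to check the output before editing the real submission script.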