We are running a production Spark Streaming job on a Kerberised CDH cluster, and some of its actions fail after less than 4 days of running with the following error:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token <token_id> for <user>) can't be found in cache
The Spark submission script already contains the following settings:
--conf spark.yarn.principal=<principal>
--conf spark.yarn.keytab=<keytab_location_on_disk>
--conf spark.hadoop.fs.hdfs.impl.disable.cache=true
Spark version is 1.6.0-CDH5.11.1
OS is RHEL 6.9
Does anybody have further suggestions about the configuration, or about where to look for further information on this issue?