Lots of history event warning running Spark Streaming job

I'm running a Spark Streaming job that uses a direct connection to Kafka on HDP 2.3.4.

Unlike other jobs, for this one the driver log is filled with constant warnings like this:

6/02/15 16:02:22 WARN history.YarnHistoryService: Discarding event

I haven't seen that class (org.apache.spark.deploy.history.yarn.YarnHistoryService) in the standard Spark source code, so I wonder if it's an HDP thing.

I'm going to suppress those warnings to keep them from filling up the logs, but I would appreciate any hints about what could be wrong or how to prevent it.
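
For reference, suppressing them could look roughly like the fragment below. This is only a sketch: it assumes the driver uses a log4j 1.x `log4j.properties` file (as Spark 1.x does by default), and it takes the logger name from the class mentioned above. The second entry is a guess, since the `history.YarnHistoryService` prefix in the log line hints that the package may actually end in `history`.

```
# log4j.properties used by the driver (log4j 1.x syntax)
# Raise the threshold for the history service logger so WARN lines are dropped.

# Logger name as written in the question:
log4j.logger.org.apache.spark.deploy.history.yarn.YarnHistoryService=ERROR

# Assumption: alternative package ordering suggested by the log line prefix.
log4j.logger.org.apache.spark.deploy.yarn.history.YarnHistoryService=ERROR
```

This only hides the symptom; the events are still being discarded.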

Re: Lots of history event warning running Spark Streaming job

So this is the YARN Timeline Server, which Spark uses to log application progress. Do you have the Timeline Server disabled?

From the code:

- If the timeline service is disabled, that is `yarn.timeline-service.enabled` is not `true`, then the history will not be published: the application will still run.

- Similarly, in a cluster where the timeline service is disabled, the history server will simply show an empty history, while warning that the history service is disabled.

- In a secure cluster, the user must have the Kerberos credentials to interact with the timeline server. Being logged in via `kinit` or a keytab should suffice.

https://github.com/apache/spark/pull/5423/files

If you don't want to use or fix the Timeline Server, you might be able to disable logging to it by changing this setting:

spark.yarn.services (however, Ambari doesn't like me removing it completely, so you might need to remove it through the config settings; the code also says you can just misspell it)

Re: Lots of history event warning running Spark Streaming job

Thanks for the pointers.

The timeline service is enabled (I have almost all settings at their defaults) and is working fine for other regular Spark jobs (I haven't tried running other streaming jobs in this cluster).

The cluster is not kerberized.

Re: Lots of history event warning running Spark Streaming job

Hmm, I don't see these warnings in my streaming application.